The C++ Programming Language
Third Edition
Bjarne Stroustrup
AT&T Labs
Murray Hill, New Jersey
Addison-Wesley
An Imprint of Addison Wesley Longman, Inc.
Reading, Massachusetts • Harlow, England • Menlo Park, California
Berkeley, California • Don Mills, Ontario • Sydney
Bonn • Amsterdam • Tokyo • Mexico City
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where
those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been
printed in initial capital letters or all capital letters.
The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any
kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in
connection with or arising out of the use of the information contained herein.
The publisher offers discounts on this book when ordered in quantity for special sales. For more information please contact:
Corporate & Professional Publishing Group
Addison-Wesley Publishing Company
One Jacob Way
Reading, Massachusetts 01867
Library of Congress Cataloging-in-Publication Data
Stroustrup, Bjarne
The C++ Programming Language / Bjarne Stroustrup. — 3rd. ed.
p. cm.
Includes index.
ISBN 0-201-88954-4
1. C++ (Computer Programming Language) I. Title
QA76.73.C153S77
1997
97-20239
005.13’3—dc21
CIP
Copyright © 1997 by AT&T
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or
by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the
publisher. Printed in the United States of America.
This book was typeset in Times and Courier by the author.
ISBN 0-201-88954-4
Printed on recycled paper
1 2 3 4 5 6 7 8 9—CRW—0100999897
First printing, June 1997
Contents
Preface
Preface to Second Edition
Preface to First Edition
Introductory Material
1 Notes to the Reader
2 A Tour of C++
3 A Tour of the Standard Library
Part I: Basic Facilities
4 Types and Declarations
5 Pointers, Arrays, and Structures
6 Expressions and Statements
7 Functions
8 Namespaces and Exceptions
9 Source Files and Programs
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
Part II: Abstraction Mechanisms
10 Classes
11 Operator Overloading
12 Derived Classes
13 Templates
14 Exception Handling
15 Class Hierarchies
Part III: The Standard Library
16 Library Organization and Containers
17 Standard Containers
18 Algorithms and Function Objects
19 Iterators and Allocators
20 Strings
21 Streams
22 Numerics
Part IV: Design Using C++
23 Development and Design
24 Design and Programming
25 Roles of Classes
Appendices
A The C++ Grammar
B Compatibility
C Technicalities
Index
Preface
Programming is understanding.
– Kristen Nygaard
I find using C++ more enjoyable than ever. C++’s support for design and programming has
improved dramatically over the years, and lots of new helpful techniques have been developed for
its use. However, C++ is not just fun. Ordinary practical programmers have achieved significant
improvements in productivity, maintainability, flexibility, and quality in projects of just about any
kind and scale. By now, C++ has fulfilled most of the hopes I originally had for it, and also succeeded at tasks I hadn’t even dreamt of.
This book introduces standard C++† and the key programming and design techniques supported
by C++. Standard C++ is a far more powerful and polished language than the version of C++ introduced by the first edition of this book. New language features such as namespaces, exceptions,
templates, and run-time type identification allow many techniques to be applied more directly than
was possible before, and the standard library allows the programmer to start from a much higher
level than the bare language.
About a third of the information in the second edition of this book came from the first. This
third edition is the result of a rewrite of even larger magnitude. It offers something to even the
most experienced C++ programmer; at the same time, this book is easier for the novice to approach
than its predecessors were. The explosion of C++ use and the massive amount of experience accumulated as a result make this possible.
The definition of an extensive standard library makes a difference to the way C++ concepts can
be presented. As before, this book presents C++ independently of any particular implementation,
and as before, the tutorial chapters present language constructs and concepts in a ‘‘bottom up’’
order so that a construct is used only after it has been defined. However, it is much easier to use a
well-designed library than it is to understand the details of its implementation. Therefore, the standard library can be used to provide realistic and interesting examples well before a reader can be
assumed to understand its inner workings. The standard library itself is also a fertile source of programming examples and design techniques.
__________________
† ISO/IEC 14882, Standard for the C++ Programming Language.
This book presents every major C++ language feature and the standard library. It is organized
around language and library facilities. However, features are presented in the context of their use.
That is, the focus is on the language as the tool for design and programming rather than on the language in itself. This book demonstrates key techniques that make C++ effective and teaches the
fundamental concepts necessary for mastery. Except where illustrating technicalities, examples are
taken from the domain of systems software. A companion, The Annotated C++ Language Standard, presents the complete language definition together with annotations to make it more comprehensible.
The primary aim of this book is to help the reader understand how the facilities offered by C++
support key programming techniques. The aim is to take the reader far beyond the point where he
or she gets code running primarily by copying examples and emulating programming styles from
other languages. Only a good understanding of the ideas behind the language facilities leads to
mastery. Supplemented by implementation documentation, the information provided is sufficient
for completing significant real-world projects. The hope is that this book will help the reader gain
new insights and become a better programmer and designer.
Acknowledgments
In addition to the people mentioned in the acknowledgement sections of the first and second editions, I would like to thank Matt Austern, Hans Boehm, Don Caldwell, Lawrence Crowl, Alan
Feuer, Andrew Forrest, David Gay, Tim Griffin, Peter Juhl, Brian Kernighan, Andrew Koenig,
Mike Mowbray, Rob Murray, Lee Nackman, Joseph Newcomer, Alex Stepanov, David Vandevoorde, Peter Weinberger, and Chris Van Wyk for commenting on draft chapters of this third edition.
Without their help and suggestions, this book would have been harder to understand, contained
more errors, been slightly less complete, and probably been a little bit shorter.
I would also like to thank the volunteers on the C++ standards committees who did an immense
amount of constructive work to make C++ what it is today. It is slightly unfair to single out individuals, but it would be even more unfair not to mention anyone, so I’d like to especially mention
Mike Ball, Dag Brück, Sean Corfield, Ted Goldstein, Kim Knuttila, Andrew Koenig, Josée Lajoie,
Dmitry Lenkov, Nathan Myers, Martin O’Riordan, Tom Plum, Jonathan Shopiro, John Spicer,
Jerry Schwarz, Alex Stepanov, and Mike Vilot, as people who each directly cooperated with me
over some part of C++ and its standard library.
Murray Hill, New Jersey
Bjarne Stroustrup
Preface to the Second Edition
The road goes ever on and on.
– Bilbo Baggins
As promised in the first edition of this book, C++ has been evolving to meet the needs of its users.
This evolution has been guided by the experience of users of widely varying backgrounds working
in a great range of application areas. The C++ user-community has grown a hundredfold during the
six years since the first edition of this book; many lessons have been learned, and many techniques
have been discovered and/or validated by experience. Some of these experiences are reflected here.
The primary aim of the language extensions made in the last six years has been to enhance C++
as a language for data abstraction and object-oriented programming in general and to enhance it as
a tool for writing high-quality libraries of user-defined types in particular. A ‘‘high-quality
library’’ is a library that provides a concept to a user in the form of one or more classes that are
convenient, safe, and efficient to use. In this context, safe means that a class provides a specific
type-safe interface between the users of the library and its providers; efficient means that use of the
class does not impose significant overheads in run-time or space on the user compared with handwritten C code.
This book presents the complete C++ language. Chapters 1 through 10 give a tutorial introduction; Chapters 11 through 13 provide a discussion of design and software development issues; and,
finally, the complete C++ reference manual is included. Naturally, the features added and resolutions made since the original edition are integral parts of the presentation. They include refined
overloading resolution, memory management facilities, access control mechanisms, type-safe
linkage, const and static member functions, abstract classes, multiple inheritance, templates, and
exception handling.
C++ is a general-purpose programming language; its core application domain is systems programming in the broadest sense. In addition, C++ is successfully used in many application areas
that are not covered by this label. Implementations of C++ exist from some of the most modest
microcomputers to the largest supercomputers and for almost all operating systems. Consequently,
this book describes the C++ language itself without trying to explain a particular implementation,
programming environment, or library.
This book presents many examples of classes that, though useful, should be classified as
‘‘toys.’’ This style of exposition allows general principles and useful techniques to stand out more
clearly than they would in a fully elaborated program, where they would be buried in details. Most
of the useful classes presented here, such as linked lists, arrays, character strings, matrices, graphics
classes, associative arrays, etc., are available in ‘‘bulletproof’’ and/or ‘‘goldplated’’ versions from a
wide variety of commercial and non-commercial sources. Many of these ‘‘industrial strength’’
classes and libraries are actually direct and indirect descendants of the toy versions found here.
This edition provides a greater emphasis on tutorial aspects than did the first edition of this
book. However, the presentation is still aimed squarely at experienced programmers and endeavors
not to insult their intelligence or experience. The discussion of design issues has been greatly
expanded to reflect the demand for information beyond the description of language features and
their immediate use. Technical detail and precision have also been increased. The reference manual, in particular, represents many years of work in this direction. The intent has been to provide a
book with a depth sufficient to make more than one reading rewarding to most programmers. In
other words, this book presents the C++ language, its fundamental principles, and the key techniques needed to apply it. Enjoy!
Acknowledgments
In addition to the people mentioned in the acknowledgements section in the preface to the first edition, I would like to thank Al Aho, Steve Buroff, Jim Coplien, Ted Goldstein, Tony Hansen, Lorraine Juhl, Peter Juhl, Brian Kernighan, Andrew Koenig, Bill Leggett, Warren Montgomery, Mike
Mowbray, Rob Murray, Jonathan Shopiro, Mike Vilot, and Peter Weinberger for commenting on
draft chapters of this second edition. Many people influenced the development of C++ from 1985
to 1991. I can mention only a few: Andrew Koenig, Brian Kernighan, Doug McIlroy, and Jonathan
Shopiro. Also thanks to the many participants of the ‘‘external reviews’’ of the reference manual
drafts and to the people who suffered through the first year of X3J16.
Murray Hill, New Jersey
Bjarne Stroustrup
Preface to the First Edition
Language shapes the way we think,
and determines what we can think about.
– B.L.Whorf
C++ is a general purpose programming language designed to make programming more enjoyable
for the serious programmer. Except for minor details, C++ is a superset of the C programming language. In addition to the facilities provided by C, C++ provides flexible and efficient facilities for
defining new types. A programmer can partition an application into manageable pieces by defining
new types that closely match the concepts of the application. This technique for program construction is often called data abstraction. Objects of some user-defined types contain type information.
Such objects can be used conveniently and safely in contexts in which their type cannot be determined at compile time. Programs using objects of such types are often called object based. When
used well, these techniques result in shorter, easier to understand, and easier to maintain programs.
The key concept in C++ is class. A class is a user-defined type. Classes provide data hiding,
guaranteed initialization of data, implicit type conversion for user-defined types, dynamic typing,
user-controlled memory management, and mechanisms for overloading operators. C++ provides
much better facilities for type checking and for expressing modularity than C does. It also contains
improvements that are not directly related to classes, including symbolic constants, inline substitution of functions, default function arguments, overloaded function names, free store management
operators, and a reference type. C++ retains C’s ability to deal efficiently with the fundamental
objects of the hardware (bits, bytes, words, addresses, etc.). This allows the user-defined types to
be implemented with a pleasing degree of efficiency.
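The class features listed above can be illustrated with a minimal sketch. The type `Cplx` and its members are hypothetical names chosen for this illustration and are not taken from the book; the sketch only assumes a standard C++ implementation.

```cpp
#include <iostream>

// Hypothetical illustration type; not taken from the book.
class Cplx {
    double re, im;                       // data hiding: the representation is private
public:
    Cplx(double r = 0, double i = 0)     // guaranteed initialization; this constructor
        : re(r), im(i) {}                // also provides implicit conversion from double
    double real() const { return re; }
    double imag() const { return im; }
};

// Overloaded operators let Cplx be used with the notation of built-in types.
Cplx operator+(const Cplx& a, const Cplx& b)
{
    return Cplx(a.real() + b.real(), a.imag() + b.imag());
}

bool operator==(const Cplx& a, const Cplx& b)
{
    return a.real() == b.real() && a.imag() == b.imag();
}

std::ostream& operator<<(std::ostream& os, const Cplx& c)
{
    return os << '(' << c.real() << ',' << c.imag() << ')';
}
```

With these definitions, an expression such as `Cplx(1, 2) + 3` compiles because the integer `3` is implicitly converted to a `Cplx` before `operator+` is applied.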
C++ and its standard libraries are designed for portability. The current implementation will run
on most systems that support C. C libraries can be used from a C++ program, and most tools that
support programming in C can be used with C++.
This book is primarily intended to help serious programmers learn the language and use it for
nontrivial projects. It provides a complete description of C++, many complete examples, and many
more program fragments.
Acknowledgments
C++ could never have matured without the constant use, suggestions, and constructive criticism of
many friends and colleagues. In particular, Tom Cargill, Jim Coplien, Stu Feldman, Sandy Fraser,
Steve Johnson, Brian Kernighan, Bart Locanthi, Doug McIlroy, Dennis Ritchie, Larry Rosler, Jerry
Schwarz, and Jon Shopiro provided important ideas for development of the language. Dave Presotto wrote the current implementation of the stream I/O library.
In addition, hundreds of people contributed to the development of C++ and its compiler by
sending me suggestions for improvements, descriptions of problems they had encountered, and
compiler errors. I can mention only a few: Gary Bishop, Andrew Hume, Tom Karzes, Victor
Milenkovic, Rob Murray, Leonie Rose, Brian Schmult, and Gary Walker.
Many people have also helped with the production of this book, in particular, Jon Bentley,
Laura Eaves, Brian Kernighan, Ted Kowalski, Steve Mahaney, Jon Shopiro, and the participants in
the C++ course held at Bell Labs, Columbus, Ohio, June 26-27, 1985.
Murray Hill, New Jersey
Bjarne Stroustrup
Introduction
This introduction gives an overview of the major concepts and features of the C++ programming language and its standard library. It also provides an overview of this book
and explains the approach taken to the description of the language facilities and their
use. In addition, the introductory chapters present some background information about
C++, the design of C++, and the use of C++.
Chapters
1 Notes to the Reader
2 A Tour of C++
3 A Tour of the Standard Library
‘‘... and you, Marcus, you have given me many things; now I shall give you this good
advice. Be many people. Give up the game of being always Marcus Cocoza. You
have worried too much about Marcus Cocoza, so that you have been really his slave
and prisoner. You have not done anything without first considering how it would
affect Marcus Cocoza’s happiness and prestige. You were always much afraid that
Marcus might do a stupid thing, or be bored. What would it really have mattered? All
over the world people are doing stupid things ... I should like you to be easy, your little heart to be light again. You must from now, be more than one, many people, as
many as you can think of ...’’
– Karen Blixen
(‘‘The Dreamers’’ from ‘‘Seven Gothic Tales’’
written under the pseudonym Isak Dinesen,
Random House, Inc.
Copyright, Isak Dinesen, 1934 renewed 1961)
1 Notes to the Reader
"The time has come," the Walrus said,
"to talk of many things."
– L.Carroll
Structure of this book — how to learn C++ — the design of C++ — efficiency and structure — philosophical note — historical note — what C++ is used for — C and C++ —
suggestions for C programmers — suggestions for C++ programmers — thoughts about
programming in C++ — advice — references.
1.1 The Structure of This Book
This book consists of six parts:
Introduction: Chapters 1 through 3 give an overview of the C++ language, the key programming
styles it supports, and the C++ standard library.
Part I: Chapters 4 through 9 provide a tutorial introduction to C++’s built-in types and the
basic facilities for constructing programs out of them.
Part II: Chapters 10 through 15 are a tutorial introduction to object-oriented and generic programming using C++.
Part III: Chapters 16 through 22 present the C++ standard library.
Part IV: Chapters 23 through 25 discuss design and software development issues.
Appendices: Appendices A through E provide language-technical details.
Chapter 1 provides an overview of this book, some hints about how to use it, and some background
information about C++ and its use. You are encouraged to skim through it, read what appears interesting, and return to it after reading other parts of the book.
Chapters 2 and 3 provide an overview of the major concepts and features of the C++ programming language and its standard library. Their purpose is to motivate you to spend time on fundamental concepts and basic language features by showing what can be expressed using the complete
C++ language. If nothing else, these chapters should convince you that C++ isn’t (just) C and that
C++ has come a long way since the first and second editions of this book. Chapter 2 gives a high-level acquaintance with C++. The discussion focuses on the language features supporting data
abstraction, object-oriented programming, and generic programming. Chapter 3 introduces the
basic principles and major facilities of the standard library. This allows me to use standard library
facilities in the following chapters. It also allows you to use library facilities in exercises rather
than relying directly on lower-level, built-in features.
The C++ Programming Language, Special Edition by Bjarne Stroustrup. Copyright 2000 by AT&T.
Published by Addison Wesley, Inc. ISBN 0-201-70073-5. All rights reserved.
The introductory chapters provide an example of a general technique that is applied throughout
this book: to enable a more direct and realistic discussion of some technique or feature, I occasionally present a concept briefly at first and then discuss it in depth later. This approach allows me to
present concrete examples before a more general treatment of a topic. Thus, the organization of
this book reflects the observation that we usually learn best by progressing from the concrete to the
abstract – even where the abstract seems simple and obvious in retrospect.
Part I describes the subset of C++ that supports the styles of programming traditionally done in
C or Pascal. It covers fundamental types, expressions, and control structures for C++ programs.
Modularity – as supported by namespaces, source files, and exception handling – is also discussed.
I assume that you are familiar with the fundamental programming concepts used in Part I. For
example, I explain C++’s facilities for expressing recursion and iteration, but I do not spend much
time explaining how these concepts are useful.
Part II describes C++’s facilities for defining and using new types. Concrete and abstract
classes (interfaces) are presented here (Chapter 10, Chapter 12), together with operator overloading
(Chapter 11), polymorphism, and the use of class hierarchies (Chapter 12, Chapter 15). Chapter 13
presents templates, that is, C++’s facilities for defining families of types and functions. It demonstrates the basic techniques used to provide containers, such as lists, and to support generic programming. Chapter 14 presents exception handling, discusses techniques for error handling, and
presents strategies for fault tolerance. I assume that you either aren’t well acquainted with object-oriented programming and generic programming or could benefit from an explanation of how the
main abstraction techniques are supported by C++. Thus, I don’t just present the language features
supporting the abstraction techniques; I also explain the techniques themselves. Part IV goes further in this direction.
Part III presents the C++ standard library. The aim is to provide an understanding of how to use
the library, to demonstrate general design and programming techniques, and to show how to extend
the library. The library provides containers (such as list, vector, and map; Chapter 16, Chapter 17),
standard algorithms (such as sort, find, and merge; Chapter 18, Chapter 19), strings (Chapter 20),
Input/Output (Chapter 21), and support for numerical computation (Chapter 22).
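As a rough sketch of how some of these library facilities combine, the following functions use the standard containers vector, list, and map together with the standard algorithms sort and find. The function names `count_words`, `sorted_copy`, and `contains` are hypothetical helpers introduced only for this illustration; merge is not shown.

```cpp
#include <algorithm>
#include <list>
#include <map>
#include <string>
#include <vector>

// Count how often each word occurs; std::map keeps its keys ordered.
std::map<std::string, int> count_words(const std::vector<std::string>& words)
{
    std::map<std::string, int> freq;
    for (std::vector<std::string>::size_type i = 0; i != words.size(); ++i)
        ++freq[words[i]];               // operator[] inserts a zero count on first use
    return freq;
}

// Return a sorted copy of v using the standard sort algorithm.
std::vector<int> sorted_copy(std::vector<int> v)
{
    std::sort(v.begin(), v.end());
    return v;
}

// Report whether x occurs in lst, using the standard find algorithm.
bool contains(const std::list<int>& lst, int x)
{
    return std::find(lst.begin(), lst.end(), x) != lst.end();
}
```

The same algorithms work on different containers because they operate on iterator ranges rather than on a particular container type.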
Part IV discusses issues that arise when C++ is used in the design and implementation of large
software systems. Chapter 23 concentrates on design and management issues. Chapter 24 discusses
the relation between the C++ programming language and design issues. Chapter 25 presents some
ways of using classes in design.
Appendix A is C++’s grammar, with a few annotations. Appendix B discusses the relation
between C and C++ and between Standard C++ (also called ISO C++ and ANSI C++) and the versions of C++ that preceded it. Appendix C presents some language-technical examples. Appendix
D explains the standard library’s facilities supporting internationalization. Appendix E discusses
the exception-safety guarantees and requirements of the standard library.
1.1.1 Examples and References
This book emphasizes program organization rather than the writing of algorithms. Consequently, I
avoid clever or harder-to-understand algorithms. A trivial algorithm is typically better suited to
illustrate an aspect of the language definition or a point about program structure. For example, I
use a Shell sort where, in real code, a quicksort would be better. Often, reimplementation with a
more suitable algorithm is an exercise. In real code, a call of a library function is typically more
appropriate than the code used here for illustration of language features.
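To make the contrast concrete, here is a minimal sketch of a Shell sort next to the library call that real code would typically use instead. This is not the book's own implementation; the gap sequence (halving each pass) and the function names `shell_sort` and `library_sort` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// A simple Shell sort with the gap halved on each pass (one of several
// possible gap sequences; chosen here only for brevity).
void shell_sort(std::vector<int>& v)
{
    const std::size_t n = v.size();
    for (std::size_t gap = n / 2; gap > 0; gap /= 2) {
        // Gapped insertion sort: each element moves left in steps of gap.
        for (std::size_t i = gap; i < n; ++i) {
            int tmp = v[i];
            std::size_t j = i;
            while (j >= gap && v[j - gap] > tmp) {
                v[j] = v[j - gap];
                j -= gap;
            }
            v[j] = tmp;
        }
    }
}

// What real code would typically do instead: call the library.
void library_sort(std::vector<int>& v)
{
    std::sort(v.begin(), v.end());
}
```

Both functions leave the vector in ascending order; the hand-written version is useful mainly for showing loops and indexing, which is the point the text makes about illustrative code.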
Textbook examples necessarily give a warped view of software development. By clarifying and
simplifying the examples, the complexities that arise from scale disappear. I see no substitute for
writing realistically-sized programs for getting an impression of what programming and a programming language are really like. This book concentrates on the language features, the basic techniques from which every program is composed, and the rules for composition.
The selection of examples reflects my background in compilers, foundation libraries, and simulations. Examples are simplified versions of what is found in real code. The simplification is necessary to keep programming language and design points from getting lost in details. There are no
‘‘cute’’ examples without counterparts in real code. Wherever possible, I relegated to Appendix C
language-technical examples of the sort that use variables named x and y, types called A and B, and
functions called f() and g().
In code examples, a proportional-width font is used for identifiers. For example:
#include<iostream>
int main()
{
    std::cout << "Hello, new world!\n";
}
At first glance, this presentation style will seem ‘‘unnatural’’ to programmers accustomed to seeing
code in constant-width fonts. However, proportional-width fonts are generally regarded as better
than constant-width fonts for presentation of text. Using a proportional-width font also allows me
to present code with fewer illogical line breaks. Furthermore, my experiments show that most people find the new style more readable after a short while.
Where possible, the C++ language and library features are presented in the context of their use
rather than in the dry manner of a manual. The language features presented and the detail in which
they are described reflect my view of what is needed for effective use of C++. A companion, The
Annotated C++ Language Standard, authored by Andrew Koenig and myself, is the complete definition of the language together with comments aimed at making it more accessible. Logically,
there ought to be another companion, The Annotated C++ Standard Library. However, since both
time and my capacity for writing are limited, I cannot promise to produce that.
References to parts of this book are of the form §2.3.4 (Chapter 2, section 3, subsection 4),
§B.5.6 (Appendix B, subsection 5.6), and §6.6[10] (Chapter 6, exercise 10). Italics are used sparingly for emphasis (e.g., ‘‘a string literal is not acceptable’’), for first occurrences of important concepts (e.g., polymorphism), for nonterminals of the C++ grammar (e.g., for-statement), and for comments in code examples. Semi-bold italics are used to refer to identifiers, keywords, and numeric
values from code examples (e.g., counter, class, and 1712).
1.1.2 Exercises
Exercises are found at the ends of chapters. The exercises are mainly of the write-a-program variety. Always write enough code for a solution to be compiled and run with at least a few test cases.
The exercises vary considerably in difficulty, so they are marked with an estimate of their difficulty. The scale is exponential so that if a (∗1) exercise takes you ten minutes, a (∗2) might take an
hour, and a (∗3) might take a day. The time needed to write and test a program depends more on
your experience than on the exercise itself. A (∗1) exercise might take a day if you first have to get
acquainted with a new computer system in order to run it. On the other hand, a (∗5) exercise might
be done in an hour by someone who happens to have the right collection of programs handy.
Any book on programming in C can be used as a source of extra exercises for Part I. Any book
on data structures and algorithms can be used as a source of exercises for Parts II and III.
1.1.3 Implementation Note
The language used in this book is ‘‘pure C++’’ as defined in the C++ standard [C++,1998]. Therefore, the examples ought to run on every C++ implementation. The major program fragments in
this book were tried using several C++ implementations. Examples using features only recently
adopted into C++ didn’t compile on every implementation. However, I see no point in mentioning
which implementations failed to compile which examples. Such information would soon be out of
date because implementers are working hard to ensure that their implementations correctly accept
every C++ feature. See Appendix B for suggestions on how to cope with older C++ compilers and
with code written for C compilers.
1.2 Learning C++
The most important thing to do when learning C++ is to focus on concepts and not get lost in
language-technical details. The purpose of learning a programming language is to become a better
programmer; that is, to become more effective at designing and implementing new systems and at
maintaining old ones. For this, an appreciation of programming and design techniques is far more
important than an understanding of details; that understanding comes with time and practice.
C++ supports a variety of programming styles. All are based on strong static type checking, and
most aim at achieving a high level of abstraction and a direct representation of the programmer’s
ideas. Each style can achieve its aims effectively while maintaining run-time and space efficiency.
A programmer coming from a different language (say C, Fortran, Smalltalk, Lisp, ML, Ada, Eiffel,
Pascal, or Modula-2) should realize that to gain the benefits of C++, they must spend time learning
and internalizing programming styles and techniques suitable to C++. The same applies to programmers used to an earlier and less expressive version of C++.
Thoughtlessly applying techniques effective in one language to another typically leads to awkward, poorly performing, and hard-to-maintain code. Such code is also most frustrating to write
because every line of code and every compiler error message reminds the programmer that the language used differs from ‘‘the old language.’’ You can write in the style of Fortran, C, Smalltalk,
etc., in any language, but doing so is neither pleasant nor economical in a language with a different
philosophy. Every language can be a fertile source of ideas of how to write C++ programs.
However, ideas must be transformed into something that fits with the general structure and type
system of C++ in order to be effective in the different context. Over the basic type system of a language, only Pyrrhic victories are possible.
C++ supports a gradual approach to learning. How you approach learning a new programming
language depends on what you already know and what you aim to learn. There is no one approach
that suits everyone. My assumption is that you are learning C++ to become a better programmer
and designer. That is, I assume that your purpose in learning C++ is not simply to learn a new syntax for doing things the way you used to, but to learn new and better ways of building systems.
This has to be done gradually because acquiring any significant new skill takes time and requires
practice. Consider how long it would take to learn a new natural language well or to learn to play a
new musical instrument well. Becoming a better system designer is easier and faster, but not as
much easier and faster as most people would like it to be.
It follows that you will be using C++ – often for building real systems – before understanding
every language feature and technique. By supporting several programming paradigms (Chapter 2),
C++ supports productive programming at several levels of expertise. Each new style of programming adds another tool to your toolbox, but each is effective on its own and each adds to your
effectiveness as a programmer. C++ is organized so that you can learn its concepts in a roughly linear order and gain practical benefits along the way. This is important because it allows you to gain
benefits roughly in proportion to the effort expended.
In the continuing debate on whether one needs to learn C before C++, I am firmly convinced
that it is best to go directly to C++. C++ is safer, more expressive, and reduces the need to focus on
low-level techniques. It is easier for you to learn the trickier parts of C that are needed to compensate for its lack of higher-level facilities after you have been exposed to the common subset of C
and C++ and to some of the higher-level techniques supported directly in C++. Appendix B is a
guide for programmers going from C++ to C, say, to deal with legacy code.
Several independently developed and distributed implementations of C++ exist. A wealth of
tools, libraries, and software development environments are also available. A mass of textbooks,
manuals, journals, newsletters, electronic bulletin boards, mailing lists, conferences, and courses
are available to inform you about the latest developments in C++, its use, tools, libraries, implementations, etc. If you plan to use C++ seriously, I strongly suggest that you gain access to such
sources. Each has its own emphasis and bias, so use at least two. For example, see [Barton,1994],
[Booch,1994], [Henricson,1997], [Koenig,1997], [Martin,1995].
1.3 The Design of C++
Simplicity was an important design criterion: where there was a choice between simplifying the
language definition and simplifying the compiler, the former was chosen. However, great importance was attached to retaining a high degree of compatibility with C [Koenig,1989] [Stroustrup,1994] (Appendix B); this precluded cleaning up the C syntax.
C++ has no built-in high-level data types and no high-level primitive operations. For example,
the C++ language does not provide a matrix type with an inversion operator or a string type with a
concatenation operator. If a user wants such a type, it can be defined in the language itself. In fact,
defining a new general-purpose or application-specific type is the most fundamental programming
activity in C++. A well-designed user-defined type differs from a built-in type only in the way it is
defined, not in the way it is used. The C++ standard library described in Part III provides many
examples of such types and their uses. From a user’s point of view, there is little difference
between a built-in type and a type provided by the standard library.
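As a minimal sketch of that point, consider a small user-defined type that is used with the same notation as a built-in one. The type Point and the function demo_point_sum() are hypothetical names chosen for illustration; they are not part of the standard library.

```cpp
#include <cassert>

// Hypothetical user-defined type: a 2-D point with integer coordinates.
struct Point {
    int x;
    int y;
    Point(int xx, int yy) : x(xx), y(yy) {}
};

// Operators defined for Point so that it can be used with the same
// notation as a built-in arithmetic type.
Point operator+(const Point& a, const Point& b)
{
    return Point(a.x + b.x, a.y + b.y);
}

bool operator==(const Point& a, const Point& b)
{
    return a.x == b.x && a.y == b.y;
}

// Built-in and user-defined types used side by side.
Point demo_point_sum()
{
    int i = 1 + 2;                        // built-in type
    Point p = Point(1, 2) + Point(3, 4);  // user-defined type, same notation
    assert(i == 3);
    return p;
}
```

Only the definition of Point reveals that it is not built in; the code that uses it reads like ordinary arithmetic.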
Features that would incur run-time or memory overheads even when not used were avoided in
the design of C++. For example, constructs that would make it necessary to store ‘‘housekeeping
information’’ in every object were rejected, so if a user declares a structure consisting of two 16-bit
quantities, that structure will fit into a 32-bit register.
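The claim about the two 16-bit quantities can be checked with a sketch like the following. It assumes an implementation that provides std::int16_t (from <cstdint>, which was standardized after this text was written) and that adds no padding to such a structure; both hold on mainstream platforms, but neither is guaranteed by the standard. Pair16 is a hypothetical name.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical structure consisting of two 16-bit quantities.
struct Pair16 {
    std::int16_t first;
    std::int16_t second;
};

void demo_pair16_layout()
{
    // No hidden "housekeeping information": the object is exactly
    // as large as its two members (assuming no padding is inserted).
    assert(sizeof(Pair16) == 2 * sizeof(std::int16_t));

    // Expressed in bits, that is 32 on the assumed platform.
    assert(sizeof(Pair16) * CHAR_BIT == 32);
}
```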
C++ was designed to be used in a traditional compilation and run-time environment, that is, the
C programming environment on the UNIX system. Fortunately, C++ was never restricted to UNIX;
it simply used UNIX and C as a model for the relationships between language, libraries, compilers,
linkers, execution environments, etc. That minimal model helped C++ to be successful on essentially every computing platform. There are, however, good reasons for using C++ in environments
that provide significantly more support. Facilities such as dynamic loading, incremental compilation, and a database of type definitions can be put to good use without affecting the language.
C++ type-checking and data-hiding features rely on compile-time analysis of programs to prevent accidental corruption of data. They do not provide secrecy or protection against someone who
is deliberately breaking the rules. They can, however, be used freely without incurring run-time or
space overheads. The idea is that to be useful, a language feature must not only be elegant; it must
also be affordable in the context of a real program.
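A minimal sketch of compile-time data hiding, using a hypothetical class Counter: the access check happens entirely during compilation, so on typical implementations the object carries nothing beyond its data member. The size comparison is an assumption about common implementations, not a guarantee of the standard.

```cpp
#include <cassert>

// Hypothetical class: the representation is hidden behind member functions.
class Counter {
public:
    Counter() : n(0) {}
    void increment() { ++n; }
    int value() const { return n; }
private:
    int n;   // accessible only to members of Counter
};

void demo_counter()
{
    Counter c;
    c.increment();
    c.increment();
    assert(c.value() == 2);
    // c.n = 42;   // rejected by the compiler: n is private

    // On typical implementations the access control leaves no trace
    // in the object: a Counter occupies the same space as its int.
    assert(sizeof(Counter) == sizeof(int));
}
```

The commented-out assignment is an accidental misuse that the compiler catches. A programmer who deliberately circumvents the rules, for example through casts, is not stopped.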
For a systematic and detailed description of the design of C++, see [Stroustrup,1994].
1.3.1 Efficiency and Structure
C++ was developed from the C programming language and, with few exceptions, retains C as a
subset. The base language, the C subset of C++, is designed to ensure a very close correspondence
between its types, operators, and statements and the objects that computers deal with directly: numbers, characters, and addresses. Except for the new, delete, typeid, dynamic_cast, and throw operators and the try-block, individual C++ expressions and statements need no run-time support.
C++ can use the same function call and return sequences as C – or more efficient ones. When
even such relatively efficient mechanisms are too expensive, a C++ function can be substituted
inline, so that we can enjoy the notational convenience of functions without run-time overhead.
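As a small illustration of inline substitution, the following sketch defines a function that a compiler may expand at each point of call. The names square() and demo_inline() are hypothetical. Note that inline is a request to the compiler, not a command; whether a call is actually eliminated depends on the implementation.

```cpp
#include <cassert>

// A small function that the compiler may substitute at the call site,
// so that the call notation costs nothing at run time.
inline int square(int x)
{
    return x * x;
}

void demo_inline()
{
    // Reads like two function calls; a typical optimizing compiler
    // generates code equivalent to 3*3 + 4*4.
    assert(square(3) + square(4) == 25);
}
```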
One of the original aims for C was to replace assembly coding for the most demanding systems
programming tasks. When C++ was designed, care was taken not to compromise the gains in this
area. The difference between C and C++ is primarily in the degree of emphasis on types and structure. C is expressive and permissive. C++ is even more expressive. However, to gain that increase
in expressiveness, you must pay more attention to the types of objects. Knowing the types of
objects, the compiler can deal correctly with expressions when you would otherwise have had to
specify operations in painful detail. Knowing the types of objects also enables the compiler to
detect errors that would otherwise persist until testing – or even later. Note that using the type system to check function arguments, to protect data from accidental corruption, to provide new types,
to provide new operators, etc., does not increase run-time or space overheads in C++.
The emphasis on structure in C++ reflects the increase in the scale of programs written since C
was designed. You can make a small program (say, 1,000 lines) work through brute force even
when breaking every rule of good style. For a larger program, this is simply not so. If the structure
of a 100,000-line program is bad, you will find that new errors are introduced as fast as old ones are
removed. C++ was designed to enable larger programs to be structured in a rational way so that it
would be reasonable for a single person to cope with far larger amounts of code. In addition, the
aim was to have an average line of C++ code express much more than the average line of C or Pascal code. C++ has by now been shown to over-fulfill these goals.
Not every piece of code can be well-structured, hardware-independent, easy-to-read, etc. C++
possesses features that are intended for manipulating hardware facilities in a direct and efficient
way without regard for safety or ease of comprehension. It also possesses facilities for hiding such
code behind elegant and safe interfaces.
Naturally, the use of C++ for larger programs leads to the use of C++ by groups of programmers. C++’s emphasis on modularity, strongly typed interfaces, and flexibility pays off here. C++
has as good a balance of facilities for writing large programs as any language has. However, as
programs get larger, the problems associated with their development and maintenance shift from
being language problems to more global problems of tools and management. Part IV explores
some of these issues.
This book emphasizes techniques for providing general-purpose facilities, generally useful
types, libraries, etc. These techniques will serve programmers of small programs as well as programmers of large ones. Furthermore, because all nontrivial programs consist of many semi-independent parts, the techniques for writing such parts serve programmers of all applications.
You might suspect that specifying a program by using a more detailed type structure would lead
to a larger program source text. With C++, this is not so. A C++ program declaring function argument types, using classes, etc., is typically a bit shorter than the equivalent C program not using
these facilities. Where libraries are used, a C++ program will appear much shorter than its C equivalent, assuming, of course, that a functioning C equivalent could have been built.
1.3.2 Philosophical Note
A programming language serves two related purposes: it provides a vehicle for the programmer to
specify actions to be executed, and it provides a set of concepts for the programmer to use when
thinking about what can be done. The first purpose ideally requires a language that is ‘‘close to the
machine’’ so that all important aspects of a machine are handled simply and efficiently in a way
that is reasonably obvious to the programmer. The C language was primarily designed with this in
mind. The second purpose ideally requires a language that is ‘‘close to the problem to be solved’’
so that the concepts of a solution can be expressed directly and concisely. The facilities added to C
to create C++ were primarily designed with this in mind.
The connection between the language in which we think/program and the problems and solutions we can imagine is very close. For this reason, restricting language features with the intent of
eliminating programmer errors is at best dangerous. As with natural languages, there are great benefits from being at least bilingual. A language provides a programmer with a set of conceptual
tools; if these are inadequate for a task, they will simply be ignored. Good design and the absence
of errors cannot be guaranteed merely by the presence or the absence of specific language features.
The type system should be especially helpful for nontrivial tasks. The C++ class concept has, in
fact, proven itself to be a powerful conceptual tool.
1.4 Historical Note
I invented C++, wrote its early definitions, and produced its first implementation. I chose and formulated the design criteria for C++, designed all its major facilities, and was responsible for the
processing of extension proposals in the C++ standards committee.
Clearly, C++ owes much to C [Kernighan,1978]. Except for closing a few serious loopholes in
the type system (see Appendix B), C is retained as a subset. I also retained C’s emphasis on facilities that are low-level enough to cope with the most demanding systems programming tasks. C in
turn owes much to its predecessor BCPL [Richards,1980]; in fact, BCPL’s // comment convention
was (re)introduced in C++. The other main source of inspiration for C++ was Simula67
[Dahl,1970] [Dahl,1972]; the class concept (with derived classes and virtual functions) was borrowed from it. C++’s facility for overloading operators and the freedom to place a declaration
wherever a statement can occur resembles Algol68 [Woodward,1974].
Since the original edition of this book, the language has been extensively reviewed and refined.
The major areas for revision were overload resolution, linking, and memory management facilities.
In addition, several minor changes were made to increase C compatibility. Several generalizations
and a few major extensions were added: these included multiple inheritance, static member functions, const member functions, protected members, templates, exception handling, run-time type
identification, and namespaces. The overall theme of these extensions and revisions was to make
C++ a better language for writing and using libraries. The evolution of C++ is described in [Stroustrup,1994].
The template facility was primarily designed to support statically typed containers (such as lists,
vectors, and maps) and to support elegant and efficient use of such containers (generic programming). A key aim was to reduce the use of macros and casts (explicit type conversion). Templates
were partly inspired by Ada’s generics (both their strengths and their weaknesses) and partly by
Clu’s parameterized modules. Similarly, the C++ exception-handling mechanism was inspired
partly by Ada [Ichbiah,1979], Clu [Liskov,1979], and ML [Wikström,1987]. Other developments
in the 1985 to 1995 time span – such as multiple inheritance, pure virtual functions, and namespaces – were primarily generalizations driven by experience with the use of C++ rather than ideas
imported from other languages.
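The aim stated above for templates, statically typed containers that need neither macros nor casts, can be sketched as follows. Stack is a hypothetical, deliberately minimal class written for this illustration; it is not the standard library's stack.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal container, parameterized by its element type.
template<class T>
class Stack {
public:
    void push(const T& x) { elems.push_back(x); }

    // Precondition: the stack is not empty.
    T pop()
    {
        T x = elems.back();
        elems.pop_back();
        return x;
    }

    bool empty() const { return elems.empty(); }
private:
    std::vector<T> elems;
};

void demo_stack()
{
    Stack<int> s;
    s.push(1);
    s.push(2);

    int top = s.pop();   // no cast: the element type is known statically
    assert(top == 2);
    assert(s.pop() == 1);
    assert(s.empty());

    // s.push("three");  // would be rejected at compile time
}
```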
Earlier versions of the language, collectively known as ‘‘C with Classes’’ [Stroustrup,1994],
have been in use since 1980. The language was originally invented because I wanted to write some
event-driven simulations for which Simula67 would have been ideal, except for efficiency considerations. ‘‘C with Classes’’ was used for major projects in which the facilities for writing programs
that use minimal time and space were severely tested. It lacked operator overloading, references,
virtual functions, templates, exceptions, and many details. The first use of C++ outside a research
organization started in July 1983.
The name C++ (pronounced ‘‘see plus plus’’) was coined by Rick Mascitti in the summer of
1983. The name signifies the evolutionary nature of the changes from C; ‘‘++’’ is the C increment
operator. The slightly shorter name ‘‘C+’’ is a syntax error; it has also been used as the name of an
unrelated language. Connoisseurs of C semantics find C++ inferior to ++C. The language is not
called D, because it is an extension of C, and it does not attempt to remedy problems by removing
features. For yet another interpretation of the name C++, see the appendix of [Orwell,1949].
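The remark about ++C refers to the values of the two increment expressions. A minimal sketch (demo_increment() is a hypothetical name):

```cpp
#include <cassert>

void demo_increment()
{
    int c = 1;
    int post = c++;   // the expression yields the old value; c is then incremented
    assert(post == 1);
    assert(c == 2);

    c = 1;
    int pre = ++c;    // c is incremented first; the expression yields the new value
    assert(pre == 2);
    assert(c == 2);
}
```

Read as an expression, then, C++ still has the value of C.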
C++ was designed primarily so that my friends and I would not have to program in assembler,
C, or various modern high-level languages. Its main purpose was to make writing good programs
easier and more pleasant for the individual programmer. In the early years, there was no C++ paper
design; design, documentation, and implementation went on simultaneously. There was no ‘‘C++
project’’ either, or a ‘‘C++ design committee.’’ Throughout, C++ evolved to cope with problems
encountered by users and as a result of discussions between my friends, my colleagues, and me.
Later, the explosive growth of C++ use caused some changes. Sometime during 1987, it
became clear that formal standardization of C++ was inevitable and that we needed to start preparing the ground for a standardization effort [Stroustrup,1994]. The result was a conscious effort to
maintain contact between implementers of C++ compilers and major users through paper and electronic mail and through face-to-face meetings at C++ conferences and elsewhere.
AT&T Bell Laboratories made a major contribution to this by allowing me to share drafts of
revised versions of the C++ reference manual with implementers and users. Because many of these
people work for companies that could be seen as competing with AT&T, the significance of this
contribution should not be underestimated. A less enlightened company could have caused major
problems of language fragmentation simply by doing nothing. As it happened, about a hundred
individuals from dozens of organizations read and commented on what became the generally
accepted reference manual and the base document for the ANSI C++ standardization effort. Their
names can be found in The Annotated C++ Reference Manual [Ellis,1989]. Finally, the X3J16
committee of ANSI was convened in December 1989 at the initiative of Hewlett-Packard. In June
1991, this ANSI (American national) standardization of C++ became part of an ISO (international)
standardization effort for C++. From 1990, these joint C++ standards committees have been the
main forum for the evolution of C++ and the refinement of its definition. I served on these committees throughout. In particular, as the chairman of the working group for extensions, I was directly
responsible for the handling of proposals for major changes to C++ and the addition of new language features. An initial draft standard for public review was produced in April 1995. The ISO
C++ standard (ISO/IEC 14882) was ratified in 1998.
C++ evolved hand-in-hand with some of the key classes presented in this book. For example, I
designed complex, vector, and stack classes together with the operator overloading mechanisms.
String and list classes were developed by Jonathan Shopiro and me as part of the same effort.
Jonathan’s string and list classes were the first to see extensive use as part of a library. The string
class from the standard C++ library has its roots in these early efforts. The task library described in
[Stroustrup,1987] and in §12.7[11] was part of the first ‘‘C with Classes’’ program ever written. I
wrote it and its associated classes to support Simula-style simulations. The task library has been
revised and reimplemented, notably by Jonathan Shopiro, and is still in extensive use. The stream
library as described in the first edition of this book was designed and implemented by me. Jerry
Schwarz transformed it into the iostreams library (Chapter 21) using Andrew Koenig’s manipulator
technique (§21.4.6) and other ideas. The iostreams library was further refined during standardization, when the bulk of the work was done by Jerry Schwarz, Nathan Myers, and Norihiro Kumagai.
The development of the template facility was influenced by the vector, map, list, and sort templates devised by Andrew Koenig, Alex Stepanov, me, and others. In turn, Alex Stepanov’s work
on generic programming using templates led to the containers and algorithms parts of the standard
C++ library (§16.3, Chapter 17, Chapter 18, §19.2). The valarray library for numerical computation (Chapter 22) is primarily the work of Kent Budge.
1.5 Use of C++
C++ is used by hundreds of thousands of programmers in essentially every application domain.
This use is supported by about a dozen independent implementations, hundreds of libraries, hundreds of textbooks, several technical journals, many conferences, and innumerable consultants.
Training and education at a variety of levels are widely available.
Early applications tended to have a strong systems programming flavor. For example, several
major operating systems have been written in C++ [Campbell,1987] [Rozier,1988] [Hamilton,1993]
[Berg,1995] [Parrington,1995] and many more have key parts done in C++. I considered uncompromising low-level efficiency essential for C++. This allows us to use C++ to write device drivers
and other software that rely on direct manipulation of hardware under real-time constraints. In such
code, predictability of performance is at least as important as raw speed. Often, so is compactness
of the resulting system. C++ was designed so that every language feature is usable in code under
severe time and space constraints [Stroustrup,1994,§4.5].
Most applications have sections of code that are critical for acceptable performance. However,
the largest amount of code is not in such sections. For most code, maintainability, ease of extension, and ease of testing are key. C++’s support for these concerns has led to its widespread use
where reliability is a must and in areas where requirements change significantly over time. Examples are banking, trading, insurance, telecommunications, and military applications. For years, the
central control of the U.S. long-distance telephone system has relied on C++ and every 800 call
(that is, a call paid for by the called party) has been routed by a C++ program [Kamath,1993].
Many such applications are large and long-lived. As a result, stability, compatibility, and scalability have been constant concerns in the development of C++. Million-line C++ programs are not
uncommon.
Like C, C++ wasn’t specifically designed with numerical computation in mind. However, much
numerical, scientific, and engineering computation is done in C++. A major reason for this is that
traditional numerical work must often be combined with graphics and with computations relying on
data structures that don’t fit into the traditional Fortran mold [Budge,1992] [Barton,1994]. Graphics and user interfaces are areas in which C++ is heavily used. Anyone who has used either an
Apple Macintosh or a PC running Windows has indirectly used C++ because the primary user interfaces of these systems are C++ programs. In addition, some of the most popular libraries supporting X for UNIX are written in C++. Thus, C++ is a common choice for the vast number of applications in which the user interface is a major part.
All of this points to what may be C++’s greatest strength: its ability to be used effectively for
applications that require work in a variety of application areas. It is quite common to find an application that involves local and wide-area networking, numerics, graphics, user interaction, and database access. Traditionally, such application areas have been considered distinct, and they have
most often been served by distinct technical communities using a variety of programming languages. However, C++ has been widely used in all of those areas. Furthermore, it is able to coexist
with code fragments and programs written in other languages.
C++ is widely used for teaching and research. This has surprised some who – correctly – point
out that C++ isn’t the smallest or cleanest language ever designed. It is, however
– clean enough for successful teaching of basic concepts,
– realistic, efficient, and flexible enough for demanding projects,
– available enough for organizations and collaborations relying on diverse development and
execution environments,
– comprehensive enough to be a vehicle for teaching advanced concepts and techniques, and
– commercial enough to be a vehicle for putting what is learned into non-academic use.
C++ is a language that you can grow with.
1.6 C and C++
C was chosen as the base language for C++ because it
[1] is versatile, terse, and relatively low-level;
[2] is adequate for most systems programming tasks;
[3] runs everywhere and on everything; and
[4] fits into the UNIX programming environment.
C has its problems, but a language designed from scratch would have some too, and we know C’s
problems. Importantly, working with C enabled ‘‘C with Classes’’ to be a useful (if awkward) tool
within months of the first thought of adding Simula-like classes to C.
As C++ became more widely used, and as the facilities it provided over and above those of C
became more significant, the question of whether to retain compatibility was raised again and
again. Clearly some problems could be avoided if some of the C heritage was rejected (see, e.g.,
[Sethi,1981]). This was not done because
[1] there are millions of lines of C code that might benefit from C++, provided that a complete
rewrite from C to C++ were unnecessary;
[2] there are millions of lines of library functions and utility software code written in C that
could be used from/on C++ programs provided C++ were link-compatible with and syntactically very similar to C;
[3] there are hundreds of thousands of programmers who know C and therefore need only learn
to use the new features of C++ and not relearn the basics; and
[4] C++ and C will be used on the same systems by the same people for years, so the differences should be either very large or very small so as to minimize mistakes and confusion.
The definition of C++ has been revised to ensure that a construct that is both legal C and legal C++
has the same meaning in both languages (with a few minor exceptions; see §B.2).
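One well-known example of such a minor exception is the type of a character literal: it is char in C++ but int in C, so the same sizeof expression can yield different values in the two languages. A minimal sketch (demo_char_literal_size() is a hypothetical name):

```cpp
#include <cassert>

void demo_char_literal_size()
{
    // In C++ a character literal such as 'a' has type char,
    // and sizeof(char) is 1 by definition.
    assert(sizeof('a') == sizeof(char));
    assert(sizeof(char) == 1);

    // Compiled as C, sizeof('a') would equal sizeof(int) instead,
    // because a character literal has type int in C.
}
```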
The C language has itself evolved, partly under the influence of the development of C++
[Rosler,1984]. The ANSI C standard [C,1990] contains a function declaration syntax borrowed
from ‘‘C with Classes.’’ Borrowing works both ways. For example, the void* pointer type was
invented for ANSI C and first implemented in C++. As promised in the first edition of this book,
the definition of C++ has been reviewed to remove gratuitous incompatibilities; C++ is now more
compatible with C than it was originally. The ideal was for C++ to be as close to ANSI C as possible – but no closer [Koenig,1989]. One hundred percent compatibility was never a goal because
that would compromise type safety and the smooth integration of user-defined and built-in types.
Knowing C is not a prerequisite for learning C++. Programming in C encourages many techniques and tricks that are rendered unnecessary by C++ language features. For example, explicit
type conversion (casting) is less frequently needed in C++ than it is in C (§1.6.1). However, good
C programs tend to be C++ programs. For example, every program in Kernighan and Ritchie, The
C Programming Language (2nd Edition) [Kernighan,1988], is a C++ program. Experience with
any statically typed language will be a help when learning C++.
1.6.1 Suggestions for C Programmers
The better one knows C, the harder it seems to be to avoid writing C++ in C style, thereby losing
some of the potential benefits of C++. Please take a look at Appendix B, which describes the differences between C and C++. Here are a few pointers to the areas in which C++ has better ways of
doing something than C has:
[1] Macros are almost never necessary in C++. Use const (§5.4) or enum (§4.8) to define manifest constants, inline (§7.1.1) to avoid function-calling overhead, templates (Chapter 13) to
specify families of functions and types, and namespaces (§8.2) to avoid name clashes.
[2] Don’t declare a variable before you need it so that you can initialize it immediately. A
declaration can occur anywhere a statement can (§6.3.1), in for-statement initializers
(§6.3.3), and in conditions (§6.3.2.1).
[3] Don’t use malloc(). The new operator (§6.2.6) does the same job better, and instead of
realloc(), try a vector (§3.8).
[4] Try to avoid void*, pointer arithmetic, unions, and casts, except deep within the implementation of some function or class. In most cases, a cast is an indication of a design error. If
you must use an explicit type conversion, try using one of the ‘‘new casts’’ (§6.2.7) for a
more precise statement of what you are trying to do.
[5] Minimize the use of arrays and C-style strings. The C++ standard library string (§3.5) and
vector (§3.7.1) classes can often be used to simplify programming compared to traditional C
style. In general, try not to build yourself what has already been provided by the standard
library.
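Several of the suggestions above can be seen together in a short sketch. The namespace sketch and the names max_items, larger(), and demo_suggestions() are hypothetical; the comments indicate the C-style construct that each line stands in for.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {                        // a namespace avoids name clashes
    const int max_items = 3;              // instead of: #define MAX_ITEMS 3

    template<class T>                     // instead of: #define MAX(a,b) ((a)>(b)?(a):(b))
    inline const T& larger(const T& a, const T& b)
    {
        return a < b ? b : a;
    }
}

void demo_suggestions()
{
    std::vector<int> v;                   // instead of malloc() and realloc()
    for (int i = 0; i < sketch::max_items; ++i)  // i is declared where it is needed
        v.push_back(i * 10);              // the vector grows as required

    assert(v.size() == 3);
    assert(sketch::larger(v[0], v[2]) == 20);

    std::string s = "C";                  // instead of a char array and strcat()
    s += "++";
    assert(s == "C++");
}
```

Unlike the macro it replaces, larger() evaluates each argument exactly once and obeys the usual scope and type rules.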
To obey C linkage conventions, a C++ function must be declared to have C linkage (§9.2.4).
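A minimal sketch of such a declaration follows. The function name add_ints() is hypothetical; the extern "C" linkage specification asks the implementation to use C calling and naming conventions for the function, so that it can be called from C code.

```cpp
#include <cassert>

// Defined with C linkage: callable from C under the plain name add_ints.
extern "C" int add_ints(int a, int b)
{
    return a + b;
}

void demo_c_linkage()
{
    // Within C++ the function is called like any other.
    assert(add_ints(2, 3) == 5);
}
```

A function with C linkage cannot be overloaded with another function of the same name that also has C linkage, because C has no notion of overloading.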
Most important, try thinking of a program as a set of interacting concepts represented as classes
and objects, instead of as a bunch of data structures with functions twiddling their bits.
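As a rough sketch of suggestions [1], [2], [3], and [5] above, the following fragment replaces a macro constant, a function-like macro, a buffer grown with malloc()/realloc(), and a C-style string with their C++ counterparts. The names max_items, square, collect_squares, and greet are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

const int max_items = 100;          // instead of: #define MAX_ITEMS 100

// instead of: #define SQUARE(x) ((x)*(x))
template<class T> inline T square(T x) { return x*x; }

// instead of growing a malloc()ed buffer with realloc()
std::vector<int> collect_squares(int n)
{
    assert(0 <= n && n <= max_items);   // precondition of this sketch
    std::vector<int> v;
    for (int i = 0; i < n; ++i) {       // i is declared where it is needed
        v.push_back(square(i));         // the vector grows as required
    }
    return v;
}

// instead of strcpy()/strcat() into a fixed-size char array
std::string greet(const std::string& name)
{
    return "Hello, " + name + "!";
}
```

Unlike the macro, square() evaluates its argument exactly once and is checked by the type system; the vector and the string manage their own memory.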
1.6.2 Suggestions for C++ Programmers
By now, many people have been using C++ for a decade. Many more are using C++ in a single
environment and have learned to live with the restrictions imposed by early compilers and first-generation libraries. Often, what an experienced C++ programmer has failed to notice over the
years is not the introduction of new features as such, but rather the changes in relationships between
features that make fundamental new programming techniques feasible. In other words, what you
didn’t think of when first learning C++ or found impractical just might be a superior approach
today. You find out only by re-examining the basics.
Read through the chapters in order. If you already know the contents of a chapter, you can be
through in minutes. If you don’t already know the contents, you’ll have learned something unexpected. I learned a fair bit writing this book, and I suspect that hardly any C++ programmer knows
every feature and technique presented. Furthermore, to use the language well, you need a perspective that brings order to the set of features and techniques. Through its organization and examples,
this book offers such a perspective.
1.7 Thinking about Programming in C++
Ideally, you approach the task of designing a program in three stages. First, you gain a clear understanding of the problem (analysis), then you identify the key concepts involved in a solution
(design), and finally you express that solution in a program (programming). However, the details
of the problem and the concepts of the solution often become clearly understood only through the
effort to express them in a program and trying to get it to run acceptably. This is where the choice
of programming language matters.
In most applications, there are concepts that are not easily represented as one of the fundamental
types or as a function without associated data. Given such a concept, declare a class to represent it
in the program. A C++ class is a type. That is, it specifies how objects of its class behave: how they
are created, how they can be manipulated, and how they are destroyed. A class may also specify
how objects are represented, although in the early stages of the design of a program that should not
be the major concern. The key to writing good programs is to design classes so that each cleanly
represents a single concept. Often, this means that you must focus on questions such as: How are
objects of this class created? Can objects of this class be copied and/or destroyed? What operations can be applied to such objects? If there are no good answers to such questions, the concept
probably wasn’t ‘‘clean’’ in the first place. It might then be a good idea to think more about the
problem and its proposed solution instead of immediately starting to ‘‘code around’’ the problems.
The concepts that are easiest to deal with are the ones that have a traditional mathematical formalism: numbers of all sorts, sets, geometric shapes, etc. Text-oriented I/O, strings, basic containers, the fundamental algorithms on such containers, and some mathematical classes are part of the
standard C++ library (Chapter 3, §16.1.2). In addition, a bewildering variety of libraries supporting
general and domain-specific concepts are available.
A concept does not exist in a vacuum; there are always clusters of related concepts. Organizing
the relationship between classes in a program – that is, determining the exact relationship between
the different concepts involved in a solution – is often harder than laying out the individual classes
in the first place. The result had better not be a muddle in which every class (concept) depends on
every other. Consider two classes, A and B. Relationships such as ‘‘A calls functions from B,’’
‘‘A creates Bs,’’ and ‘‘A has a B member’’ seldom cause major problems, while relationships such
as ‘‘A uses data from B’’ can typically be eliminated.
One of the most powerful intellectual tools for managing complexity is hierarchical ordering,
that is, organizing related concepts into a tree structure with the most general concept as the root.
In C++, derived classes represent such structures. A program can often be organized as a set of
trees or directed acyclic graphs of classes. That is, the programmer specifies a number of base
classes, each with its own set of derived classes. Virtual functions (§2.5.5, §12.2.6) can often be
used to define operations for the most general version of a concept (a base class). When necessary,
the interpretation of these operations can be refined for particular special cases (derived classes).
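A minimal sketch of such a hierarchy, assuming a hypothetical base class Shape with two derived classes: the base class defines the general version of each operation, and the derived classes refine only what differs. All names here are illustrative, not taken from the text.

```cpp
#include <cassert>
#include <string>

class Shape {                       // the most general concept: the root of the tree
public:
    virtual double area() const { return 0; }                // general version
    virtual std::string name() const { return "shape"; }     // general version
    virtual ~Shape() { }
};

class Rectangle : public Shape {    // special case: refines both operations
    double w, h;
public:
    Rectangle(double ww, double hh) : w(ww), h(hh) { assert(ww >= 0 && hh >= 0); }
    double area() const { return w*h; }
    std::string name() const { return "rectangle"; }
};

class Point : public Shape {        // special case: refines name() only
public:
    std::string name() const { return "point"; }
};

// Written once against the base class; works for every derived class.
std::string describe(const Shape& s)
{
    return s.name() + (s.area() > 0 ? " with area" : " without area");
}
```

Calling describe() with a Rectangle or a Point selects the refined operations at run time, without describe() knowing which derived classes exist.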
Sometimes even a directed acyclic graph seems insufficient for organizing the concepts of a
program; some concepts seem to be inherently mutually dependent. In that case, we try to localize
cyclic dependencies so that they do not affect the overall structure of the program. If you cannot
eliminate or localize such mutual dependencies, then you are most likely in a predicament that no
programming language can help you out of. Unless you can conceive of some easily stated relationships between the basic concepts, the program is likely to become unmanageable.
One of the best tools for untangling dependency graphs is the clean separation of interface and
implementation. Abstract classes (§2.5.4, §12.3) are C++’s primary tool for doing that.
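The separation can be sketched as follows, assuming a hypothetical character stack: the abstract class Stack states the interface and nothing else, while Array_stack is one of possibly many implementations. The function last_pushed() stands in for user code and depends on the interface only.

```cpp
#include <cassert>

class Stack {                       // interface: no data, no representation
public:
    virtual void push(char c) = 0;
    virtual char pop() = 0;
    virtual bool empty() const = 0;
    virtual ~Stack() { }
};

class Array_stack : public Stack {  // one possible implementation
    enum { max_size = 200 };
    char v[max_size];
    int top;
public:
    Array_stack() : top(0) { }
    void push(char c) { assert(top < max_size); v[top++] = c; }
    char pop() { assert(top > 0); return v[--top]; }
    bool empty() const { return top == 0; }
};

// User code: pushes every character of text, then pops the last one.
// It must be given a non-empty string.
char last_pushed(Stack& s, const char* text)
{
    for (const char* p = text; *p; ++p) s.push(*p);
    return s.pop();
}
```

Replacing Array_stack by, say, a list-based implementation would not require any change to last_pushed().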
Another form of commonality can be expressed through templates (§2.7, Chapter 13). A class
template specifies a family of classes. For example, a list template specifies ‘‘list of T,’’ where
‘‘T’’ can be any type. Thus, a template is a mechanism for specifying how one type is generated
given another type as an argument. The most common templates are container classes such as lists,
vectors, and associative arrays (maps) and the fundamental algorithms using such containers. It is
usually a mistake to express parameterization of a class and its associated functions with a type
using inheritance. It is best done using templates.
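To make ‘‘list of T’’ concrete, here is a deliberately small sketch of a singly-linked list template. It is not the standard library list; the member names are assumptions made for illustration, and copying is simply disabled to keep the example short.

```cpp
#include <cassert>
#include <cstddef>

template<class T> class List {
    struct Link {
        T val;
        Link* next;
        Link(const T& v, Link* n) : val(v), next(n) { }
    };
    Link* head;
    std::size_t sz;
    List(const List&);              // copying is not supported in this sketch
    List& operator=(const List&);
public:
    List() : head(0), sz(0) { }
    ~List() { while (head) { Link* p = head; head = head->next; delete p; } }
    void push_front(const T& v) { head = new Link(v, head); ++sz; }
    const T& front() const { assert(head != 0); return head->val; }
    std::size_t size() const { return sz; }
};
```

Given this one definition, List<int>, List<char>, and List<double> are distinct types generated by the compiler; no common base class is involved.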
Remember that much programming can be simply and clearly done using only primitive types,
data structures, plain functions, and a few library classes. The whole apparatus involved in defining new types should not be used except when there is a real need.
The question ‘‘How does one write good programs in C++?’’ is very similar to the question
‘‘How does one write good English prose?’’ There are two answers: ‘‘Know what you want to
say’’ and ‘‘Practice. Imitate good writing.’’ Both appear to be as appropriate for C++ as they are
for English – and as hard to follow.
1.8 Advice
Here is a set of ‘‘rules’’ you might consider while learning C++. As you get more proficient you
can evolve them into something suitable for your kind of applications and your style of programming. They are deliberately very simple, so they lack detail. Don’t take them too literally. To
write a good program takes intelligence, taste, and patience. You are not going to get it right the
first time. Experiment!
[1] When you program, you create a concrete representation of the ideas in your solution to some
problem. Let the structure of the program reflect those ideas as directly as possible:
[a] If you can think of ‘‘it’’ as a separate idea, make it a class.
[b] If you can think of ‘‘it’’ as a separate entity, make it an object of some class.
[c] If two classes have a common interface, make that interface an abstract class.
[d] If the implementations of two classes have something significant in common, make that
commonality a base class.
[e] If a class is a container of objects, make it a template.
[f] If a function implements an algorithm for a container, make it a template function implementing the algorithm for a family of containers.
[g] If a set of classes, templates, etc., are logically related, place them in a common namespace.
[2] When you define a class that does not implement either a mathematical entity like a
matrix or a complex number or a low-level type such as a linked list:
[a] Don’t use global data (use members).
[b] Don’t use global functions.
[c] Don’t use public data members.
[d] Don’t use friends, except to avoid [a] or [c].
[e] Don’t put a ‘‘type field’’ in a class; use virtual functions.
[f] Don’t use inline functions, except as a significant optimization.
More specific or detailed rules of thumb can be found in the ‘‘Advice’’ section of each chapter.
Remember, this advice is only rough rules of thumb, not immutable laws. A piece of advice should
be applied only ‘‘where reasonable.’’ There is no substitute for intelligence, experience, common
sense, and good taste.
I find rules of the form ‘‘never do this’’ unhelpful. Consequently, most advice is phrased as
suggestions of what to do, while negative suggestions tend not to be phrased as absolute prohibitions. I know of no major feature of C++ that I have not seen put to good use. The ‘‘Advice’’ sections do not contain explanations. Instead, each piece of advice is accompanied by a reference to
the appropriate section of the book. Where negative advice is given, that section usually provides a
suggested alternative.
1.8.1 References
There are few direct references in the text, but here is a short list of books and papers that are mentioned directly or indirectly.
[Barton,1994] John J. Barton and Lee R. Nackman: Scientific and Engineering C++. Addison-Wesley. Reading, Mass. 1994. ISBN 0-201-53393-6.
[Berg,1995] William Berg, Marshall Cline, and Mike Girou: Lessons Learned from the OS/400 OO Project. CACM. Vol. 38 No. 10. October 1995.
[Booch,1994] Grady Booch: Object-Oriented Analysis and Design. Benjamin/Cummings. Menlo Park, Calif. 1994. ISBN 0-8053-5340-2.
[Budge,1992] Kent Budge, J. S. Perry, and A. C. Robinson: High-Performance Scientific Computation using C++. Proc. USENIX C++ Conference. Portland, Oregon. August 1992.
[C,1990] X3 Secretariat: Standard – The C Language. X3J11/90-013. ISO Standard ISO/IEC 9899. Computer and Business Equipment Manufacturers Association. Washington, DC, USA.
[C++,1998] X3 Secretariat: International Standard – The C++ Language. X3J16-14882. Information Technology Council (NSITC). Washington, DC, USA.
[Campbell,1987] Roy Campbell, et al.: The Design of a Multiprocessor Operating System. Proc. USENIX C++ Conference. Santa Fe, New Mexico. November 1987.
[Coplien,1995] James O. Coplien and Douglas C. Schmidt (editors): Pattern Languages of Program Design. Addison-Wesley. Reading, Mass. 1995. ISBN 0-201-60734-4.
[Dahl,1970] O-J. Dahl, B. Myrhaug, and K. Nygaard: SIMULA Common Base Language. Norwegian Computing Center S-22. Oslo, Norway. 1970.
[Dahl,1972] O-J. Dahl and C. A. R. Hoare: Hierarchical Program Construction in Structured Programming. Academic Press, New York. 1972.
[Ellis,1989] Margaret A. Ellis and Bjarne Stroustrup: The Annotated C++ Reference Manual. Addison-Wesley. Reading, Mass. 1990. ISBN 0-201-51459-1.
[Gamma,1995] Erich Gamma, et al.: Design Patterns. Addison-Wesley. Reading, Mass. 1995. ISBN 0-201-63361-2.
[Goldberg,1983] A. Goldberg and D. Robson: SMALLTALK-80 – The Language and Its Implementation. Addison-Wesley. Reading, Mass. 1983.
[Griswold,1970] R. E. Griswold, et al.: The Snobol4 Programming Language. Prentice-Hall. Englewood Cliffs, New Jersey. 1970.
[Griswold,1983] R. E. Griswold and M. T. Griswold: The ICON Programming Language. Prentice-Hall. Englewood Cliffs, New Jersey. 1983.
[Hamilton,1993] G. Hamilton and P. Kougiouris: The Spring Nucleus: A Microkernel for Objects. Proc. 1993 Summer USENIX Conference. USENIX.
[Henricson,1997] Mats Henricson and Erik Nyquist: Industrial Strength C++: Rules and Recommendations. Prentice-Hall. Englewood Cliffs, New Jersey. 1997. ISBN 0-13-120965-5.
[Ichbiah,1979] Jean D. Ichbiah, et al.: Rationale for the Design of the ADA Programming Language. SIGPLAN Notices. Vol. 14 No. 6. June 1979.
[Kamath,1993] Yogeesh H. Kamath, Ruth E. Smilan, and Jean G. Smith: Reaping Benefits with Object-Oriented Technology. AT&T Technical Journal. Vol. 72 No. 5. September/October 1993.
[Kernighan,1978] Brian W. Kernighan and Dennis M. Ritchie: The C Programming Language. Prentice-Hall. Englewood Cliffs, New Jersey. 1978.
[Kernighan,1988] Brian W. Kernighan and Dennis M. Ritchie: The C Programming Language (Second Edition). Prentice-Hall. Englewood Cliffs, New Jersey. 1988. ISBN 0-13-110362-8.
[Koenig,1989] Andrew Koenig and Bjarne Stroustrup: C++: As close to C as possible – but no closer. The C++ Report. Vol. 1 No. 7. July 1989.
[Koenig,1997] Andrew Koenig and Barbara Moo: Ruminations on C++. Addison Wesley Longman. Reading, Mass. 1997. ISBN 0-201-42339-1.
[Knuth,1968] Donald Knuth: The Art of Computer Programming. Addison-Wesley. Reading, Mass.
[Liskov,1979] Barbara Liskov et al.: Clu Reference Manual. MIT/LCS/TR-225. MIT Cambridge. Mass. 1979.
[Martin,1995] Robert C. Martin: Designing Object-Oriented C++ Applications Using the Booch Method. Prentice-Hall. Englewood Cliffs, New Jersey. 1995. ISBN 0-13-203837-4.
[Orwell,1949] George Orwell: 1984. Secker and Warburg. London. 1949.
[Parrington,1995] Graham Parrington et al.: The Design and Implementation of Arjuna. Computer Systems. Vol. 8 No. 3. Summer 1995.
[Richards,1980] Martin Richards and Colin Whitby-Strevens: BCPL – The Language and Its Compiler. Cambridge University Press, Cambridge. England. 1980. ISBN 0-521-21965-5.
[Rosler,1984] L. Rosler: The Evolution of C – Past and Future. AT&T Bell Laboratories Technical Journal. Vol. 63 No. 8. Part 2. October 1984.
[Rozier,1988] M. Rozier, et al.: CHORUS Distributed Operating Systems. Computing Systems. Vol. 1 No. 4. Fall 1988.
[Sethi,1981] Ravi Sethi: Uniform Syntax for Type Expressions and Declarations. Software Practice & Experience. Vol. 11. 1981.
[Stepanov,1994] Alexander Stepanov and Meng Lee: The Standard Template Library. HP Labs Technical Report HPL-94-34 (R. 1). August, 1994.
[Stroustrup,1986] Bjarne Stroustrup: The C++ Programming Language. Addison-Wesley. Reading, Mass. 1986. ISBN 0-201-12078-X.
[Stroustrup,1987] Bjarne Stroustrup and Jonathan Shopiro: A Set of C Classes for Co-Routine Style Programming. Proc. USENIX C++ Conference. Santa Fe, New Mexico. November 1987.
[Stroustrup,1991] Bjarne Stroustrup: The C++ Programming Language (Second Edition). Addison-Wesley. Reading, Mass. 1991. ISBN 0-201-53992-6.
[Stroustrup,1994] Bjarne Stroustrup: The Design and Evolution of C++. Addison-Wesley. Reading, Mass. 1994. ISBN 0-201-54330-3.
[Tarjan,1983] Robert E. Tarjan: Data Structures and Network Algorithms. Society for Industrial and Applied Mathematics. Philadelphia, Penn. 1983. ISBN 0-89871-187-8.
[Unicode,1996] The Unicode Consortium: The Unicode Standard, Version 2.0. Addison-Wesley Developers Press. Reading, Mass. 1996. ISBN 0-201-48345-9.
[UNIX,1985] UNIX Time-Sharing System: Programmer’s Manual. Research Version, Tenth Edition. AT&T Bell Laboratories, Murray Hill, New Jersey. February 1985.
[Wilson,1996] Gregory V. Wilson and Paul Lu (editors): Parallel Programming Using C++. The MIT Press. Cambridge. Mass. 1996. ISBN 0-262-73118-5.
[Wikström,1987] Åke Wikström: Functional Programming Using ML. Prentice-Hall. Englewood Cliffs, New Jersey. 1987.
[Woodward,1974] P. M. Woodward and S. G. Bond: Algol 68-R Users Guide. Her Majesty’s Stationery Office. London. England. 1974.
References to books relating to design and larger software development issues can be found at the
end of Chapter 23.
2
A Tour of C++
The first thing we do, let’s
kill all the language lawyers.
– Henry VI, part II
What is C++? — programming paradigms — procedural programming — modularity —
separate compilation — exception handling — data abstraction — user-defined types —
concrete types — abstract types — virtual functions — object-oriented programming —
generic programming — containers — algorithms — language and programming —
advice.
2.1 What is C++? [tour.intro]
C++ is a general-purpose programming language with a bias towards systems programming that
– is a better C,
– supports data abstraction,
– supports object-oriented programming, and
– supports generic programming.
This chapter explains what this means without going into the finer details of the language definition. Its purpose is to give you a general overview of C++ and the key techniques for using it, not
to provide you with the detailed information necessary to start programming in C++.
If you find some parts of this chapter rough going, just ignore those parts and plow on. All will
be explained in detail in later chapters. However, if you do skip part of this chapter, do yourself a
favor by returning to it later.
Detailed understanding of language features – even of all features of a language – cannot compensate for lack of an overall view of the language and the fundamental techniques for using it.
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
2.2 Programming Paradigms [tour.paradigm]
Object-oriented programming is a technique for programming – a paradigm for writing ‘‘good’’
programs for a set of problems. If the term ‘‘object-oriented programming language’’ means anything, it must mean a programming language that provides mechanisms that support the object-oriented style of programming well.
There is an important distinction here. A language is said to support a style of programming if
it provides facilities that make it convenient (reasonably easy, safe, and efficient) to use that style.
A language does not support a technique if it takes exceptional effort or skill to write such programs; it merely enables the technique to be used. For example, you can write structured programs
in Fortran77 and object-oriented programs in C, but it is unnecessarily hard to do so because these
languages do not directly support those techniques.
Support for a paradigm comes not only in the obvious form of language facilities that allow
direct use of the paradigm, but also in the more subtle form of compile-time and/or run-time checks
against unintentional deviation from the paradigm. Type checking is the most obvious example of
this; ambiguity detection and run-time checks are also used to extend linguistic support for paradigms. Extra-linguistic facilities such as libraries and programming environments can provide further support for paradigms.
One language is not necessarily better than another because it possesses a feature the other does
not. There are many examples to the contrary. The important issue is not so much what features a
language possesses, but that the features it does possess are sufficient to support the desired programming styles in the desired application areas:
[1] All features must be cleanly and elegantly integrated into the language.
[2] It must be possible to use features in combination to achieve solutions that would otherwise
require extra, separate features.
[3] There should be as few spurious and ‘‘special-purpose’’ features as possible.
[4] A feature’s implementation should not impose significant overheads on programs that do
not require it.
[5] A user should need to know only about the subset of the language explicitly used to write a
program.
The first principle is an appeal to aesthetics and logic. The next two are expressions of the ideal of
minimalism. The last two can be summarized as ‘‘what you don’t know won’t hurt you.’’
C++ was designed to support data abstraction, object-oriented programming, and generic programming in addition to traditional C programming techniques under these constraints. It was not
meant to force one particular programming style upon all users.
The following sections consider some programming styles and the key language mechanisms
supporting them. The presentation progresses through a series of techniques starting with procedural programming and leading up to the use of class hierarchies in object-oriented programming and
generic programming using templates. Each paradigm builds on its predecessors, each adds something new to the C++ programmer’s toolbox, and each reflects a proven design approach.
The presentation of language features is not exhaustive. The emphasis is on design approaches
and ways of organizing programs rather than on language details. At this stage, it is far more
important to gain an idea of what can be done using C++ than to understand exactly how it can be
achieved.
2.3 Procedural Programming [tour.proc]
The original programming paradigm is:
Decide which procedures you want;
use the best algorithms you can find.
The focus is on the processing – the algorithm needed to perform the desired computation. Languages support this paradigm by providing facilities for passing arguments to functions and returning values from functions. The literature related to this way of thinking is filled with discussion of
ways to pass arguments, ways to distinguish different kinds of arguments, different kinds of functions (e.g., procedures, routines, and macros), etc.
A typical example of ‘‘good style’’ is a square-root function. Given a double-precision
floating-point argument, it produces a result. To do this, it performs a well-understood mathematical computation:
    double sqrt(double arg)
    {
        // code for calculating a square root
    }

    void f()
    {
        double root2 = sqrt(2);
        // ...
    }

Curly braces, { }, express grouping in C++. Here, they indicate the start and end of the function
bodies. The double slash, //, begins a comment that extends to the end of the line. The keyword
void indicates that a function does not return a value.
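The text leaves the body of the square-root function open. Purely as an illustration, one well-understood computation it could perform is Newton's iteration; the sketch below assumes that choice, uses the hypothetical name my_sqrt to avoid colliding with the standard library's sqrt(), and handles non-negative arguments only.

```cpp
#include <cassert>

double my_sqrt(double arg)
{
    assert(arg >= 0);                   // this sketch does not handle negative input
    if (arg == 0) return 0;

    double x = arg > 1 ? arg : 1;       // initial guess at or above the root
    for (;;) {
        double next = 0.5*(x + arg/x);  // one Newton step
        if (!(next < x)) break;         // no further progress: converged
        x = next;
    }
    return x;
}
```

Starting at or above the root makes the sequence of guesses decrease steadily, so the loop can stop as soon as a step fails to produce a smaller value.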
From the point of view of program organization, functions are used to create order in a maze of
algorithms. The algorithms themselves are written using function calls and other language facilities. The following subsections present a thumb-nail sketch of C++’s most basic facilities for
expressing computation.
2.3.1 Variables and Arithmetic [tour.var]
Every name and every expression has a type that determines the operations that may be performed
on it. For example, the declaration
    int inch;

specifies that inch is of type int; that is, inch is an integer variable.
A declaration is a statement that introduces a name into the program. It specifies a type for that
name. A type defines the proper use of a name or an expression.
C++ offers a variety of fundamental types, which correspond directly to hardware facilities. For
example:
    bool      // Boolean, possible values are true and false
    char      // character, for example, 'a', 'z', and '9'
    int       // integer, for example, 1, 42, and 1216
    double    // double-precision floating-point number, for example, 3.14 and 299793.0

A char variable is of the natural size to hold a character on a given machine (typically a byte), and
an int variable is of the natural size for integer arithmetic on a given machine (typically a word).
The arithmetic operators can be used for any combination of these types:
    +     // plus, both unary and binary
    -     // minus, both unary and binary
    *     // multiply
    /     // divide
    %     // remainder

So can the comparison operators:
    ==    // equal
    !=    // not equal
    <     // less than
    >     // greater than
    <=    // less than or equal
    >=    // greater than or equal

In assignments and in arithmetic operations, C++ performs all meaningful conversions between the
basic types so that they can be mixed freely:
    void some_function()    // function that doesn’t return a value
    {
        double d = 2.2;     // initialize floating-point number
        int i = 7;          // initialize integer
        d = d+i;            // assign sum to d
        i = d*i;            // assign product to i
    }

As in C, = is the assignment operator and == tests equality.
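To see what the conversions in this example actually do, the following sketch repeats the same statements in a function that returns the final value of i. The function name mixed_arithmetic is hypothetical.

```cpp
#include <cassert>

int mixed_arithmetic()
{
    double d = 2.2;     // initialize floating-point number
    int i = 7;          // initialize integer
    d = d+i;            // i is converted to double; d becomes (about) 9.2
    assert(9.1 < d && d < 9.3);
    i = d*i;            // the product, about 64.4, is truncated when assigned to an int
    return i;           // 64
}
```

The fractional part of the product is silently discarded in the last assignment; many compilers can be asked to warn about such narrowing conversions.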
2.3.2 Tests and Loops [tour.loop]
C++ provides a conventional set of statements for expressing selection and looping. For example,
here is a simple function that prompts the user and returns a Boolean indicating the response:
    bool accept()
    {
        cout << "Do you want to proceed (y or n)?\n";    // write question

        char answer = 0;
        cin >> answer;                                  // read answer

        if (answer == 'y') return true;
        return false;
    }

The << operator (‘‘put to’’) is used as an output operator; cout is the standard output stream. The
>> operator (‘‘get from’’) is used as an input operator; cin is the standard input stream. The type of
the right-hand operand of >> determines what input is accepted and is the target of the input operation. The \n character at the end of the output string represents a newline.
The example could be slightly improved by taking an ‘n’ answer into account:
    bool accept2()
    {
        cout << "Do you want to proceed (y or n)?\n";    // write question

        char answer = 0;
        cin >> answer;                                  // read answer

        switch (answer) {
        case 'y':
            return true;
        case 'n':
            return false;
        default:
            cout << "I'll take that for a no.\n";
            return false;
        }
    }

A switch-statement tests a value against a set of constants. The case constants must be distinct, and
if the value tested does not match any of them, the default is chosen. The programmer need not
provide a default.
Few programs are written without loops. In this case, we might like to give the user a few tries:
    bool accept3()
    {
        int tries = 1;
        while (tries < 4) {
            cout << "Do you want to proceed (y or n)?\n";    // write question
            char answer = 0;
            cin >> answer;                                  // read answer

            switch (answer) {
            case 'y':
                return true;
            case 'n':
                return false;
            default:
                cout << "Sorry, I don't understand that.\n";
                tries = tries + 1;
            }
        }
        cout << "I'll take that for a no.\n";
        return false;
    }

The while-statement executes until its condition becomes false.
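The functions above read from cin, which makes them awkward to try out repeatedly. As a sketch only, the same loop can take its streams as arguments so that canned input can be supplied; accept_from and accept_canned are hypothetical names, and the use of string streams is an assumption of this example rather than something the text prescribes.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Same logic as the loop above, but reading from 'in' and writing to 'out'.
bool accept_from(std::istream& in, std::ostream& out, int max_tries)
{
    assert(max_tries > 0);
    for (int tries = 0; tries < max_tries; ++tries) {
        out << "Do you want to proceed (y or n)?\n";    // write question
        char answer = 0;
        in >> answer;                                   // read answer; stays 0 if input fails

        switch (answer) {
        case 'y':
            return true;
        case 'n':
            return false;
        default:
            out << "Sorry, I don't understand that.\n";
        }
    }
    out << "I'll take that for a no.\n";
    return false;
}

// Convenience for experiments: feed canned text instead of a keyboard.
bool accept_canned(const char* typed)
{
    std::istringstream in(typed);
    std::ostringstream out;         // prompts are collected and then discarded
    return accept_from(in, out, 3);
}
```

With three tries allowed, the input "x q y" is accepted on the last try, while "a b c y" runs out of tries before the y is read.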
2.3.3 Pointers and Arrays [tour.ptr]
An array can be declared like this:
    char v[10];    // array of 10 characters

Similarly, a pointer can be declared like this:

    char* p;       // pointer to character

In declarations, [] means ‘‘array of’’ and * means ‘‘pointer to.’’ All arrays have 0 as their lower
bound, so v has ten elements, v[0]...v[9]. A pointer variable can hold the address of an object of
the appropriate type:

    p = &v[3];     // p points to v’s fourth element

Unary & is the address-of operator.
Consider copying ten elements from one array to another:
    void another_function()
    {
        int v1[10];
        int v2[10];
        // ...
        for (int i=0; i<10; ++i) v1[i]=v2[i];
    }

This for-statement can be read as ‘‘set i to zero, while i is less than 10, copy the ith element and
increment i.’’ When applied to an integer variable, the increment operator ++ simply adds 1.
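The array copy and the address-of operator can be combined in one small, checkable sketch. The function name copy_and_point and the values stored in the source array are assumptions made for illustration.

```cpp
#include <cassert>

int copy_and_point()
{
    int v1[10];
    int v2[10];
    for (int i=0; i<10; ++i) v2[i] = i*i;     // give the source array some values
    for (int i=0; i<10; ++i) v1[i] = v2[i];   // copy ten elements, v1[0]...v1[9]

    int* p = &v1[3];                          // p points to v1's fourth element
    assert(p == v1+3);                        // the same address, written with pointer arithmetic
    return *p;                                // the value of the fourth element: 3*3
}
```

Note that the index runs from 0 to 9; v1[10] would refer to an element beyond the end of the array.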
2.4 Modular Programming [tour.module]
Over the years, the emphasis in the design of programs has shifted from the design of procedures
and toward the organization of data. Among other things, this reflects an increase in program size.
A set of related procedures with the data they manipulate is often called a module. The programming paradigm becomes:
Decide which modules you want;
partition the program so that data is hidden within modules.
This paradigm is also known as the data-hiding principle. Where there is no grouping of procedures with related data, the procedural programming style suffices. Also, the techniques for designing ‘‘good procedures’’ are now applied for each procedure in a module. The most common example of a module is the definition of a stack. The main problems that have to be solved are:
[1] Provide a user interface for the stack (e.g., functions push() and pop()).
[2] Ensure that the representation of the stack (e.g., an array of elements) can be accessed only
through this user interface.
[3] Ensure that the stack is initialized before its first use.
C++ provides a mechanism for grouping related data, functions, etc., into separate namespaces. For
example, the user interface of a Stack module could be declared and used like this:

    namespace Stack {    // interface
        void push(char);
        char pop();
    }

    void f()
    {
        Stack::push('c');
        if (Stack::pop() != 'c') error("impossible");
    }

The Stack:: qualification indicates that the push() and pop() are those from the Stack namespace. Other uses of those names will not interfere or cause confusion.
The definition of the Stack could be provided in a separately-compiled part of the program:

    namespace Stack {    // implementation
        const int max_size = 200;
        char v[max_size];
        int top = 0;

        void push(char c) { /* check for overflow and push c */ }
        char pop() { /* check for underflow and pop */ }
    }

The key point about this Stack module is that the user code is insulated from the data representation
of Stack by the code implementing Stack::push() and Stack::pop(). The user doesn’t need to
know that the Stack is implemented using an array, and the implementation can be changed without
affecting user code.
Because data is only one of the things one might want to ‘‘hide,’’ the notion of data hiding is
trivially extended to the notion of information hiding; that is, the names of functions, types, etc.,
can also be made local to a module. Consequently, C++ allows any declaration to be placed in a
namespace (§8.2).
This Stack module is one way of representing a stack. The following sections use a variety of
stacks to illustrate different programming styles.
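The module above can be sketched in a single file to show that user code needs only the interface. This is a minimal sketch, not a definitive implementation: the helper round_trip() is a hypothetical name, and the policies chosen for a full or empty stack (ignore the push, return '\0') are assumptions made only so that the placeholder comments have something concrete behind them.

```cpp
// Interface: all that user code needs to see.
namespace Stack {
    void push(char);
    char pop();
}

// User code (hypothetical helper): written against the interface only.
bool round_trip(char c)
{
    Stack::push(c);
    return Stack::pop() == c;
}

// Implementation: could live in a separately compiled part of the program.
namespace Stack {
    const int max_size = 200;
    char v[max_size];
    int top = 0;

    // Assumption: a push onto a full stack is silently ignored.
    void push(char c) { if (top < max_size) v[top++] = c; }

    // Assumption: popping an empty stack yields '\0'.
    char pop() { return top > 0 ? v[--top] : '\0'; }
}
```

Because round_trip() is compiled before the representation is even declared, replacing the array with another data structure would not require any change to it.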
2.4.1 Separate Compilation [tour.comp]
C++ supports C’s notion of separate compilation. This can be used to organize a program into a set
of semi-independent fragments.
Typically, we place the declarations that specify the interface to a module in a file with a name
indicating its intended use. Thus,
    namespace Stack {       // interface
        void push(char);
        char pop();
    }
would be placed in a file stack.h, and users will include that file, called a header file, like this:
    #include "stack.h"      // get the interface

    void f()
    {
        Stack::push('c');
        if (Stack::pop() != 'c') error("impossible");
    }
To help the compiler ensure consistency, the file providing the implementation of the Stack module
will also include the interface:
    #include "stack.h"      // get the interface

    namespace Stack {       // representation
        const int max_size = 200;
        char v[max_size];
        int top = 0;
    }

    void Stack::push(char c) { /* check for overflow and push c */ }
    char Stack::pop() { /* check for underflow and pop */ }
The user code goes in a third file, say user.c. The code in user.c and stack.c shares the stack
interface information presented in stack.h, but the two files are otherwise independent and can be
separately compiled. Graphically, the program fragments can be represented like this:
    stack.h:    Stack interface
    user.c:     #include "stack.h"
                use stack
    stack.c:    #include "stack.h"
                define stack
Separate compilation is an issue in all real programs. It is not simply a concern in programs that
present facilities, such as a Stack, as modules. Strictly speaking, using separate compilation isn’t a
language issue; it is an issue of how best to take advantage of a particular language implementation.
However, it is of great practical importance. The best approach is to maximize modularity, represent that modularity logically through language features, and then exploit the modularity physically
through files for effective separate compilation (Chapter 8, Chapter 9).
2.4.2 Exception Handling [tour.except]
When a program is designed as a set of modules, error handling must be considered in light of these
modules. Which module is responsible for handling what errors? Often, the module that detects an
error doesn’t know what action to take. The recovery action depends on the module that invoked
the operation rather than on the module that found the error while trying to perform the operation.
As programs grow, and especially when libraries are used extensively, standards for handling errors
(or, more generally, ‘‘exceptional circumstances’’) become important.
Consider again the Stack example. What ought to be done when we try to push() one too
many characters? The writer of the Stack module doesn’t know what the user would like to be
done in this case, and the user cannot consistently detect the problem (if the user could, the overflow wouldn’t happen in the first place). The solution is for the Stack implementer to detect the
overflow and then tell the (unknown) user. The user can then take appropriate action. For example:
    namespace Stack {       // interface
        void push(char);
        char pop();

        class Overflow { }; // type representing overflow exceptions
    }
When detecting an overflow, Stack::push() can invoke the exception-handling code; that is,
‘‘throw an Overflow exception:’’
    void Stack::push(char c)
    {
        if (top == max_size) throw Overflow();
        // push c
    }
The throw transfers control to a handler for exceptions of type Stack::Overflow in some function
that directly or indirectly called Stack::push(). To do that, the implementation will unwind the
function call stack as needed to get back to the context of that caller. Thus, the throw acts as a multilevel return. For example:
    void f()
    {
        // ...
        try {   // exceptions here are handled by the handler defined below
            while (true) Stack::push('c');
        }
        catch (Stack::Overflow) {
            // oops: stack overflow; take appropriate action
        }
        // ...
    }
The while loop will try to loop forever. Therefore, the catch-clause providing a handler for
Stack::Overflow will be entered after some call of Stack::push() causes a throw.
Use of the exception-handling mechanisms can make error handling more regular and readable.
See §8.3 and Chapter 14 for further discussion and details.
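To make the flow of control concrete, the fragments above can be combined into one small sketch. The function fill_until_overflow() is a hypothetical name introduced here; it counts the pushes that succeed before the handler is entered. Everything else follows the declarations shown in this section.

```cpp
namespace Stack {               // interface
    void push(char);
    class Overflow { };         // type representing overflow exceptions
}

namespace Stack {               // representation
    const int max_size = 200;
    char v[max_size];
    int top = 0;
}

void Stack::push(char c)
{
    if (top == max_size) throw Overflow();
    v[top] = c;
    top = top + 1;
}

// Hypothetical helper: returns the number of characters pushed
// before Stack::push() threw Overflow.
int fill_until_overflow()
{
    int pushed = 0;
    try {
        while (true) {          // tries to loop forever
            Stack::push('c');
            ++pushed;
        }
    }
    catch (Stack::Overflow) {
        // control arrives here directly from inside Stack::push()
    }
    return pushed;
}
```

With an empty stack of max_size 200, the loop is left after 200 successful pushes; a second call finds the stack still full and pushes nothing.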
2.5 Data Abstraction [tour.da]
Modularity is a fundamental aspect of all successful large programs. It remains a focus of all
design discussions throughout this book. However, modules in the form described previously are
not sufficient to express complex systems cleanly. Here, I first present a way of using modules to
provide a form of user-defined types and then show how to overcome some problems with that
approach by defining user-defined types directly.
2.5.1 Modules Defining Types [tour.types]
Programming with modules leads to the centralization of all data of a type under the control of a
type manager module. For example, if we wanted many stacks – rather than the single one provided by the Stack module above – we could define a stack manager with an interface like this:
    namespace Stack {
        struct Rep;                     // definition of stack layout is elsewhere
        typedef Rep& stack;

        stack create();                 // make a new stack
        void destroy(stack s);          // delete s

        void push(stack s, char c);     // push c onto s
        char pop(stack s);              // pop s
    }
The declaration
    struct Rep;
says that Rep is the name of a type, but it leaves the type to be defined later (§5.7). The declaration
    typedef Rep& stack;
gives the name stack to a ‘‘reference to Rep’’ (details in §5.5). The idea is that a stack is identified
by its Stack::stack and that further details are hidden from users.
A Stack::stack acts much like a variable of a built-in type:
    struct Bad_pop { };

    void f()
    {
        Stack::stack s1 = Stack::create();      // make a new stack
        Stack::stack s2 = Stack::create();      // make another new stack

        Stack::push(s1,'c');
        Stack::push(s2,'k');

        if (Stack::pop(s1) != 'c') throw Bad_pop();
        if (Stack::pop(s2) != 'k') throw Bad_pop();

        Stack::destroy(s1);
        Stack::destroy(s2);
    }
We could implement this Stack in several ways. It is important that a user doesn’t need to know
how we do it. As long as we keep the interface unchanged, a user will not be affected if we decide
to re-implement Stack.
An implementation might preallocate a few stack representations and let Stack::create() hand
out a reference to an unused one. Stack::destroy() could then mark a representation ‘‘unused’’
so that Stack::create() can recycle it:
    namespace Stack {       // representation
        const int max_size = 200;

        struct Rep {
            char v[max_size];
            int top;
        };

        const int max = 16;     // maximum number of stacks

        Rep stacks[max];        // preallocated stack representations
        bool used[max];         // used[i] is true if stacks[i] is in use
    }

    void Stack::push(stack s, char c) { /* check s for overflow and push c */ }
    char Stack::pop(stack s) { /* check s for underflow and pop */ }

    Stack::stack Stack::create()
    {
        // pick an unused Rep, mark it used, initialize it, and return a reference to it
    }

    void Stack::destroy(stack s) { /* mark s unused */ }
What we have done is to wrap a set of interface functions around the representation type. How the
resulting ‘‘stack type’’ behaves depends partly on how we defined these interface functions, partly
on how we presented the representation type to the users of Stacks, and partly on the design of the
representation type itself.
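One way the placeholder comments in the stack manager might be filled in is sketched below. The exception types Overflow, Underflow, and No_free_stack are assumptions (the text has not yet said how this module reports errors), and locating the slot to free by pointer arithmetic is just one possible choice.

```cpp
namespace Stack {               // interface
    struct Rep;                 // definition of stack layout is elsewhere
    typedef Rep& stack;

    class Overflow { };         // assumed error types for this sketch
    class Underflow { };
    class No_free_stack { };

    stack create();
    void destroy(stack s);
    void push(stack s, char c);
    char pop(stack s);
}

namespace Stack {               // representation
    const int max_size = 200;
    struct Rep {
        char v[max_size];
        int top;
    };
    const int max = 16;         // maximum number of stacks
    Rep stacks[max];            // preallocated stack representations
    bool used[max];             // namespace-scope array: starts out all false
}

Stack::stack Stack::create()
{
    for (int i = 0; i < max; ++i)
        if (!used[i]) {         // pick an unused Rep
            used[i] = true;
            stacks[i].top = 0;  // initialize it
            return stacks[i];
        }
    throw No_free_stack();
}

void Stack::destroy(stack s)
{
    used[&s - stacks] = false;  // s refers to an element of stacks[]
}

void Stack::push(stack s, char c)
{
    if (s.top == max_size) throw Overflow();
    s.v[s.top++] = c;
}

char Stack::pop(stack s)
{
    if (s.top == 0) throw Underflow();
    return s.v[--s.top];
}
```

Note that nothing prevents a user from calling push() on a stack that has already been destroyed; the lifetime of a Rep is a matter of convention here, not of language rules.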
This is often less than ideal. A significant problem is that the presentation of such ‘‘fake types’’
to the users can vary greatly depending on the details of the representation type – and users ought
to be insulated from knowledge of the representation type. For example, had we chosen to use a
more elaborate data structure to identify a stack, the rules for assignment and initialization of
Stack::stacks would have changed dramatically. This may indeed be desirable at times. However, it shows that we have simply moved the problem of providing convenient Stacks from the
Stack module to the Stack::stack representation type.
More fundamentally, user-defined types implemented through a module providing access to an
implementation type don’t behave like built-in types and receive less and different support than do
built-in types. For example, the time that a Stack::Rep can be used is controlled through
Stack::create() and Stack::destroy() rather than by the usual language rules.
2.5.2 User-Defined Types [tour.udt]
C++ attacks this problem by allowing a user to directly define types that behave in (nearly) the
same way as built-in types. Such a type is often called an abstract data type. I prefer the term
user-defined type. A more reasonable definition of abstract data type would require a mathematical ‘‘abstract’’ specification. Given such a specification, what are called types here would be concrete examples of such truly abstract entities. The programming paradigm becomes:
Decide which types you want;
provide a full set of operations for each type.
Where there is no need for more than one object of a type, the data-hiding programming style using
modules suffices.
Arithmetic types such as rational and complex numbers are common examples of user-defined
types. Consider:
    class complex {
        double re, im;
    public:
        complex(double r, double i) { re=r; im=i; }     // construct complex from two scalars
        complex(double r) { re=r; im=0; }               // construct complex from one scalar
        complex() { re = im = 0; }                      // default complex: (0,0)

        friend complex operator+(complex, complex);
        friend complex operator-(complex, complex);     // binary
        friend complex operator-(complex);              // unary
        friend complex operator*(complex, complex);
        friend complex operator/(complex, complex);

        friend bool operator==(complex, complex);       // equal
        friend bool operator!=(complex, complex);       // not equal
        // ...
    };
The declaration of class (that is, user-defined type) complex specifies the representation of a complex number and the set of operations on a complex number. The representation is private; that is,
re and im are accessible only to the functions specified in the declaration of class complex. Such
functions can be defined like this:
    complex operator+(complex a1, complex a2)
    {
        return complex(a1.re+a2.re,a1.im+a2.im);
    }
A member function with the same name as its class is called a constructor. A constructor defines a
way to initialize an object of its class. Class complex provides three constructors. One makes a
complex from a double, another takes a pair of doubles, and the third makes a complex with a
default value.
Class complex can be used like this:
    void f(complex z)
    {
        complex a = 2.3;
        complex b = 1/a;
        complex c = a+b*complex(1,2.3);
        // ...
        if (c != b) c = -(b/a)+2*b;
    }
The compiler converts operators involving complex numbers into appropriate function calls. For
example, c!=b means operator!=(c,b) and 1/a means operator/(complex(1),a).
Most, but not all, modules are better expressed as user-defined types.
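For readers who want to try the class, here is a sketch that supplies definitions for the declared operators. Only operator+ is defined in the text; the remaining bodies are the textbook formulas for complex arithmetic, written out here as an illustration rather than as the definitions any particular library uses.

```cpp
class complex {
    double re, im;
public:
    complex(double r, double i) { re=r; im=i; }
    complex(double r) { re=r; im=0; }
    complex() { re = im = 0; }

    friend complex operator+(complex, complex);
    friend complex operator-(complex, complex);     // binary
    friend complex operator-(complex);              // unary
    friend complex operator*(complex, complex);
    friend complex operator/(complex, complex);
    friend bool operator==(complex, complex);
    friend bool operator!=(complex, complex);
};

complex operator+(complex a1, complex a2) { return complex(a1.re+a2.re, a1.im+a2.im); }
complex operator-(complex a1, complex a2) { return complex(a1.re-a2.re, a1.im-a2.im); }
complex operator-(complex a) { return complex(-a.re, -a.im); }

complex operator*(complex a1, complex a2)
{
    return complex(a1.re*a2.re - a1.im*a2.im, a1.re*a2.im + a1.im*a2.re);
}

complex operator/(complex a1, complex a2)
{
    double d = a2.re*a2.re + a2.im*a2.im;           // |a2| squared
    return complex((a1.re*a2.re + a1.im*a2.im)/d, (a1.im*a2.re - a1.re*a2.im)/d);
}

bool operator==(complex a1, complex a2) { return a1.re==a2.re && a1.im==a2.im; }
bool operator!=(complex a1, complex a2) { return !(a1==a2); }
```

An expression such as 1/a works because the one-argument constructor lets the compiler convert 1 to complex(1) before calling operator/.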
2.5.3 Concrete Types [tour.concrete]
User-defined types can be designed to meet a wide variety of needs. Consider a user-defined Stack
type along the lines of the complex type. To make the example a bit more realistic, this Stack type
is defined to take its number of elements as an argument:
    class Stack {
        char* v;
        int top;
        int max_size;
    public:
        class Underflow { };    // used as exception
        class Overflow { };     // used as exception
        class Bad_size { };     // used as exception

        Stack(int s);           // constructor
        ~Stack();               // destructor

        void push(char c);
        char pop();
    };
The constructor Stack(int) will be called whenever an object of the class is created. This takes
care of initialization. If any cleanup is needed when an object of the class goes out of scope, a complement to the constructor – called the destructor – can be declared:
    Stack::Stack(int s)     // constructor
    {
        top = 0;
        if (10000<s) throw Bad_size();
        max_size = s;
        v = new char[s];    // allocate elements on the free store (heap, dynamic store)
    }

    Stack::~Stack()         // destructor
    {
        delete[] v;         // free the elements for possible reuse of their space (§6.2.6)
    }
The constructor initializes a new Stack variable. To do so, it allocates some memory on the free
store (also called the heap or dynamic store) using the new operator. The destructor cleans up by
freeing that memory. This is all done without intervention by users of Stacks. The users simply
create and use Stacks much as they would variables of built-in types. For example:
    Stack s_var1(10);                       // global stack with 10 elements

    void f(Stack& s_ref, int i)             // reference to Stack
    {
        Stack s_var2(i);                    // local stack with i elements
        Stack* s_ptr = new Stack(20);       // pointer to Stack allocated on free store

        s_var1.push('a');
        s_var2.push('b');
        s_ref.push('c');
        s_ptr->push('d');
        // ...
    }
This Stack type obeys the same rules for naming, scope, allocation, lifetime, copying, etc., as does
a built-in type such as int and char.
Naturally, the push() and pop() member functions must also be defined somewhere:
    void Stack::push(char c)
    {
        if (top == max_size) throw Overflow();
        v[top] = c;
        top = top + 1;
    }

    char Stack::pop()
    {
        if (top == 0) throw Underflow();
        top = top - 1;
        return v[top];
    }
Types such as complex and Stack are called concrete types, in contrast to abstract types, where the
interface more completely insulates a user from implementation details.
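Putting the pieces of this concrete Stack together gives a small sketch that can be compiled and tried. The function reversed() is a hypothetical example of use, and the extra s<0 test in the constructor is an addition made here so that a negative size is rejected as well.

```cpp
#include <string>

class Stack {
    char* v;
    int top;
    int max_size;
public:
    class Underflow { };    // used as exception
    class Overflow { };     // used as exception
    class Bad_size { };     // used as exception

    Stack(int s);
    ~Stack();

    void push(char c);
    char pop();
};

Stack::Stack(int s)
{
    top = 0;
    if (s<0 || 10000<s) throw Bad_size();   // the s<0 test is an addition in this sketch
    max_size = s;
    v = new char[s];                        // allocate elements on the free store
}

Stack::~Stack() { delete[] v; }

void Stack::push(char c)
{
    if (top == max_size) throw Overflow();
    v[top] = c;
    top = top + 1;
}

char Stack::pop()
{
    if (top == 0) throw Underflow();
    top = top - 1;
    return v[top];
}

// Hypothetical use: push the characters of s, then pop them in reverse order.
std::string reversed(const std::string& s)
{
    Stack st(static_cast<int>(s.size()));   // local Stack sized at run time
    for (std::string::size_type i = 0; i < s.size(); ++i) st.push(s[i]);
    std::string r;
    for (std::string::size_type i = 0; i < s.size(); ++i) r += st.pop();
    return r;
}   // st's destructor frees the elements here
```

The sketch never copies a Stack. Since no copy operations are defined, a copy would share the pointer v with the original, so copying is best avoided with this version.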
2.5.4 Abstract Types [tour.abstract]
One property was lost in the transition from Stack as a ‘‘fake type’’ implemented by a module
(§2.5.1) to a proper type (§2.5.3). The representation is not decoupled from the user interface;
rather, it is a part of what would be included in a program fragment using Stacks. The representation is private, and therefore accessible only through the member functions, but it is present. If it
changes in any significant way, a user must recompile. This is the price to pay for having concrete
types behave exactly like built-in types. In particular, we cannot have genuine local variables of a
type without knowing the size of the type’s representation.
For types that don’t change often, and where local variables provide much-needed clarity and
efficiency, this is acceptable and often ideal. However, if we want to completely isolate users of a
stack from changes to its implementation, this last Stack is insufficient. Then, the solution is to
decouple the interface from the representation and give up genuine local variables.
First, we define the interface:
    class Stack {
    public:
        class Underflow { };    // used as exception
        class Overflow { };     // used as exception

        virtual void push(char c) = 0;
        virtual char pop() = 0;
    };
The word virtual means ‘‘may be redefined later in a class derived from this one’’ in Simula and
C++. A class derived from Stack provides an implementation for the Stack interface. The curious
=0 syntax says that some class derived from Stack must define the function. Thus, this Stack can
serve as the interface to any class that implements its push() and pop() functions.
This Stack could be used like this:
    void f(Stack& s_ref)
    {
        s_ref.push('c');
        if (s_ref.pop() != 'c') throw bad_stack();
    }
Note how f() uses the Stack interface in complete ignorance of implementation details. A class
that provides the interface to a variety of other classes is often called a polymorphic type.
Not surprisingly, the implementation could consist of everything from the concrete class Stack
that we left out of the interface Stack:
    class Array_stack : public Stack {      // Array_stack implements Stack
        char* p;
        int max_size;
        int top;
    public:
        Array_stack(int s);
        ~Array_stack();

        void push(char c);
        char pop();
    };
The ‘‘:public’’ can be read as ‘‘is derived from,’’ ‘‘implements,’’ and ‘‘is a subtype of.’’
For a function like f() to use a Stack in complete ignorance of implementation details, some
other function will have to make an object on which it can operate. For example:
    void g()
    {
        Array_stack as(200);
        f(as);
    }
Since f() doesn’t know about Array_stacks but only knows the Stack interface, it will work just as
well for a different implementation of a Stack. For example:
    class List_stack : public Stack {       // List_stack implements Stack
        list<char> lc;                      // (standard library) list of characters (§3.7.3)
    public:
        List_stack() { }

        void push(char c) { lc.push_front(c); }
        char pop();
    };

    char List_stack::pop()
    {
        char x = lc.front();    // get first element
        lc.pop_front();         // remove first element
        return x;
    }
Here, the representation is a list of characters. The lc.push_front(c) adds c as the first element of
lc, the call lc.pop_front() removes the first element, and lc.front() denotes lc’s first element.
A function can create a List_stack and have f() use it:
    void h()
    {
        List_stack ls;
        f(ls);
    }
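The abstract Stack and its two implementations can be collected into one sketch. Three things here are additions and not part of the text above: the virtual destructor in Stack, the member-function bodies of Array_stack, and the underflow check in List_stack::pop(). In this version f() returns a bool instead of throwing, purely to make the result easy to inspect.

```cpp
#include <list>

class Stack {
public:
    class Underflow { };    // used as exception
    class Overflow { };     // used as exception

    virtual ~Stack() { }    // addition: allows safe deletion through a Stack*
    virtual void push(char c) = 0;
    virtual char pop() = 0;
};

class Array_stack : public Stack {      // Array_stack implements Stack
    char* p;
    int max_size;
    int top;
public:
    Array_stack(int s) : p(new char[s]), max_size(s), top(0) { }
    ~Array_stack() { delete[] p; }

    void push(char c) { if (top == max_size) throw Overflow(); p[top++] = c; }
    char pop() { if (top == 0) throw Underflow(); return p[--top]; }
};

class List_stack : public Stack {       // List_stack implements Stack
    std::list<char> lc;
public:
    List_stack() { }

    void push(char c) { lc.push_front(c); }
    char pop();
};

char List_stack::pop()
{
    if (lc.empty()) throw Underflow();  // addition: the text leaves this check out
    char x = lc.front();                // get first element
    lc.pop_front();                     // remove first element
    return x;
}

// Uses only the Stack interface; works for any implementation.
bool f(Stack& s_ref)
{
    s_ref.push('c');
    return s_ref.pop() == 'c';
}
```

The same compiled f() serves both Array_stack and List_stack objects, which is the point of separating the interface from the representation.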
2.5.5 Virtual Functions [tour.virtual]
How is the call s_ref.pop() in f() resolved to the right function definition? When f() is called
from h(), List_stack::pop() must be called. When f() is called from g(),
Array_stack::pop() must be called. To achieve this resolution, a Stack object must contain
information to indicate the function to be called at run-time. A common implementation technique
is for the compiler to convert the name of a virtual function into an index into a table of pointers to
functions. That table is usually called ‘‘a virtual function table’’ or simply, a vtbl. Each class with
virtual functions has its own vtbl identifying its virtual functions. This can be represented graphically like this:
    Array_stack object:         vtbl:
        p                           Array_stack::push()
        max_size                    Array_stack::pop()
        top

    List_stack object:          vtbl:
        lc                          List_stack::push()
                                    List_stack::pop()
The functions in the vtbl allow the object to be used correctly even when the size of the object and
the layout of its data are unknown to the caller. All the caller needs to know is the location of the
vtbl in a Stack and the index used for each virtual function. This virtual call mechanism can be
made essentially as efficient as the ‘‘normal function call’’ mechanism. Its space overhead is one
pointer in each object of a class with virtual functions plus one vtbl for each such class.
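The mechanism can be imitated by hand to see where the costs come from. The following is only an illustration of the idea, assuming nothing about what any particular compiler generates: the names Obj, Vtbl, Array_obj, and Slot_obj are hypothetical, and bounds checks are left out to keep the sketch short.

```cpp
struct Obj;                         // stands in for "some Stack object"

struct Vtbl {                       // one table per implementing class
    void (*push)(Obj*, char);       // first entry
    char (*pop)(Obj*);              // second entry
};

struct Obj {
    const Vtbl* vptr;               // the one pointer of overhead per object
};

extern const Vtbl array_vtbl;
extern const Vtbl slot_vtbl;

struct Array_obj : Obj {            // array representation
    char v[8];
    int top;
    Array_obj() : top(0) { vptr = &array_vtbl; }
};

struct Slot_obj : Obj {             // a different layout: remembers one character
    char c;
    Slot_obj() : c(0) { vptr = &slot_vtbl; }
};

// Each function knows the real layout of "its" kind of object.
void array_push(Obj* o, char c) { Array_obj* a = static_cast<Array_obj*>(o); a->v[a->top++] = c; }
char array_pop(Obj* o)          { Array_obj* a = static_cast<Array_obj*>(o); return a->v[--a->top]; }
void slot_push(Obj* o, char c)  { static_cast<Slot_obj*>(o)->c = c; }
char slot_pop(Obj* o)           { return static_cast<Slot_obj*>(o)->c; }

const Vtbl array_vtbl = { array_push, array_pop };
const Vtbl slot_vtbl  = { slot_push, slot_pop };

// The caller knows only where vptr is and which entry to use.
bool f(Obj* s)
{
    s->vptr->push(s, 'c');
    return s->vptr->pop(s) == 'c';
}
```

A call through the table costs one extra indirection compared to a direct call, and each object grows by exactly one pointer, which matches the overhead described above.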
2.6 Object-Oriented Programming [tour.oop]
Data abstraction is fundamental to good design and will remain a focus of design throughout this
book. However, user-defined types by themselves are not flexible enough to serve our needs. This
section first demonstrates a problem with simple user-defined data types and then shows how to
overcome that problem by using class hierarchies.
2.6.1 Problems with Concrete Types [tour.problems]
A concrete type, like a ‘‘fake type’’ defined through a module, defines a sort of black box. Once
the black box has been defined, it does not really interact with the rest of the program. There is no
way of adapting it to new uses except by modifying its definition. This situation can be ideal, but it
can also lead to severe inflexibility. Consider defining a type Shape for use in a graphics system.
Assume for the moment that the system has to support circles, triangles, and squares. Assume also
that we have
    class Point{ /* ... */ };
    class Color{ /* ... */ };
The /* and */ specify the beginning and end, respectively, of a comment. This comment notation
can be used for multi-line comments and comments that end before the end of a line.
We might define a shape like this:
    enum Kind { circle, triangle, square };     // enumeration (§4.8)

    class Shape {
        Kind k;             // type field
        Point center;
        Color col;
        // ...
    public:
        void draw();
        void rotate(int);
        // ...
    };
The ‘‘type field’’ k is necessary to allow operations such as draw() and rotate() to determine
what kind of shape they are dealing with (in a Pascal-like language, one might use a variant record
with tag k). The function draw() might be defined like this:
    void Shape::draw()
    {
        switch (k) {
        case circle:
            // draw a circle
            break;
        case triangle:
            // draw a triangle
            break;
        case square:
            // draw a square
            break;
        }
    }
This is a mess. Functions such as draw() must ‘‘know about’’ all the kinds of shapes there are.
Therefore, the code for any such function grows each time a new shape is added to the system. If
we define a new shape, every operation on a shape must be examined and (possibly) modified. We
are not able to add a new shape to a system unless we have access to the source code for every
operation. Because adding a new shape involves ‘‘touching’’ the code of every important operation
on shapes, doing so requires great skill and potentially introduces bugs into the code that handles
other (older) shapes. The choice of representation of particular shapes can get severely cramped by
the requirement that (at least some of) their representation must fit into the typically fixed-sized
framework presented by the definition of the general type Shape.
2.6.2 Class Hierarchies [tour.hierarchies]
The problem is that there is no distinction between the general properties of every shape (that is, a
shape has a color, it can be drawn, etc.) and the properties of a specific kind of shape (a circle is a
shape that has a radius, is drawn by a circle-drawing function, etc.). Expressing this distinction and
taking advantage of it defines object-oriented programming. Languages with constructs that allow
this distinction to be expressed and used support object-oriented programming. Other languages
don’t.
The inheritance mechanism (borrowed for C++ from Simula) provides a solution. First, we
specify a class that defines the general properties of all shapes:
    class Shape {
        Point center;
        Color col;
        // ...
    public:
        Point where() { return center; }
        void move(Point to) { center = to; /* ... */ draw(); }

        virtual void draw() = 0;
        virtual void rotate(int angle) = 0;
        // ...
    };
As in the abstract type Stack in §2.5.4, the functions for which the calling interface can be defined
– but where the implementation cannot be defined yet – are virtual. In particular, the functions
draw() and rotate() can be defined only for specific shapes, so they are declared virtual.
Given this definition, we can write general functions manipulating vectors of pointers to shapes:
    void rotate_all(vector<Shape*>& v, int angle)   // rotate v’s elements angle degrees
    {
        for (int i = 0; i<v.size(); ++i) v[i]->rotate(angle);
    }
To define a particular shape, we must say that it is a shape and specify its particular properties
(including the virtual functions):
    class Circle : public Shape {
        int radius;
    public:
        void draw() { /* ... */ }
        void rotate(int) {}     // yes, the null function
    };
In C++, class Circle is said to be derived from class Shape, and class Shape is said to be a base of
class Circle. An alternative terminology calls Circle and Shape subclass and superclass, respectively. The derived class is said to inherit members from its base class, so the use of base and
derived classes is commonly referred to as inheritance.
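A compilable sketch of this small hierarchy may help. Several details are assumptions made for illustration only: Point is reduced to a pair of ints, Color is left out, draw() merely counts its calls instead of producing output, and Triangle with its orientation() accessor is a hypothetical second shape added so that rotate_all() has something observable to do.

```cpp
#include <vector>

struct Point { int x, y; };             // assumed minimal Point

class Shape {
    Point center;
public:
    Shape(Point c) : center(c) { }
    virtual ~Shape() { }                // addition: base classes usually want this

    Point where() { return center; }
    void move(Point to) { center = to; draw(); }

    virtual void draw() = 0;
    virtual void rotate(int angle) = 0;
};

class Circle : public Shape {
    int radius;
    int drawn;                          // stand-in for real output: counts draw() calls
public:
    Circle(Point c, int r) : Shape(c), radius(r), drawn(0) { }
    void draw() { ++drawn; }
    void rotate(int) { }                // yes, the null function
    int size() const { return radius; }
    int times_drawn() const { return drawn; }
};

class Triangle : public Shape {         // hypothetical second shape
    int heading;                        // orientation in degrees
public:
    Triangle(Point c) : Shape(c), heading(0) { }
    void draw() { }
    void rotate(int angle) { heading = (heading + angle) % 360; }
    int orientation() const { return heading; }
};

void rotate_all(std::vector<Shape*>& v, int angle)  // rotate v's elements angle degrees
{
    for (std::vector<Shape*>::size_type i = 0; i < v.size(); ++i) v[i]->rotate(angle);
}
```

Adding a new shape now means adding one class; rotate_all() and Shape::move() need no change, in contrast to the switch-based draw() of §2.6.1.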
The programming paradigm is:
Decide which classes you want;
provide a full set of operations for each class;
make commonality explicit by using inheritance.
Where there is no such commonality, data abstraction suffices. The amount of commonality
between types that can be exploited by using inheritance and virtual functions is the litmus test of
the applicability of object-oriented programming to a problem. In some areas, such as interactive
graphics, there is clearly enormous scope for object-oriented programming. In other areas, such as
classical arithmetic types and computations based on them, there appears to be hardly any scope for
more than data abstraction, and the facilities needed for the support of object-oriented programming
seem unnecessary.
Finding commonality among types in a system is not a trivial process. The amount of commonality to be exploited is affected by the way the system is designed. When a system is designed –
and even when the requirements for the system are written – commonality must be actively sought.
Classes can be designed specifically as building blocks for other types, and existing classes can be
examined to see if they exhibit similarities that can be exploited in a common base class.
For attempts to explain what object-oriented programming is without recourse to specific programming language constructs, see [Kerr,1987] and [Booch,1994] in §23.6.
Class hierarchies and abstract classes (§2.5.4) complement each other instead of being mutually
exclusive (§12.5). In general, the paradigms listed here tend to be complementary and often
mutually supportive. For example, classes and modules contain functions, while modules contain
classes and functions. The experienced designer applies a variety of paradigms as need dictates.
2.7 Generic Programming [tour.generic]
Someone who wants a stack is unlikely always to want a stack of characters. A stack is a general
concept, independent of the notion of a character. Consequently, it ought to be represented independently.
More generally, if an algorithm can be expressed independently of representation details and if
it can be done so affordably and without logical contortions, it ought to be done so.
The programming paradigm is:
Decide which algorithms you want;
parameterize them so that they work for
a variety of suitable types and data structures.
2.7.1 Containers [tour.containers]
We can generalize a stack-of-characters type to a stack-of-anything type by making it a template
and replacing the specific type char with a template parameter. For example:
    template<class T> class Stack {
        T* v;
        int max_size;
        int top;
    public:
        class Underflow { };
        class Overflow { };

        Stack(int s);       // constructor
        ~Stack();           // destructor

        void push(T);
        T pop();
    };
The template<class T> prefix makes T a parameter of the declaration it prefixes.
The member functions might be defined similarly:
    template<class T> void Stack<T>::push(T c)
    {
        if (top == max_size) throw Overflow();
        v[top] = c;
        top = top + 1;
    }
    template<class T> T Stack<T>::pop()
    {
        if (top == 0) throw Underflow();
        top = top - 1;
        return v[top];
    }
Given these definitions, we can use stacks like this:
    Stack<char> sc;             // stack of characters
    Stack<complex> scplx;       // stack of complex numbers
    Stack< list<int> > sli;     // stack of list of integers
    void f()
    {
        sc.push('c');
        if (sc.pop() != 'c') throw Bad_pop();
        scplx.push(complex(1,2));
        if (scplx.pop() != complex(1,2)) throw Bad_pop();
    }
Similarly, we can define lists, vectors, maps (that is, associative arrays), etc., as templates. A class
holding a collection of elements of some type is commonly called a container class, or simply a
container.
Templates are a compile-time mechanism so that their use incurs no run-time overhead compared to ‘‘hand-written code.’’
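The constructor and destructor of Stack are declared in this section but not defined. The following is a minimal sketch of how the whole template might be completed, assuming the elements are allocated with new[] in the constructor and released in the destructor. Copying is disabled purely to keep the sketch short, and the helper stack_demo() is a hypothetical name introduced for illustration.

```cpp
#include <cassert>
#include <string>

template<class T> class Stack {
    T* v;
    int max_size;
    int top;
    Stack(const Stack&);              // copying is left out of this sketch:
    Stack& operator=(const Stack&);   // declared private and never defined
public:
    class Underflow { };
    class Overflow { };

    Stack(int s) : v(new T[s]), max_size(s), top(0) { }   // assumption: storage from new[]
    ~Stack() { delete[] v; }

    void push(T c)
    {
        if (top == max_size) throw Overflow();
        v[top] = c;
        top = top + 1;
    }

    T pop()
    {
        if (top == 0) throw Underflow();
        top = top - 1;
        return v[top];
    }
};

// Hypothetical helper: exercises a small stack of strings.
void stack_demo()
{
    Stack<std::string> ss(2);
    ss.push("first");
    ss.push("second");

    bool overflowed = false;
    try {
        ss.push("third");            // no room left
    }
    catch (const Stack<std::string>::Overflow&) {
        overflowed = true;
    }
    assert(overflowed);

    assert(ss.pop() == "second");    // last in, first out
    assert(ss.pop() == "first");
}
```

The same template serves Stack<char> and Stack<std::string> without change; only the element type differs.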
2.7.2 Generic Algorithms [tour.algorithms]
The C++ standard library provides a variety of containers, and users can write their own (Chapter 3,
Chapter 17, Chapter 18). Thus, we find that we can apply the generic programming paradigm once
more to parameterize algorithms by containers. For example, we want to sort, copy, and search
vectors, lists, and arrays without having to write sort(), copy(), and search() functions for each
container. We also don’t want to convert to a specific data structure accepted by a single sort function. Therefore, we must find a generalized way of defining our containers that allows us to manipulate one without knowing exactly which kind of container it is.
One approach, the approach taken for the containers and non-numerical algorithms in the C++
standard library (§3.8, Chapter 18) is to focus on the notion of a sequence and manipulate
sequences through iterators.
Here is a graphical representation of the notion of a sequence:
[Figure: the elements of a sequence, with begin referring to the first element and end referring to one beyond the last element.]
A sequence has a beginning and an end. An iterator refers to an element, and provides an operation
that makes the iterator refer to the next element of the sequence. The end of a sequence is an
iterator that refers one beyond the last element of the sequence. The physical representation of
‘‘the end’’ may be a sentinel element, but it doesn’t have to be. In fact, the point is that this notion
of sequences covers a wide variety of representations, including lists and arrays.
We need some standard notation for operations such as ‘‘access an element through an iterator’’
and ‘‘make the iterator refer to the next element.’’ The obvious choices (once you get the idea) are
to use the dereference operator * to mean ‘‘access an element through an iterator’’ and the increment operator ++ to mean ‘‘make the iterator refer to the next element.’’
Given that, we can write code like this:
    template<class In, class Out> void copy(In from, In too_far, Out to)
    {
        while (from != too_far) {
            *to = *from;    // copy element pointed to
            ++to;           // next output
            ++from;         // next input
        }
    }
This copies any container for which we can define iterators with the right syntax and semantics.
C++’s built-in, low-level array and pointer types have the right operations for that, so we can
write
    char vc1[200];      // array of 200 characters
    char vc2[500];      // array of 500 characters
    void f()
    {
        copy(&vc1[0],&vc1[200],&vc2[0]);
    }
This copies vc1 from its first element until its last into vc2 starting at vc2’s first element.
All standard library containers (§16.3, Chapter 17) support this notion of iterators and
sequences.
Two template parameters In and Out are used to indicate the types of the source and the target
instead of a single argument. This was done because we often want to copy from one kind of container into another. For example:
    complex ac[200];
    void g(vector<complex>& vc, list<complex>& lc)
    {
        copy(&ac[0],&ac[200],lc.begin());
        copy(lc.begin(),lc.end(),vc.begin());
    }
This copies the array to the list and the list to the vector. For a standard container, begin() is an
iterator pointing to the first element.
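Note that a copy() written this way assigns through the output iterator and never adds elements, so the target must already hold enough of them. The sketch below makes that explicit by sizing the list and vector up front. It assumes std::complex<double> as the element type, and it places copy() in a hypothetical namespace tour only to avoid a clash with the standard library's own copy().

```cpp
#include <cassert>
#include <complex>
#include <list>
#include <vector>

namespace tour {   // hypothetical namespace, used only to avoid clashing with std::copy

    template<class In, class Out> void copy(In from, In too_far, Out to)
    {
        while (from != too_far) {
            *to = *from;   // copy element pointed to
            ++to;          // next output
            ++from;        // next input
        }
    }

}

typedef std::complex<double> Cplx;   // assumed element type for this sketch

void copy_demo()
{
    Cplx ac[3] = { Cplx(1,2), Cplx(3,4), Cplx(5,6) };
    std::list<Cplx> lc(3);      // targets are sized up front: copy() does not grow them
    std::vector<Cplx> vc(3);

    tour::copy(&ac[0], &ac[3], lc.begin());         // array to list
    tour::copy(lc.begin(), lc.end(), vc.begin());   // list to vector

    assert(vc.size() == 3);
    assert(vc[0] == Cplx(1,2));
    assert(vc[2] == Cplx(5,6));
}
```

The same function template handles pointers into an array, list iterators, and vector iterators, because all three support !=, * and ++.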
2.8 Postscript [tour.post]
No programming language is perfect. Fortunately, a programming language does not have to be
perfect to be a good tool for building great systems. In fact, a general-purpose programming language cannot be perfect for all of the many tasks to which it is put. What is perfect for one task is
often seriously flawed for another because perfection in one area implies specialization. Thus, C++
was designed to be a good tool for building a wide variety of systems and to allow a wide variety of
ideas to be expressed directly.
Not everything can be expressed directly using the built-in features of a language. In fact, that
isn’t even the ideal. Language features exist to support a variety of programming styles and techniques. Consequently, the task of learning a language should focus on mastering the native and
natural styles for that language – not on the understanding of every little detail of all the language
features.
In practical programming, there is little advantage in knowing the most obscure language features or in using the largest number of features. A single language feature in isolation is of little
interest. Only in the context provided by techniques and by other features does the feature acquire
meaning and interest. Thus, when reading the following chapters, please remember that the real
purpose of examining the details of C++ is to be able to use them in concert to support good programming style in the context of sound designs.
2.9 Advice [tour.advice]
[1] Don’t panic! All will become clear in time; §2.1.
[2] You don’t have to know every detail of C++ to write good programs; §1.7.
[3] Focus on programming techniques, not on language features; §2.1.
3
A Tour of the Standard Library
Why waste time learning
when ignorance is instantaneous?
– Hobbes
Standard libraries — output — strings — input — vectors — range checking — lists —
maps — container overview — algorithms — iterators — I/O iterators — traversals and
predicates — algorithms using member functions — algorithm overview — complex
numbers — vector arithmetic— standard library overview — advice.
3.1 Introduction [tour2.lib]
No significant program is written in just a bare programming language. First, a set of supporting
libraries is developed. These then form the basis for further work.
Continuing Chapter 2, this chapter gives a quick tour of key library facilities to give you an idea
what can be done using C++ and its standard library. Useful library types, such as string, vector,
list, and map, are presented as well as the most common ways of using them. Doing this allows me
to give better examples and to set better exercises in the following chapters. As in Chapter 2, you
are strongly encouraged not to be distracted or discouraged by an incomplete understanding of
details. The purpose of this chapter is to give you a taste of what is to come and to convey an
understanding of the simplest uses of the most useful library facilities. A more detailed introduction to the standard library is given in §16.1.2.
The standard library facilities described in this book are part of every complete C++ implementation. In addition to the standard C++ library, most implementations offer ‘‘graphical user interface’’ systems, often referred to as GUIs or window systems, for interaction between a user and a
program. Similarly, most application development environments provide ‘‘foundation libraries’’
that support corporate or industrial ‘‘standard’’ development and/or execution environments. I do
not describe such systems and libraries. The intent is to provide a self-contained description of C++
as defined by the standard and to keep the examples portable, except where specifically noted. Naturally, a programmer is encouraged to explore the more extensive facilities available on most systems, but that is left to exercises.
3.2 Hello, world! [tour2.hello]
The minimal C++ program is
    int main() { }
It defines a function called main, which takes no arguments and does nothing.
Every C++ program must have a function named main(). The program starts by executing that
function. The int value returned by main(), if any, is the program’s return value to ‘‘the system.’’
If no value is returned, the system will receive a value indicating successful completion. A nonzero
value from main() indicates failure.
Typically, a program produces some output. Here is a program that writes out Hello, world!:
    #include <iostream>
    int main()
    {
        std::cout << "Hello, world!\n";
    }
The line #include <iostream> instructs the compiler to include the declarations of the standard
stream I/O facilities as found in iostream. Without these declarations, the expression
    std::cout << "Hello, world!\n"
would make no sense. The operator << (‘‘put to’’) writes its second argument onto its first. In this
case, the string literal "Hello, world!\n" is written onto the standard output stream std::cout. A
string literal is a sequence of characters surrounded by double quotes. In a string literal, the backslash character \ followed by another character denotes a single special character. In this case, \n is
the newline character, so that the characters written are Hello, world! followed by a newline.
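To make the point about escape characters concrete, here is a small sketch that counts characters with the C library function strlen(). The function name escape_demo() is hypothetical; the checks show that a backslash pair written in the source denotes one character in the string.

```cpp
#include <cassert>
#include <cstring>

void escape_demo()
{
    const char* s = "Hello, world!\n";
    assert(std::strlen(s) == 14);        // 13 visible characters plus one newline
    assert(s[13] == '\n');               // \n in the source is a single character
    assert(std::strlen("a\tb") == 3);    // \t is a single tab character
    assert(std::strlen("\\") == 1);      // \\ is a single backslash
}
```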
3.3 The Standard Library Namespace [tour2.name]
The standard library is defined in a namespace (§2.4, §8.2) called std. That is why I wrote
std::cout rather than plain cout. I was being explicit about using the standard cout, rather than
some other cout.
Every standard library facility is provided through some standard header similar to <iostream>.
For example:
    #include<string>
    #include<list>
This makes the standard string and list available. To use them, the std:: prefix can be used:
    std::string s = "Four legs Good; two legs Baaad!";
    std::list<std::string> slogans;
For simplicity, I will rarely use the std:: prefix explicitly in examples. Neither will I always
#include the necessary headers explicitly. To compile and run the program fragments here, you
must #include the appropriate headers (as listed in §3.7.5, §3.8.6, and Chapter 16). In addition,
you must either use the std:: prefix or make every name from std global (§8.2.3). For example:
    #include<string>            // make the standard string facilities accessible
    using namespace std;        // make std names available without std:: prefix
    string s = "Ignorance is bliss!";   // ok: string is std::string
It is generally in poor taste to dump every name from a namespace into the global namespace.
However, to keep short the program fragments used to illustrate language and library features, I
omit repetitive #includes and std:: qualifications. In this book, I use the standard library almost
exclusively, so if a name from the standard library is used, it either is a use of what the standard
offers or part of an explanation of how the standard facility might be defined.
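Between qualifying every name and making all of std global there is a middle course: a using-declaration brings in one name at a time. The sketch below shows it for string while list stays explicitly qualified; make_slogans() and slogans_demo() are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <list>
#include <string>

using std::string;    // using-declaration: only string becomes usable without std::

std::list<string> make_slogans()    // hypothetical helper; list is still qualified
{
    std::list<string> slogans;
    slogans.push_back("Four legs Good; two legs Baaad!");
    slogans.push_back("Ignorance is bliss!");
    return slogans;
}

void slogans_demo()
{
    std::list<string> slogans = make_slogans();
    assert(slogans.size() == 2);
    assert(slogans.back() == "Ignorance is bliss!");
}
```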
3.4 Output [tour2.ostream]
The iostream library defines output for every built-in type. Further, it is easy to define output of a
user-defined type. By default, values output to cout are converted to a sequence of characters. For
example,
    void f()
    {
        cout << 10;
    }
will place the character 1 followed by the character 0 on the standard output stream. So will
    void g()
    {
        int i = 10;
        cout << i;
    }
Output of different types can be combined in the obvious way:
    void h(int i)
    {
        cout << "the value of i is ";
        cout << i;
        cout << '\n';
    }
If i has the value 10, the output will be
    the value of i is 10
A character constant is a character enclosed in single quotes. Note that a character constant is output as a character rather than as a numerical value. For example,
    void k()
    {
        cout << 'a';
        cout << 'b';
        cout << 'c';
    }
will output abc.
People soon tire of repeating the name of the output stream when outputting several related
items. Fortunately, the result of an output expression can itself be used for further output. For
example:
    void h2(int i)
    {
        cout << "the value of i is " << i << '\n';
    }
This is equivalent to h(). Streams are explained in more detail in Chapter 21.
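The section notes that output is easy to define for a user-defined type. As a hedged sketch of what that might look like, the code below defines << for a made-up type Point and writes to a std::ostringstream (an output stream that collects characters in a string) so that the result can be inspected. Point, format_point() and output_demo() are hypothetical names.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct Point {      // hypothetical user-defined type
    int x, y;
};

// Returning the stream is what allows several << operations to be chained.
std::ostream& operator<<(std::ostream& os, const Point& p)
{
    return os << '(' << p.x << ',' << p.y << ')';
}

std::string format_point(const Point& p)    // hypothetical helper
{
    std::ostringstream os;      // an ostream that writes into a string
    os << "the value of p is " << p << '\n';
    return os.str();
}

void output_demo()
{
    Point p = { 1, 2 };
    assert(format_point(p) == "the value of p is (1,2)\n");
}
```

Because the operator takes a std::ostream&, the same definition works unchanged for cout.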
3.5 Strings [tour2.string]
The standard library provides a string type to complement the string literals used earlier. The
string type provides a variety of useful string operations, such as concatenation. For example:
    string s1 = "Hello";
    string s2 = "world";
    void m1()
    {
        string s3 = s1 + ", " + s2 + "!\n";
        cout << s3;
    }
Here, s3 is initialized to the character sequence
    Hello, world!
followed by a newline. Addition of strings means concatenation. You can add strings, string literals, and characters to a string.
In many applications, the most common form of concatenation is adding something to the end
of a string. This is directly supported by the += operation. For example:
    void m2(string& s1, string& s2)
    {
        s1 = s1 + '\n';     // append newline
        s2 += '\n';         // append newline
    }
The two ways of adding to the end of a string are semantically equivalent, but I prefer the latter
because it is more concise and likely to be more efficiently implemented.
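The equivalence of the two forms can be checked directly. The following sketch, with the hypothetical name concat_demo(), also records one restriction worth knowing: two string literals cannot be added to each other.

```cpp
#include <cassert>
#include <string>

void concat_demo()
{
    std::string s1 = "Hello";
    std::string s2 = "world";

    std::string s3 = s1 + ", " + s2 + "!\n";   // strings and literals mixed
    assert(s3 == "Hello, world!\n");

    std::string a = s1;
    std::string b = s1;
    a = a + '\n';     // builds a new string, then assigns it
    b += '\n';        // appends in place
    assert(a == b);   // same result either way

    // Note: "Hello" + ", " would not compile; at least one operand
    // of each + must be a std::string.
}
```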
Naturally, strings can be compared against each other and against string literals. For example:
    string incantation;
    void respond(const string& answer)
    {
        if (answer == incantation) {
            // perform magic
        }
        else if (answer == "yes") {
            // ...
        }
        // ...
    }
The standard library string class is described in Chapter 20. Among other useful features, it provides the ability to manipulate substrings. For example:
    string name = "Niels Stroustrup";
    void m3()
    {
        string s = name.substr(6,10);       // s = "Stroustrup"
        name.replace(0,5,"Nicholas");       // name becomes "Nicholas Stroustrup"
    }
The substr() operation returns a string that is a copy of the substring indicated by its arguments.
The first argument is an index into the string (a position), and the second argument is the length of
the desired substring. Since indexing starts from 0, s gets the value Stroustrup.
The replace() operation replaces a substring with a value. In this case, the substring starting at
0 with length 5 is Niels; it is replaced by Nicholas. Thus, the final value of name is Nicholas
Stroustrup. Note that the replacement string need not be the same size as the substring that it is
replacing.
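The positions and lengths involved are easy to get wrong by one, so here is the same manipulation as a checkable sketch. The function name substr_demo() is hypothetical; the last check relies on the fact that substr() clamps a length that runs past the end of the string.

```cpp
#include <cassert>
#include <string>

void substr_demo()
{
    std::string name = "Niels Stroustrup";

    std::string s = name.substr(6,10);      // position 6, length 10
    assert(s == "Stroustrup");

    name.replace(0,5,"Nicholas");           // 5 characters replaced by 8
    assert(name == "Nicholas Stroustrup");
    assert(name.size() == 19);              // the string grew by 3

    assert(name.substr(9,100) == "Stroustrup");   // length is clamped at the end
}
```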
3.5.1 C-Style Strings [tour2.cstring]
A C-style string is a zero-terminated array of characters (§5.2.2). As shown, we can easily enter a
C-style string into a string. To call functions that take C-style strings, we need to be able to extract
the value of a string in the form of a C-style string. The c_str() function does that (§20.4.1). For
example, we can print the name using the C output function printf() (§21.8) like this:
    void f()
    {
        printf("name: %s\n",name.c_str());
    }
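A short sketch of c_str() with other C library functions follows. It assumes only the standard <cstring> and <cstdio> facilities; c_str_demo() is a hypothetical name. The pointer returned by c_str() refers to storage owned by the string, so it should be used only while the string exists and has not been modified.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

void c_str_demo()
{
    std::string name = "Niels Stroustrup";

    const char* p = name.c_str();     // zero-terminated view; owned by name
    assert(std::strlen(p) == name.size());
    assert(std::strcmp(p, "Niels Stroustrup") == 0);

    std::printf("name: %s\n", p);     // any function taking a C-style string will do
}
```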
3.6 Input [tour2.istream]
The standard library offers istreams for input. Like ostreams, istreams deal with character string
representations of built-in types and can easily be extended to cope with user-defined types.
The operator >> (‘‘get from’’) is used as an input operator; cin is the standard input stream.
The type of the right-hand operand of >> determines what input is accepted and what is the target
of the input operation. For example,
    void f()
    {
        int i;
        cin >> i;   // read an integer into i
        double d;
        cin >> d;   // read a double-precision, floating-point number into d
    }
reads a number, such as 1234, from the standard input into the integer variable i and a floating-point number, such as 12.34e5, into the double-precision, floating-point variable d.
Here is an example that performs inch-to-centimeter and centimeter-to-inch conversions. You
input a number followed by a character indicating the unit: centimeters or inches. The program
then outputs the corresponding value in the other unit:
    int main()
    {
        const float factor = 2.54;  // 1 inch equals 2.54 cm
        float x, in, cm;
        char ch = 0;
        cout << "enter length: ";
        cin >> x;       // read a floating-point number
        cin >> ch;      // read a suffix
        switch (ch) {
        case 'i':       // inch
            in = x;
            cm = x*factor;
            break;
        case 'c':       // cm
            in = x/factor;
            cm = x;
            break;
        default:
            in = cm = 0;
            break;
        }
        cout << in << " in = " << cm << " cm\n";
    }
The switch-statement tests a value against a set of constants. The break-statements are used to exit
the switch-statement. The case constants must be distinct. If the value tested does not match any of
them, the default is chosen. The programmer need not provide a default.
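The conversion logic can be separated from cin so that it works with any input stream, which also makes it easy to try out. The sketch below is one possible arrangement, not the program above: read_length() and read_length_demo() are hypothetical names, the function reports failure through its return value, and a std::istringstream stands in for keyboard input. The number and the unit are separated by a space in the sample input.

```cpp
#include <cassert>
#include <cmath>
#include <istream>
#include <sstream>

// Hypothetical helper: reads "<number> <unit>" from is, where unit is i or c.
// Returns false if the read fails or the unit is not recognized.
bool read_length(std::istream& is, float& in, float& cm)
{
    const float factor = 2.54f;     // 1 inch equals 2.54 cm
    float x;
    char ch = 0;
    if (!(is >> x >> ch)) return false;     // input was not a number and a character

    switch (ch) {
    case 'i':       // inch
        in = x;
        cm = x*factor;
        return true;
    case 'c':       // cm
        in = x/factor;
        cm = x;
        return true;
    default:        // unknown unit
        in = cm = 0;
        return false;
    }
}

void read_length_demo()
{
    float in = 0;
    float cm = 0;

    std::istringstream inches("10 i");
    assert(read_length(inches, in, cm));
    assert(in == 10.0f);
    assert(std::fabs(cm - 25.4f) < 1e-4f);

    std::istringstream bad("oops");
    assert(!read_length(bad, in, cm));      // no number could be read
}
```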
Often, we want to read a sequence of characters. A convenient way of doing that is to read into
a string. For example:
    int main()
    {
        string str;
        cout << "Please enter your name\n";
        cin >> str;
        cout << "Hello, " << str << "!\n";
    }
If you type in
    Eric
the response is
    Hello, Eric!
By default, a whitespace character (§5.2.2) such as a space terminates the read, so if you enter
    Eric Bloodaxe
pretending to be the ill-fated king of York, the response is still
    Hello, Eric!
You can read a whole line using the getline() function. For example:
    int main()
    {
        string str;
        cout << "Please enter your name\n";
        getline(cin,str);
        cout << "Hello, " << str << "!\n";
    }
With this program, the input
    Eric Bloodaxe
yields the desired output:
    Hello, Eric Bloodaxe!
The standard strings have the nice property of expanding to hold what you put in them, so if you
enter a couple of megabytes of semicolons, the program will echo pages of semicolons back at you
– unless your machine or operating system runs out of some critical resource first.
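The difference between >> and getline() can be seen side by side by reading the same text both ways. In this sketch a std::istringstream supplies the input; first_word(), first_line() and read_demo() are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

std::string first_word(const std::string& input)
{
    std::istringstream is(input);
    std::string str;
    is >> str;                  // skips leading whitespace, stops at whitespace
    return str;
}

std::string first_line(const std::string& input)
{
    std::istringstream is(input);
    std::string str;
    std::getline(is, str);      // reads up to the newline, which is discarded
    return str;
}

void read_demo()
{
    assert(first_word("Eric Bloodaxe\n") == "Eric");
    assert(first_line("Eric Bloodaxe\n") == "Eric Bloodaxe");
    assert(first_word("   Eric") == "Eric");        // leading spaces skipped
    assert(first_line("   Eric") == "   Eric");     // leading spaces kept
}
```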
3.7 Containers [tour2.stl]
Much computing involves creating collections of various forms of objects and then manipulating
such collections. Reading characters into a string and printing out the string is a simple example.
A class with the main purpose of holding objects is commonly called a container. Providing suitable containers for a given task and supporting them with useful fundamental operations are important steps in the construction of any program.
To illustrate the standard library’s most useful containers, consider a simple program for keeping names and telephone numbers. This is the kind of program for which different approaches
appear ‘‘simple and obvious’’ to people of different backgrounds.
3.7.1 Vector [tour2.vector]
For many C programmers, a built-in array of (name,number) pairs would seem to be a suitable
starting point:
    struct Entry {
        string name;
        int number;
    };
    Entry phone_book[1000];
    void print_entry(int i)     // simple use
    {
        cout << phone_book[i].name << ' ' << phone_book[i].number << '\n';
    }
However, a built-in array has a fixed size. If we choose a large size, we waste space; if we choose a
smaller size, the array will overflow. In either case, we will have to write low-level memory-management code. The standard library provides a vector (§16.3) that takes care of that:
    vector<Entry> phone_book(1000);
    void print_entry(int i)     // simple use, exactly as for array
    {
        cout << phone_book[i].name << ' ' << phone_book[i].number << '\n';
    }
    void add_entries(int n)     // increase size by n
    {
        phone_book.resize(phone_book.size()+n);
    }
The vector member function size() gives the number of elements.
Note the use of parentheses in the definition of phone_book. We made a single object of type
vector<Entry> and supplied its initial size as an initializer. This is very different from declaring a
built-in array:
    vector<Entry> book(1000);       // vector of 1000 elements
    vector<Entry> books[1000];      // 1000 empty vectors
Should you make the mistake of using [] where you meant () when declaring a vector, your compiler will almost certainly catch the mistake and issue an error message when you try to use the
vector.
A vector is a single object that can be assigned. For example:
    void f(vector<Entry>& v)
    {
        vector<Entry> v2 = phone_book;
        v = v2;
        // ...
    }
Assigning a vector involves copying its elements. Thus, after the initialization and assignment in
f(), v and v2 each holds a separate copy of every Entry in the phone book. When a vector holds
many elements, such innocent-looking assignments and initializations can be prohibitively expensive. Where copying is undesirable, references or pointers should be used.
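The following sketch illustrates both halves of that advice: initializing one vector from another yields an independent copy, while a function that takes its argument by const reference inspects the caller's vector without copying it. The names count_named() and vector_copy_demo() are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Entry {
    std::string name;
    int number;
};

// Taking the vector by const reference avoids copying its elements.
int count_named(const std::vector<Entry>& book, const std::string& name)   // hypothetical helper
{
    int n = 0;
    for (std::vector<Entry>::size_type i = 0; i != book.size(); ++i)
        if (book[i].name == name) ++n;
    return n;
}

void vector_copy_demo()
{
    std::vector<Entry> phone_book(3);
    phone_book[0].name = "Eric";

    std::vector<Entry> v2 = phone_book;   // copies all three elements
    v2[0].name = "Niels";                 // changes the copy only

    assert(phone_book[0].name == "Eric");
    assert(count_named(phone_book, "Eric") == 1);
    assert(count_named(v2, "Eric") == 0);
}
```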
3.7.2 Range Checking [tour2.range]
The standard library vector does not provide range checking by default (§16.3.3). For example:
    void f()
    {
        int i = phone_book[1001].number;    // 1001 is out of range
        // ...
    }
The initialization is likely to place some random value in i rather than giving an error. This is
undesirable, so I will use a simple range-checking adaptation of vector, called Vec, in the following
chapters. A Vec is like a vector, except that it throws an exception of type out_of_range if a subscript is out of range.
Techniques for implementing types such as Vec and for using exceptions effectively are discussed in §11.12, §8.3, and Chapter 14. However, the definition here is sufficient for the examples
in this book:
    template<class T> class Vec : public vector<T> {
    public:
        Vec() : vector<T>() { }
        Vec(int s) : vector<T>(s) { }
        T& operator[](int i) { return vector<T>::at(i); }               // range-checked
        const T& operator[](int i) const { return vector<T>::at(i); }   // range-checked
    };
The at() operation is a vector subscript operation that throws an exception of type out_of_range
if its argument is out of the vector’s range (§16.3.3). It is written vector<T>::at(i) rather than plain at(i) because at() is a member of a base class that depends on the template parameter T; an unqualified name would not be looked up there.
Returning to the problem of keeping names and telephone numbers, we can now use a Vec to
ensure that out-of-range accesses are caught. For example:
    Vec<Entry> phone_book(1000);
    void print_entry(int i)     // simple use, exactly as for vector
    {
        cout << phone_book[i].name << ' ' << phone_book[i].number << '\n';
    }
An out-of-range access will throw an exception that the user can catch. For example:
    void f()
    {
        try {
            for (int i = 0; i<10000; i++) print_entry(i);
        }
        catch (out_of_range) {
            cout << "range error\n";
        }
    }
The exception will be thrown, and then caught, when phone_book[i] is tried with i==1000.
If the user doesn’t catch this kind of exception, the program will terminate in a well-defined manner
rather than proceeding or failing in an undefined manner. One way to minimize surprises from
exceptions is to use a main() with a try-block as its body:
    int main()
    try {
        // your code
    }
    catch (out_of_range) {
        cerr << "range error\n";
    }
    catch (...) {
        cerr << "unknown exception thrown\n";
    }
This provides default exception handlers so that if we fail to catch some exception, an error message is printed on the standard error-diagnostic output stream cceerrrr (§21.2.1).
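For readers who want to try range checking in isolation, here is a self-contained sketch of a Vec-like class and a check that an out-of-range subscript really throws. It is an illustration under stated assumptions, not the book's definitive text: names are qualified with std::, at() is called as std::vector<T>::at because it belongs to a base class that depends on T, and is_range_error() and vec_demo() are hypothetical helpers.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

template<class T> class Vec : public std::vector<T> {
public:
    Vec() : std::vector<T>() { }
    Vec(int s) : std::vector<T>(s) { }

    // at() is qualified because it lives in a base class that depends on T
    T& operator[](int i) { return std::vector<T>::at(i); }
    const T& operator[](int i) const { return std::vector<T>::at(i); }
};

// Hypothetical helper: returns true if assigning to v[i] throws out_of_range.
bool is_range_error(Vec<int>& v, int i)
{
    try {
        v[i] = 0;
    }
    catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

void vec_demo()
{
    Vec<int> v(1000);
    assert(!is_range_error(v, 0));
    assert(!is_range_error(v, 999));    // last valid index
    assert(is_range_error(v, 1000));    // one past the last valid index
    assert(is_range_error(v, -1));      // converts to a huge unsigned index
}
```

Catching the exception by const reference, as here, avoids copying the exception object; catching by value as in the text above also works.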
3.7.3 List [tour2.list]
Insertion and deletion of phone book entries could be common. Therefore, a list could be more
appropriate than a vector for representing a simple phone book. For example:
    list<Entry> phone_book;
When we use a list, we tend not to access elements using subscripting the way we commonly do for
vectors. Instead, we might search the list looking for an element with a given value. To do this, we
take advantage of the fact that a list is a sequence as described in §3.8:
    void print_entry(const string& s)
    {
        typedef list<Entry>::const_iterator LI;
        for (LI i = phone_book.begin(); i != phone_book.end(); ++i) {
            const Entry& e = *i;    // reference used as shorthand; const because i is a const_iterator
            if (s == e.name) cout << e.name << ' ' << e.number << '\n';
        }
    }
The search for s starts at the beginning of the list and proceeds until either s is found or the end is
reached. Every standard library container provides the functions begin() and end(), which return
an iterator to the first and to one-past-the-last element, respectively (§16.3.2). Given an iterator i,
the next element is ++i. Given an iterator i, the element it refers to is *i.
A user need not know the exact type of the iterator for a standard container. That iterator type is
part of the definition of the container and can be referred to by name. When we don’t need to modify an element of the container, const_iterator is the type we want. Otherwise, we use the plain
iterator type (§16.3.1).
Adding elements to a list is easy:
    void add_entry(Entry& e, list<Entry>::iterator i)
    {
        phone_book.push_front(e);   // add at beginning
        phone_book.push_back(e);    // add at end
        phone_book.insert(i,e);     // add before the element ‘i’ refers to
    }
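To see where each of the three operations places its element, the sketch below builds a small list and checks the resulting order. The entries are made up, and names() and list_demo() are hypothetical helpers; the traversal uses a const_iterator as described above.

```cpp
#include <cassert>
#include <list>
#include <string>

struct Entry {
    std::string name;
    int number;
};

// Hypothetical helper: the names in list order, separated by spaces.
std::string names(const std::list<Entry>& book)
{
    std::string result;
    for (std::list<Entry>::const_iterator i = book.begin(); i != book.end(); ++i) {
        if (!result.empty()) result += ' ';
        result += i->name;
    }
    return result;
}

void list_demo()
{
    Entry a = { "Ann", 1 };     // made-up entries
    Entry b = { "Bob", 2 };
    Entry c = { "Cy", 3 };

    std::list<Entry> phone_book;
    phone_book.push_back(b);    // Bob
    phone_book.push_front(a);   // Ann Bob

    std::list<Entry>::iterator i = phone_book.end();
    --i;                        // i refers to Bob
    phone_book.insert(i, c);    // Ann Cy Bob

    assert(names(phone_book) == "Ann Cy Bob");
    assert(i->name == "Bob");   // i is still valid after the insertion
}
```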
3.7.4 Map [tour2.map]
Writing code to look up a name in a list of (name,number) pairs is really quite tedious. In addition,
a linear search is quite inefficient for all but the shortest lists. Other data structures directly support
insertion, deletion, and searching based on values. In particular, the standard library provides the
map type (§17.4.1). A map is a container of pairs of values. For example:
    map<string,int> phone_book;
In other contexts, a map is known as an associative array or a dictionary.
When indexed by a value of its first type (called the key) a map returns the corresponding value
of the second type (called the value or the mapped type). For example:
    void print_entry(const string& s)
    {
        if (int i = phone_book[s]) cout << s << ' ' << i << '\n';
    }
If no match was found for the key s, a default value is returned from the phone_book. The default
value for an integer type in a map is 0. Here, I assume that 0 isn’t a valid telephone number.
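One consequence of subscripting is worth spelling out: looking up a key that is not present with [] also inserts that key with the default value. When that is unwanted, the map's find() member can be used instead. The sketch below shows both behaviors; lookup() and map_demo() are hypothetical names and the telephone number is made up.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: looks up s without inserting it; returns 0 if absent.
int lookup(const std::map<std::string,int>& phone_book, const std::string& s)
{
    std::map<std::string,int>::const_iterator p = phone_book.find(s);
    return p == phone_book.end() ? 0 : p->second;
}

void map_demo()
{
    std::map<std::string,int> phone_book;
    phone_book["Eric"] = 1234;              // made-up number
    assert(phone_book["Eric"] == 1234);

    assert(lookup(phone_book, "Niels") == 0);
    assert(phone_book.size() == 1);         // find() did not add an entry

    assert(phone_book["Niels"] == 0);       // subscripting an absent key yields the default...
    assert(phone_book.size() == 2);         // ...and inserts an entry for it
}
```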
3.7.5 Standard Containers [tour2.stdcontainer]
A map, a list, and a vector can each be used to represent a phone book. However, each has
strengths and weaknesses. For example, subscripting a vector is cheap and easy. On the other
hand, inserting an element between two elements tends to be expensive. A list has exactly the
opposite properties. A map resembles a list of (key,value) pairs except that it is optimized for finding values based on keys.
The standard library provides some of the most general and useful container types to allow the
programmer to select a container that best serves the needs of an application:
_________________________________________________________________
Standard Container Summary
_________________________________________________________________
vector<T>            A variable-sized vector (§16.3)
list<T>              A doubly-linked list (§17.2.2)
queue<T>             A queue (§17.3.2)
stack<T>             A stack (§17.3.1)
deque<T>             A double-ended queue (§17.2.3)
priority_queue<T>    A queue sorted by value (§17.3.3)
set<T>               A set (§17.4.3)
multiset<T>          A set in which a value can occur many times (§17.4.4)
map<key,val>         An associative array (§17.4.1)
multimap<key,val>    A map in which a key can occur many times (§17.4.2)
_________________________________________________________________
The standard containers are presented in §16.2, §16.3, and Chapter 17. The containers are defined
in namespace std and presented in headers <vector>, <list>, <map>, etc. (§16.2).
The standard containers and their basic operations are designed to be similar from a notational
point of view. Furthermore, the meanings of the operations are equivalent for the various containers. In general, basic operations apply to every kind of container. For example, push_back() can
be used (reasonably efficiently) to add elements to the end of a vector as well as for a list, and
every container has a size() member function that returns its number of elements.
This notational and semantic uniformity enables programmers to provide new container types
that can be used in a very similar manner to the standard ones. The range-checked vector, Vec
(§3.7.2), is an example of that. Chapter 17 demonstrates how a hash_map can be added to the
framework. The uniformity of container interfaces also allows us to specify algorithms independently of individual container types.
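As a small illustration of this uniformity, one function template can serve both a vector and a list because it relies only on push_back() and size(). The name append_three is an assumption made for this sketch.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Works for any container that provides push_back() and size().
template<class C>
std::size_t append_three(C& c)
{
    c.push_back(1);
    c.push_back(2);
    c.push_back(3);
    return c.size();    // number of elements after appending
}
```

The template never names a container type; any user-defined container with the same two operations would work equally well.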
3.8 Algorithms [tour2.algorithms]
A data structure, such as a list or a vector, is not very useful on its own. To use one, we need operations for basic access such as adding and removing elements. Furthermore, we rarely just store
objects in a container. We sort them, print them, extract subsets, remove elements, search for
objects, etc. Consequently, the standard library provides the most common algorithms for containers in addition to providing the most common container types. For example, the following sorts a
vector and places a copy of each unique vector element on a list:
    void f(vector<Entry>& ve, list<Entry>& le)
    {
        sort(ve.begin(),ve.end());
        unique_copy(ve.begin(),ve.end(),le.begin());
    }
The standard algorithms are described in Chapter 18. They are expressed in terms of sequences of
elements (§2.7.2). A sequence is represented by a pair of iterators specifying the first element and
the one-beyond-the-last element. In the example, sort() sorts the sequence from ve.begin() to
ve.end() – which just happens to be all the elements of a vector. For writing, you need only to
specify the first element to be written. If more than one element is written, the elements following
that initial element will be overwritten.
If we wanted to add the new elements to the end of a container, we could have written:
    void f(vector<Entry>& ve, list<Entry>& le)
    {
        sort(ve.begin(),ve.end());
        unique_copy(ve.begin(),ve.end(),back_inserter(le));    // append to le
    }
A back_inserter() adds elements at the end of a container, extending the container to make room
for them (§19.2.4). C programmers will appreciate that the standard containers plus
back_inserter()s eliminate the need to use error-prone, explicit C-style memory management
using realloc() (§16.3.5). Forgetting to use a back_inserter() when appending can lead to
errors. For example:
    void f(list<Entry>& ve, vector<Entry>& le)
    {
        copy(ve.begin(),ve.end(),le);            // error: le not an iterator
        copy(ve.begin(),ve.end(),le.end());      // bad: writes beyond the end
        copy(ve.begin(),ve.end(),le.begin());    // overwrite elements
    }
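A minimal, self-contained sketch of the safe variant, using int elements instead of Entry (an assumption made to keep the example short): the destination starts empty, so only back_inserter() can make room for the output.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <vector>

// Sort a copy of the input and append each distinct value to an initially empty list.
std::list<int> distinct_sorted(std::vector<int> v)
{
    std::sort(v.begin(), v.end());
    std::list<int> out;                                              // empty: nothing to overwrite
    std::unique_copy(v.begin(), v.end(), std::back_inserter(out));   // grows out as needed
    return out;
}
```

Passing out.begin() instead would write through an iterator into an empty list, which is exactly the kind of error described above.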
3.8.1 Use of Iterators [tour2.iteruse]
When you first encounter a container, a few iterators referring to useful elements can be obtained;
begin() and end() are the best examples of this. In addition, many algorithms return iterators.
For example, the standard algorithm find looks for a value in a sequence and returns an iterator to
the element found. Using find, we can write a function that counts the number of occurrences of a
character in a string:
    int count(const string& s, char c)
    {
        string::const_iterator i = find(s.begin(),s.end(),c);
        int n = 0;
        while (i != s.end()) {
            ++n;
            i = find(i+1,s.end(),c);
        }
        return n;
    }
The find algorithm returns an iterator to the first occurrence of a value in a sequence or the one-past-the-end iterator. Consider what happens for a simple call of count:
    void f()
    {
        string m = "Mary had a little lamb";
        int a_count = count(m,'a');
    }
The first call to find() finds the 'a' in Mary. Thus, the iterator points to that character and not to
s.end(), so we enter the loop. In the loop, we start the search at i+1; that is, we start one past
where we found the 'a'. We then loop finding the other three 'a's. That done, find() reaches
the end and returns s.end() so that the condition i!=s.end() fails and we exit the loop.
That call of count() could be graphically represented like this:
[Figure: the string "Mary had a little lamb" with arrows pointing at the successive positions of the iterator i.]
The arrows indicate the initial, intermediate, and final values of the iterator i.
Naturally, the find algorithm will work equivalently on every standard container. Consequently, we could generalize the count() function in the same way:
    template<class C, class T> int count(const C& v, T val)
    {
        typename C::const_iterator i = find(v.begin(),v.end(),val);    // "typename;" see §C.13.5
        int n = 0;
        while (i != v.end()) {
            ++n;
            ++i;    // skip past the element we just found
            i = find(i,v.end(),val);
        }
        return n;
    }
This works, so we can say:
    void f(list<complex>& lc, vector<string>& vc, string s)
    {
        int i1 = count(lc,complex(1,3));
        int i2 = count(vc,"Chrysippus");
        int i3 = count(s,'x');
    }
However, we don’t have to define a count template. Counting occurrences of an element is so generally useful that the standard library provides that algorithm. To be fully general, the standard
library count takes a sequence as its argument, rather than a container, so we would say:
    void f(list<complex>& lc, vector<string>& vs, string s)
    {
        int i1 = count(lc.begin(),lc.end(),complex(1,3));
        int i2 = count(vs.begin(),vs.end(),"Diogenes");
        int i3 = count(s.begin(),s.end(),'x');
    }
The use of a sequence allows us to use count for a built-in array and also to count parts of a container. For example:
    void g(char cs[], int sz)
    {
        int i1 = count(&cs[0],&cs[sz],'z');      // ’z’s in array
        int i2 = count(&cs[0],&cs[sz/2],'z');    // ’z’s in first half of array
    }
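The array case can be tried in isolation with a short sketch. The function names and the pointer-plus-size interface are assumptions made for illustration; std::count itself is the standard algorithm.

```cpp
#include <algorithm>
#include <cstddef>

// Count 'z' in the whole array: the sequence is [cs, cs+sz).
std::ptrdiff_t z_in_all(const char* cs, int sz)
{
    return std::count(cs, cs + sz, 'z');
}

// Count 'z' in the first half only: the sequence is [cs, cs+sz/2).
std::ptrdiff_t z_in_first_half(const char* cs, int sz)
{
    return std::count(cs, cs + sz/2, 'z');
}
```

Pointers into an array act as iterators here, so no container is involved at all.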
3.8.2 Iterator Types [tour2.iter]
What are iterators really? Any particular iterator is an object of some type. There are, however,
many different iterator types because an iterator needs to hold the information necessary for doing
its job for a particular container type. These iterator types can be as different as the containers and
the specialized needs they serve. For example, a vector’s iterator is most likely an ordinary pointer
because a pointer is quite a reasonable way of referring to an element of a vector:
[Figure: an iterator p pointing to an element of a vector holding the characters of "Piet Hein".]
Alternatively, a vector iterator could be implemented as a pointer to the vector plus an index:
[Figure: an iterator represented as (start == p, position == 3) referring to an element of a vector holding the characters of "Piet Hein".]
Using such an iterator would allow range checking (§19.3).
A list iterator must be something more complicated than a simple pointer to an element because
an element of a list in general does not know where the next element of that list is. Thus, a list iterator might be a pointer to a link:
[Figure: a list iterator p pointing to one link in a chain of links; each link refers to one of the elements P, i, e, t, ...]
What is common for all iterators is their semantics and the naming of their operations. For example, applying ++ to any iterator yields an iterator that refers to the next element. Similarly, * yields
the element to which the iterator refers. In fact, any object that obeys a few simple rules like these
is an iterator (§19.2.1). Furthermore, users rarely need to know the type of a specific iterator; each
container ‘‘knows’’ its iterator types and makes them available under the conventional names iterator and const_iterator. For example, list<Entry>::iterator is the general iterator type for
list<Entry>. I rarely have to worry about the details of how that type is defined.
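To make the "few simple rules" concrete, here is a sketch of a hand-written iterator over a chain of links, together with a function that uses only !=, ++, and *. Link, Link_iter, and count_occurrences are hypothetical names invented for this illustration; a real list iterator would provide more operations.

```cpp
// A hypothetical singly-linked node holding one character.
struct Link {
    char value;
    Link* next;
};

// A minimal iterator over a chain of Links.
class Link_iter {
public:
    explicit Link_iter(Link* p) : curr(p) {}
    char& operator*() const { return curr->value; }                 // element referred to
    Link_iter& operator++() { curr = curr->next; return *this; }    // move to next element
    bool operator!=(const Link_iter& other) const { return curr != other.curr; }
private:
    Link* curr;
};

// Written only in terms of !=, ++, and *, so it accepts built-in pointers,
// standard container iterators, and Link_iter alike.
template<class Iter, class T>
int count_occurrences(Iter first, Iter last, const T& val)
{
    int n = 0;
    for (; first != last; ++first)
        if (*first == val) ++n;
    return n;
}
```

The same function body serves a pointer into an array and a Link_iter, even though "moving to the next element" means something quite different for each.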
3.8.3 Iterators and I/O [tour2.ioiterators]
Iterators are a general and useful concept for dealing with sequences of elements in containers.
However, containers are not the only place where we find sequences of elements. For example, an
input stream produces a sequence of values and we write a sequence of values to an output stream.
Consequently, the notion of iterators can be usefully applied to input and output.
To make an ostream_iterator, we need to specify which stream will be used and the type of
objects written to it. For example, we can define an iterator that refers to the standard output
stream, cout:
    ostream_iterator<string> oo(cout);
The effect of assigning to *oo is to write the assigned value to cout. For example:
    int main()
    {
        *oo = "Hello, ";     // meaning cout << "Hello, "
        ++oo;
        *oo = "world!\n";    // meaning cout << "world!\n"
    }
This is yet another way of writing the canonical message to standard output. The ++oo is done to
mimic writing into an array through a pointer. This way wouldn’t be my first choice for that simple
task, but the utility of treating output as a write-only container will soon be obvious – if it isn’t
already.
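The effect is easy to observe if the iterator writes to a string stream instead of cout. The sketch below assumes a std::ostringstream as a stand-in for the output stream; the function name is illustrative.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Write two strings through an ostream_iterator and return what the stream received.
std::string hello_via_iterator()
{
    std::ostringstream out;                        // stands in for cout
    std::ostream_iterator<std::string> oo(out);

    *oo = "Hello, ";     // same effect as out << "Hello, "
    ++oo;
    *oo = "world!\n";    // same effect as out << "world!\n"
    return out.str();
}
```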
Similarly, an istream_iterator is something that allows us to treat an input stream as a read-only container. Again, we must specify the stream to be used and the type of values expected:
    istream_iterator<string> ii(cin);
Because input iterators invariably appear in pairs representing a sequence, we must provide an
istream_iterator to indicate the end of input. This is the default istream_iterator:
    istream_iterator<string> eos;
We could now read Hello, world! from input and write it out again like this:
    int main()
    {
        string s1 = *ii;
        ++ii;
        string s2 = *ii;
        cout << s1 << ' ' << s2 << '\n';
    }
Actually, istream_iterators and ostream_iterators are not meant to be used directly. Instead, they
are typically provided as arguments to algorithms. For example, we can write a simple program to
read a file, sort the words read, eliminate duplicates, and write the result to another file:
    int main()
    {
        string from, to;
        cin >> from >> to;                       // get source and target file names

        ifstream is(from.c_str());               // input stream (c_str(); see §3.5)
        istream_iterator<string> ii(is);         // input iterator for stream
        istream_iterator<string> eos;            // input sentinel

        vector<string> b(ii,eos);                // b is a vector initialized from input
        sort(b.begin(),b.end());                 // sort the buffer

        ofstream os(to.c_str());                 // output stream
        ostream_iterator<string> oo(os,"\n");    // output iterator for stream

        unique_copy(b.begin(),b.end(),oo);       // copy buffer to output,
                                                 // discard replicated values
        return !is.eof() || !os;                 // return error state (§3.2, §21.3.3)
    }
An ifstream is an istream that can be attached to a file, and an ofstream is an ostream that can be
attached to a file. The ostream_iterator’s second argument is used to delimit output values.
3.8.4 Traversals and Predicates [tour2.traverse]
Iterators allow us to write loops to iterate through a sequence. However, writing loops can be
tedious, so the standard library provides ways for a function to be called for each element of a
sequence.
Consider writing a program that reads words from input and records the frequency of their
occurrence. The obvious representation of the strings and their associated frequencies is a map:
    map<string,int> histogram;
The obvious action to be taken for each string to record its frequency is:
    void record(const string& s)
    {
        histogram[s]++;    // record frequency of ‘‘s’’
    }
Once the input has been read, we would like to output the data we have gathered. The map consists
of a sequence of (string,int) pairs. Consequently, we would like to call
    void print(const pair<const string,int>& r)
    {
        cout << r.first << ' ' << r.second << '\n';
    }
for each element in the map (the first element of a pair is called first, and the second element is
called second). The first element of the pair is a const string rather than a plain string because all
map keys are constants.
Thus, the main program becomes:
    int main()
    {
        istream_iterator<string> ii(cin);
        istream_iterator<string> eos;
        for_each(ii,eos,record);
        for_each(histogram.begin(),histogram.end(),print);
    }
Note that we don’t need to sort the map to get the output in order. A map keeps its elements
ordered so that an iteration traverses the map in (increasing) order.
Many programming tasks involve looking for something in a container rather than simply doing
something to every element. For example, the find algorithm (§18.5.2) provides a convenient way
of looking for a specific value. A more general variant of this idea looks for an element that fulfills
a specific requirement. For example, we might want to search a map for the first value larger than
42. A map is a sequence of (key,value) pairs, so we search that list for a pair<const string,int>
where the int is greater than 42:
    bool gt_42(const pair<const string,int>& r)
    {
        return r.second>42;
    }
    void f(map<string,int>& m)
    {
        typedef map<string,int>::const_iterator MI;
        MI i = find_if(m.begin(),m.end(),gt_42);
        // ...
    }
Alternatively, we could count the number of words with a frequency higher than 42:
    void g(const map<string,int>& m)
    {
        int c42 = count_if(m.begin(),m.end(),gt_42);
        // ...
    }
A function, such as gt_42(), that is used to control the algorithm is called a predicate. A predicate
is called for each element and returns a Boolean value, which the algorithm uses to perform its
intended action. For example, find_if() searches until its predicate returns true to indicate that an
element of interest has been found. Similarly, count_if() counts the number of times its predicate
is true.
The standard library provides a few useful predicates and some templates that are useful for creating more (§18.4.2).
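Both predicate-driven algorithms can be tried on a small map. The sample data and the helper names first_frequent and frequent_count are assumptions made for this sketch; gt_42 is the predicate from the text.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

bool gt_42(const std::pair<const std::string,int>& r)
{
    return r.second > 42;
}

// Hypothetical sample data.
std::map<std::string,int> sample_histogram()
{
    std::map<std::string,int> m;
    m["and"] = 97;
    m["iterator"] = 12;
    m["the"] = 120;
    return m;
}

// Key of the first element (in key order) whose count exceeds 42, or "" if there is none.
std::string first_frequent(const std::map<std::string,int>& m)
{
    std::map<std::string,int>::const_iterator i = std::find_if(m.begin(), m.end(), gt_42);
    return i == m.end() ? std::string() : i->first;
}

// Number of elements whose count exceeds 42.
std::ptrdiff_t frequent_count(const std::map<std::string,int>& m)
{
    return std::count_if(m.begin(), m.end(), gt_42);
}
```

Note that "first" means first in the map's key order, not the element with the largest count.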
3.8.5 Algorithms Using Member Functions [tour2.memp]
Many algorithms apply a function to elements of a sequence. For example, in §3.8.4
    for_each(ii,eos,record);
calls record() to read strings from input.
Often, we deal with containers of pointers and we really would like to call a member function of
the object pointed to, rather than a global function on the pointer. For example, we might want to
call the member function Shape::draw() for each element of a list<Shape*>. To handle this
specific example, we simply write a nonmember function that invokes the member function. For
example:
    void draw(Shape* p)
    {
        p->draw();
    }
    void f(list<Shape*>& sh)
    {
        for_each(sh.begin(),sh.end(),draw);
    }
By generalizing this technique, we can write the example like this:
    void g(list<Shape*>& sh)
    {
        for_each(sh.begin(),sh.end(),mem_fun(&Shape::draw));
    }
The standard library mem_fun() template (§18.4.4.2) takes a pointer to a member function (§15.5)
as its argument and produces something that can be called for a pointer to the member’s class. The
result of mem_fun(&Shape::draw) takes a Shape* argument and returns whatever
Shape::draw() returns.
The mem_fun() mechanism is important because it allows the standard algorithms to be used
for containers of polymorphic objects.
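A compilable sketch of the same idea follows. It deliberately swaps in std::mem_fn from <functional>, because mem_fun() was deprecated in C++11 and removed in C++17; with a compiler for the 1998 standard, mem_fun() would be used exactly as shown above. The Shape and Circle classes are minimal stand-ins invented for this example.

```cpp
#include <algorithm>
#include <functional>
#include <list>

// Hypothetical minimal shape hierarchy; draw() just counts calls here.
struct Shape {
    virtual void draw() = 0;
    virtual ~Shape() {}
};

struct Circle : Shape {
    int drawn;
    Circle() : drawn(0) {}
    void draw() { ++drawn; }
};

// Call draw() through each pointer; the virtual call selects Circle::draw().
void draw_all(std::list<Shape*>& sh)
{
    std::for_each(sh.begin(), sh.end(), std::mem_fn(&Shape::draw));
}
```

The list holds Shape* values, yet Circle::draw() runs: the adapter calls through the pointer, so the usual virtual dispatch applies.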
3.8.6 Standard Library Algorithms [tour2.algolist]
What is an algorithm? A general definition of an algorithm is ‘‘a finite set of rules which gives a
sequence of operations for solving a specific set of problems [and] has five important features:
Finiteness ... Definiteness ... Input ... Output ... Effectiveness’’ [Knuth,1968,§1.1]. In the context of
the C++ standard library, an algorithm is a set of templates operating on sequences of elements.
The standard library provides dozens of algorithms. The algorithms are defined in namespace
std and presented in the <algorithm> header. Here are a few I have found particularly useful:
______________________________________________________________________
Selected Standard Algorithms
______________________________________________________________________
for_each()       Invoke function for each element (§18.5.1)
find()           Find first occurrence of arguments (§18.5.2)
find_if()        Find first match of predicate (§18.5.2)
count()          Count occurrences of element (§18.5.3)
count_if()       Count matches of predicate (§18.5.3)
replace()        Replace element with new value (§18.6.4)
replace_if()     Replace element that matches predicate with new value (§18.6.4)
copy()           Copy elements (§18.6.1)
unique_copy()    Copy elements that are not duplicates (§18.6.1)
sort()           Sort elements (§18.7.1)
equal_range()    Find all elements with equivalent values (§18.7.2)
merge()          Merge sorted sequences (§18.7.3)
______________________________________________________________________
These algorithms, and many more (see Chapter 18), can be applied to elements of containers,
strings, and built-in arrays.
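Three of the algorithms in the table that have not yet appeared in an example are merge(), equal_range(), and replace(). The sketch below shows one plausible use of each on vectors of int; the wrapper function names are illustrative assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Merge two sorted sequences into a new sorted vector.
std::vector<int> merged(const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<int> out;
    std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
    return out;
}

// Number of elements equal to val in a sorted vector, found with equal_range().
std::ptrdiff_t run_length(const std::vector<int>& sorted, int val)
{
    std::pair<std::vector<int>::const_iterator, std::vector<int>::const_iterator> r =
        std::equal_range(sorted.begin(), sorted.end(), val);
    return r.second - r.first;
}

// Replace every occurrence of old_val with new_val, in place.
void substitute(std::vector<int>& v, int old_val, int new_val)
{
    std::replace(v.begin(), v.end(), old_val, new_val);
}
```

Both merge() and equal_range() require their input sequences to be sorted; replace() has no such requirement.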
3.9 Math [tour2.math]
Like C, C++ wasn’t designed primarily with numerical computation in mind. However, a lot of
numerical work is done in C++, and the standard library reflects that.
3.9.1 Complex Numbers [tour2.complex]
The standard library supports a family of complex number types along the lines of the complex
class described in §2.5.2. To support complex numbers where the scalars are single-precision,
floating-point numbers (floats), double precision numbers (doubles), etc., the standard library complex is a template:
    template<class scalar> class complex {
    public:
        complex(scalar re, scalar im);
        // ...
    };
The usual arithmetic operations and the most common mathematical functions are supported for
complex numbers. For example:
    // standard exponentiation function from <complex>:
    template<class C> complex<C> pow(const complex<C>&, int);

    void f(complex<float> fl, complex<double> db)
    {
        complex<long double> ld = fl+sqrt(db);
        db += fl*3;
        fl = pow(1/fl,2);
        // ...
    }
For more details, see §22.5.
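A small sketch of complex arithmetic that compiles with current standard libraries is given below. One caveat, stated as an observation about the library as standardized rather than about the example above: the arithmetic operators are templates on a single scalar type, so operands of different precision (or an int scalar with a complex<float>) generally need an explicit conversion first. The function name demo_complex is an assumption.

```cpp
#include <complex>

// Same-precision arithmetic works directly; mixing precisions is done
// here by widening the complex<float> explicitly.
std::complex<double> demo_complex(std::complex<float> fl, std::complex<double> db)
{
    std::complex<double> wide(fl);    // widen complex<float> to complex<double>
    db += wide * 3.0;                 // the scalar has the same type as the components
    return db + std::sqrt(std::complex<double>(-4.0, 0.0));    // principal square root of -4 is 2i
}
```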
3.9.2 Vector Arithmetic [tour2.valarray]
The vector described in §3.7.1 was designed to be a general mechanism for holding values, to be
flexible, and to fit into the architecture of containers, iterators, and algorithms. However, it does
not support mathematical vector operations. Adding such operations to vector would be easy, but
its generality and flexibility preclude optimizations that are often considered essential for serious
numerical work. Consequently, the standard library provides a vector, called valarray, that is less
general and more amenable to optimization for numerical computation:
    template<class T> class valarray {
        // ...
        T& operator[](size_t);
        // ...
    };
The type size_t is the unsigned integer type that the implementation uses for array indices.
The usual arithmetic operations and the most common mathematical functions are supported for
valarrays. For example:
    // standard absolute value function from <valarray>:
    template<class T> valarray<T> abs(const valarray<T>&);

    void f(valarray<double>& a1, valarray<double>& a2)
    {
        valarray<double> a = a1*3.14+a2/a1;
        a2 += a1*3.14;
        a = abs(a);
        double d = a2[7];
        // ...
    }
For more details, see §22.4.
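The element-wise meaning of these operations can be checked with a short sketch. The function name and the choice of a factor of 2.0 (picked so the results are exact) are assumptions made for illustration.

```cpp
#include <valarray>

// Element-wise arithmetic on whole arrays, without explicit loops.
std::valarray<double> scaled_ratio(const std::valarray<double>& a1,
                                   const std::valarray<double>& a2)
{
    std::valarray<double> a = a1*2.0 + a2/a1;    // each operation applies to every element
    a = std::abs(a);                             // element-wise absolute value
    return a;
}
```

Both operands must have the same number of elements; the library does not check this for us.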
3.9.3 Basic Numeric Support [tour2.basicnum]
Naturally, the standard library contains the most common mathematical functions – such as log(),
pow(), and cos() – for floating-point types; see §22.3. In addition, classes that describe the
properties of built-in types – such as the maximum exponent of a float – are provided; see §22.2.
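As a brief illustration, the properties of built-in types are reached through the numeric_limits template in <limits>, and the mathematical functions through <cmath>. The wrapper functions below are hypothetical; the values returned by the first two are implementation-defined, so no particular numbers are claimed.

```cpp
#include <cmath>
#include <limits>

// The largest value an int can hold on this implementation.
int largest_int() { return std::numeric_limits<int>::max(); }

// The maximum exponent of a float; implementation-defined.
int float_max_exponent() { return std::numeric_limits<float>::max_exponent; }

// A use of the common mathematical functions sqrt() and pow() on doubles.
double hypotenuse(double a, double b)
{
    return std::sqrt(std::pow(a, 2) + std::pow(b, 2));
}
```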
3.10 Standard Library Facilities [tour2.post]
The facilities provided by the standard library can be classified like this:
[1] Basic run-time language support (e.g., for allocation and run-time type information); see
§16.1.3.
[2] The C standard library (with very minor modifications to minimize violations of the type
system); see §16.1.2.
[3] Strings and I/O streams (with support for international character sets and localization); see
Chapter 20 and Chapter 21.
[4] A framework of containers (such as vector, list, and map) and algorithms using containers
(such as general traversals, sorts, and merges); see Chapter 16, Chapter 17, Chapter 18, and
Chapter 19.
[5] Support for numerical computation (complex numbers plus vectors with arithmetic operations, BLAS-like and generalized slices, and semantics designed to ease optimization); see
Chapter 22.
The main criterion for including a class in the library was that it would somehow be used by almost
every C++ programmer (both novices and experts), that it could be provided in a general form that
did not add significant overhead compared to a simpler version of the same facility, and that simple
uses should be easy to learn. Essentially, the C++ standard library provides the most common fundamental data structures together with the fundamental algorithms used on them.
Every algorithm works with every container without the use of conversions. This framework,
conventionally called the STL [Stepanov,1994], is extensible in the sense that users can easily provide containers and algorithms in addition to the ones provided as part of the standard and have
these work directly with the standard containers and algorithms.
3.11 Advice [tour2.advice]
[1] Don’t reinvent the wheel; use libraries.
[2] Don’t believe in magic; understand what your libraries do, how they do it, and at what cost
they do it.
[3] When you have a choice, prefer the standard library to other libraries.
[4] Do not think that the standard library is ideal for everything.
[5] Remember to #include the headers for the facilities you use; §3.3.
[6] Remember that standard library facilities are defined in namespace std; §3.3.
[7] Use string rather than char*; §3.5, §3.6.
[8] If in doubt use a range-checked vector (such as Vec); §3.7.2.
[9] Prefer vector<T>, list<T>, and map<key,value> to T[]; §3.7.1, §3.7.3, §3.7.4.
[10] When adding elements to a container, use push_back() or back_inserter(); §3.7.3, §3.8.
[11] Use push_back() on a vector rather than realloc() on an array; §3.8.
[12] Catch common exceptions in main(); §3.7.2.
Part I
Basic Facilities
This part describes C++’s built-in types and the basic facilities for constructing programs out of them. The C subset of C++ is presented together with C++’s additional
support for traditional styles of programming. It also discusses the basic facilities for
composing a C++ program out of logical and physical parts.
Chapters
4 Types and Declarations
5 Pointers, Arrays, and Structures
6 Expressions and Statements
7 Functions
8 Namespaces and Exceptions
9 Source Files and Programs
4 Types and Declarations
Accept nothing short of perfection!
– anon
Perfection is achieved
only on the point of collapse.
– C. N. Parkinson
Types — fundamental types — Booleans — characters — character literals — integers
— integer literals — floating-point types — floating-point literals — sizes — void —
enumerations — declarations — names — scope — initialization — objects — typedefs
— advice — exercises.
4.1 Types [dcl.type]
Consider
    x = y+f(2);
For this to make sense in a C++ program, the names x, y, and f must be suitably declared. That is,
the programmer must specify that entities named x, y, and f exist and that they are of types for
which = (assignment), + (addition), and () (function call), respectively, are meaningful.
Every name (identifier) in a C++ program has a type associated with it. This type determines
what operations can be applied to the name (that is, to the entity referred to by the name) and how
such operations are interpreted. For example, the declarations
    float x;         // x is a floating-point variable
    int y = 7;       // y is an integer variable with the initial value 7
    float f(int);    // f is a function taking an argument of type int and returning a floating-point number
would make the example meaningful. Because y is declared to be an int, it can be assigned to, used
in arithmetic expressions, etc. On the other hand, f is declared to be a function that takes an int as
its argument, so it can be called given a suitable argument.
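Put together, the declarations and the expression form a complete fragment. The body given to f here is purely hypothetical (the text only declares f), and the wrapper function compute exists only so that the fragment can be compiled and run.

```cpp
float f(int i)       // hypothetical definition: the text only declares f
{
    return i * 0.5f;
}

float compute()
{
    float x;         // x is a floating-point variable
    int y = 7;       // y is an integer variable with the initial value 7
    x = y + f(2);    // y is converted to float before the addition
    return x;
}
```

Without the three declarations, the compiler would reject x = y+f(2) because it could not tell what =, +, and () mean for those names.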
This chapter presents fundamental types (§4.1.1) and declarations (§4.9). Its examples just
demonstrate language features; they are not intended to do anything useful. More extensive and
realistic examples are saved for later chapters after more of C++ has been described. This chapter
simply provides the most basic elements from which C++ programs are constructed. You must
know these elements, plus the terminology and simple syntax that goes with them, in order to complete a real project in C++ and especially to read code written by others. However, a thorough
understanding of every detail mentioned in this chapter is not a requirement for understanding the
following chapters. Consequently, you may prefer to skim through this chapter, observing the
major concepts, and return later as the need for understanding of more details arises.
4.1.1 Fundamental Types [dcl.fundamental]
C++ has a set of fundamental types corresponding to the most common basic storage units of a
computer and the most common ways of using them to hold data:
    §4.2 A Boolean type (bool)
    §4.3 Character types (such as char)
    §4.4 Integer types (such as int)
    §4.5 Floating-point types (such as double)
In addition, a user can define
    §4.8 Enumeration types for representing specific sets of values (enum)
There also is
    §4.7 A type, void, used to signify the absence of information
From these types, we can construct other types:
    §5.1 Pointer types (such as int*)
    §5.2 Array types (such as char[])
    §5.5 Reference types (such as double&)
    §5.7 Data structures and classes (Chapter 10)
The Boolean, character, and integer types are collectively called integral types. The integral and
floating-point types are collectively called arithmetic types. Enumerations and classes (Chapter 10)
are called user-defined types because they must be defined by users rather than being available for
use without previous declaration, the way fundamental types are. In contrast, other types are called
built-in types.
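The classification above can be checked mechanically. The following is a minimal sketch, assuming that std::numeric_limits is specialized exactly for the arithmetic types (a program could add its own specializations, which would defeat the test). The names color, is_integral_type, is_arithmetic_type, and check_classification are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <limits>

enum color { red, green, blue };   // hypothetical user-defined type

// True for the integral types: bool, the character types, and the integer types.
template<class T> bool is_integral_type()
{
    return std::numeric_limits<T>::is_specialized && std::numeric_limits<T>::is_integer;
}

// True for the arithmetic types: the integral types plus the floating-point types.
// Assumption: nobody has specialized numeric_limits for a non-arithmetic type.
template<class T> bool is_arithmetic_type()
{
    return std::numeric_limits<T>::is_specialized;
}

void check_classification()
{
    assert(is_integral_type<bool>());
    assert(is_integral_type<char>());
    assert(is_integral_type<int>());
    assert(!is_integral_type<double>());    // floating-point, so arithmetic but not integral
    assert(is_arithmetic_type<double>());
    assert(!is_arithmetic_type<color>());   // an enumeration is a user-defined type
}
```

Calling check_classification() from main exercises every case; a failed assumption shows up as a failed assertion.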
The integral and floating-point types are provided in a variety of sizes to give the programmer a
choice of the amount of storage consumed, the precision, and the range available for computations
(§4.6). The assumption is that a computer provides bytes for holding characters, words for holding
and computing integer values, some entity most suitable for floating-point computation, and
addresses for referring to those entities. The C++ fundamental types together with pointers and
arrays present these machine-level notions to the programmer in a reasonably implementation-independent manner.
For most applications, one could simply use bool for logical values, char for characters, int for
integer values, and double for floating-point values. The remaining fundamental types are
variations for optimizations and special needs that are best ignored until such needs arise. They
must be known, however, to read old C and C++ code.
4.2 Booleans [dcl.bool]
A Boolean, bool, can have one of the two values true or false. A Boolean is used to express the
results of logical operations. For example:
    void f(int a, int b)
    {
        bool b1 = a==b;    // = is assignment, == is equality
        // ...
    }
If a and b have the same value, b1 becomes true; otherwise, b1 becomes false.
A common use of bool is as the type of the result of a function that tests some condition (a
predicate). For example:
    bool is_open(File*);
    bool greater(int a, int b) { return a>b; }
By definition, true has the value 1 when converted to an integer and false has the value 0. Conversely, integers can be implicitly converted to bool values: nonzero integers convert to true and 0
converts to false. For example:
    bool b = 7;      // bool(7) is true, so b becomes true
    int i = true;    // int(true) is 1, so i becomes 1
In arithmetic and logical expressions, bools are converted to ints; integer arithmetic and logical
operations are performed on the converted values. If the result is converted back to bool, a 0 is
converted to false and a nonzero value is converted to true.
    void g()
    {
        bool a = true;
        bool b = true;
        bool x = a+b;    // a+b is 2, so x becomes true
        bool y = a|b;    // a|b is 1, so y becomes true
    }
A pointer can be implicitly converted to a bool (§C.6.2.5). A nonzero pointer converts to true;
zero-valued pointers convert to false.
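The conversions described in this section can be collected into one small self-checking sketch. The function names count_true, has_target, and check_bool_conversions are hypothetical; nothing here goes beyond the rules stated above.

```cpp
#include <cassert>

// Counts how many of the three conditions hold; each bool is converted to int (0 or 1).
int count_true(bool a, bool b, bool c)
{
    return a + b + c;
}

// A pointer converts to true exactly when it is nonzero.
bool has_target(const int* p)
{
    return p;    // implicit pointer-to-bool conversion
}

void check_bool_conversions()
{
    bool b = 7;          // nonzero converts to true
    int i = true;        // true converts to 1
    assert(b == true);
    assert(i == 1);

    bool sum = b + b;    // 1+1 is 2; converting 2 back to bool yields true
    assert(sum);

    int n = 0;
    assert(has_target(&n));    // the address of an object is never zero
    assert(!has_target(0));    // the zero-valued pointer converts to false
}
```

A function such as count_true relies on the guarantee that true converts to exactly 1, not merely to some nonzero value.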
4.3 Character Types [dcl.char]
A variable of type char can hold a character of the implementation’s character set. For example:
    char ch = 'a';
Almost universally, a char has 8 bits so that it can hold one of 256 different values. Typically, the
character set is a variant of ISO-646, for example ASCII, thus providing the characters appearing
on your keyboard. Many problems arise from the fact that this set of characters is only partially
standardized (§C.3).
Serious variations occur between character sets supporting different natural languages and also
between different character sets supporting the same natural language in different ways. However,
here we are interested only in how such differences affect the rules of C++. The larger and more
interesting issue of how to program in a multi-lingual, multi-character-set environment is beyond
the scope of this book, although it is alluded to in several places (§20.2, §21.7, §C.3.3).
It is safe to assume that the implementation character set includes the decimal digits, the 26
alphabetic characters of English, and some of the basic punctuation characters. It is not safe to
assume that there are no more than 127 characters in an 8-bit character set (e.g., some sets provide
255 characters), that there are no more alphabetic characters than English provides (most European
languages provide more), that the alphabetic characters are contiguous (EBCDIC leaves a gap
between 'i' and 'j'), or that every character used to write C++ is available (e.g., some national
character sets do not provide { } [ ] | \; §C.3.1). Whenever possible, we should avoid making
assumptions about the representation of objects. This general rule applies even to characters.
Each character constant has an integer value. For example, the value of 'b' is 98 in the ASCII
character set. Here is a small program that will tell you the integer value of any character you care
to input:
    #include <iostream>
    int main()
    {
        char c;
        std::cin >> c;
        std::cout << "the value of '" << c << "' is " << int(c) << '\n';
    }
The notation int(c) gives the integer value for a character c. The possibility of converting a char
to an integer raises the question: is a char signed or unsigned? The 256 values represented by an
8-bit byte can be interpreted as the values 0 to 255 or as the values -127 to 127. Unfortunately,
which choice is made for a plain char is implementation-defined (§C.1, §C.3.4). C++ provides two
types for which the answer is definite: signed char, which can hold at least the values -127 to 127,
and unsigned char, which can hold at least the values 0 to 255. Fortunately, the difference matters
only for values outside the 0 to 127 range, and the most common characters are within that range.
Values outside that range stored in a plain char can lead to subtle portability problems. See
§C.3.4 if you need to use more than one type of char or if you store integers in char variables.
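One common way to sidestep the signedness question is to go through unsigned char before widening to int. The sketch below assumes an 8-bit char; the names char_code, plain_char_is_signed, and check_char_codes are hypothetical, and the value 200 is chosen only because it lies outside the 0 to 127 range.

```cpp
#include <cassert>
#include <limits>

// Returns the value of c as a non-negative integer, whether or not plain char is signed.
int char_code(char c)
{
    return static_cast<unsigned char>(c);
}

// Reports the implementation's choice for plain char.
bool plain_char_is_signed()
{
    return std::numeric_limits<char>::is_signed;
}

void check_char_codes()
{
    assert(char_code('a') >= 0);

    char c = static_cast<char>(200);    // outside 0..127, so signedness matters
    assert(char_code(c) == 200);        // assumes an 8-bit char

    if (!plain_char_is_signed())
        assert(int(c) == 200);          // with an unsigned plain char, int(c) is already 200
}
```

With a signed plain char, int(c) would typically be negative here, which is exactly the portability problem the text warns about; char_code gives the same answer either way.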
A type wchar_t is provided to hold characters of a larger character set such as Unicode. It is a
distinct type. The size of wchar_t is implementation-defined and large enough to hold the largest
character set supported by the implementation’s locale (see §21.7, §C.3.3). The strange name is a
leftover from C. In C, wchar_t is a typedef (§4.9.7) rather than a built-in type. The suffix _t was
added to distinguish standard typedefs.
Note that the character types are integral types (§4.1.1) so that arithmetic and logical operations
(§6.2) apply.
4.3.1 Character Literals [dcl.char.lit]
A character literal, often called a character constant, is a character enclosed in single quotes, for
example, 'a' and '0'. The type of a character literal is char. Such character literals are really
symbolic constants for the integer value of the characters in the character set of the machine on
which the C++ program is to run. For example, if you are running on a machine using the ASCII
character set, the value of '0' is 48. The use of character literals rather than decimal notation
makes programs more portable. A few characters also have standard names that use the backslash \
as an escape character. For example, \n is a newline and \t is a horizontal tab. See §C.3.2 for
details about escape characters.
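To illustrate why character literals are more portable than their decimal values, here is a minimal sketch that never mentions 48 or any other character code. It relies on the C++ guarantee that the digits '0' through '9' have consecutive values; the function names digit_value, count_newlines, and check_character_literals are hypothetical.

```cpp
#include <cassert>

// Converts a decimal digit character to its numeric value; returns -1 for a non-digit.
// Works in any character set because '0'..'9' are guaranteed to be consecutive.
int digit_value(char c)
{
    if (c < '0' || '9' < c) return -1;
    return c - '0';
}

// Counts the newline characters in a zero-terminated string, using the escape sequence '\n'.
int count_newlines(const char* s)
{
    int n = 0;
    for (; *s; ++s)
        if (*s == '\n') ++n;
    return n;
}

void check_character_literals()
{
    assert(digit_value('0') == 0);
    assert(digit_value('9') == 9);
    assert(digit_value('a') == -1);
    assert(count_newlines("one\ntwo\tthree\n") == 2);    // \t is a tab, not a newline
}
```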
Wide character literals are of the form L'ab', where the number of characters between the
quotes and their meanings is implementation-defined to match the wchar_t type. A wide character
literal has type wchar_t.
4.4 Integer Types [dcl.int]
Like char, each integer type comes in three forms: ‘‘plain’’ int, signed int, and unsigned int. In
addition, integers come in three sizes: short int, ‘‘plain’’ int, and long int. A long int can be
referred to as plain long. Similarly, short is a synonym for short int, unsigned for unsigned int,
and signed for signed int.
The unsigned integer types are ideal for uses that treat storage as a bit array. Using an
unsigned instead of an int to gain one more bit to represent positive integers is almost never a good
idea. Attempts to ensure that some values are positive by declaring variables unsigned will typically be defeated by the implicit conversion rules (§C.6.1, §C.6.2.1).
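The following sketch shows one way the implicit conversion rules defeat such an attempt. The function as_size is hypothetical: it hopes to rule out negative arguments through its parameter type, yet a negative int is accepted and silently becomes a very large value.

```cpp
#include <cassert>
#include <limits>

// Hypothetical function that tries to exclude negative sizes by taking an unsigned parameter.
unsigned int as_size(unsigned int n)
{
    return n;
}

void check_unsigned_pitfall()
{
    int negative = -1;
    unsigned int result = as_size(negative);    // compiles: -1 is implicitly converted to unsigned

    // The conversion wraps around, so the callee sees the largest unsigned value instead of -1.
    assert(result == std::numeric_limits<unsigned int>::max());
    assert(result > 1000u);
}
```

The declaration did not keep the negative value out; it only disguised it. An explicit range check on an int parameter is usually the more honest design.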
Unlike plain chars, plain ints are always signed. The signed int types are simply more explicit
synonyms for their plain int counterparts.
4.4.1 Integer Literals [dcl.int.lit]
Integer literals come in four guises: decimal, octal, hexadecimal, and character literals. Decimal literals are the most commonly used and look as you would expect them to:
    0    1234    976    12345678901234567890
The compiler ought to warn about literals that are too long to represent.
A literal starting with zero followed by x (0x) is a hexadecimal (base 16) number. A literal
starting with zero followed by a digit is an octal (base 8) number. For example:
    decimal:        0      2      63      83
    octal:          00     02     077     0123
    hexadecimal:    0x0    0x2    0x3f    0x53
The letters a, b, c, d, e, and f, or their uppercase equivalents, are used to represent 10, 11, 12, 13,
14, and 15, respectively. Octal and hexadecimal notations are most useful for expressing bit patterns. Using these notations to express genuine numbers can lead to surprises. For example, on a
machine on which an int is represented as a two’s complement 16-bit integer, 0xffff is the negative
decimal number -1. Had more bits been used to represent an integer, it would have been 65535.
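The equivalences in the table can be verified directly, since the three notations are just different spellings of the same values. This is a small sketch; the names check_literal_bases and set_bit are hypothetical.

```cpp
#include <cassert>

void check_literal_bases()
{
    assert(63 == 077);       // octal: 7*8 + 7
    assert(63 == 0x3f);      // hexadecimal: 3*16 + 15
    assert(83 == 0123);      // octal: 1*64 + 2*8 + 3
    assert(83 == 0x53);      // hexadecimal: 5*16 + 3
    assert(0123 != 123);     // a leading zero changes the meaning of the literal
    assert(0x3F == 0x3f);    // uppercase and lowercase hexadecimal digits are equivalent
}

// Sets bit number n (counting from 0) in a bit pattern.
// Hexadecimal literals suit such patterns because each digit covers exactly four bits.
unsigned int set_bit(unsigned int pattern, int n)
{
    return pattern | (1u << n);
}
```

For example, set_bit(0xf0, 0) yields 0xf1: the upper four bits are untouched and the lowest bit is turned on.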
The suffix U can be used to write explicitly unsigned literals. Similarly, the suffix L can be
used to write explicitly long literals. For example, 3 is an int, 3U is an unsigned int, and 3L is a
long int. If no suffix is provided, the compiler gives an integer literal a suitable type based on its
value and the implementation’s integer sizes (§C.4).
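One way to observe the type a suffix gives a literal is to let overload resolution report it. In this sketch, type_name and check_suffixes are hypothetical names; the combined suffix UL (unsigned long) is not mentioned in the text above but follows the same scheme.

```cpp
#include <cassert>
#include <cstring>

// One overload per type; overload resolution picks the exact match for each literal.
const char* type_name(int)           { return "int"; }
const char* type_name(unsigned int)  { return "unsigned int"; }
const char* type_name(long)          { return "long"; }
const char* type_name(unsigned long) { return "unsigned long"; }

void check_suffixes()
{
    assert(std::strcmp(type_name(3),   "int") == 0);
    assert(std::strcmp(type_name(3U),  "unsigned int") == 0);
    assert(std::strcmp(type_name(3L),  "long") == 0);
    assert(std::strcmp(type_name(3UL), "unsigned long") == 0);    // suffixes can be combined
}
```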
It is a good idea to limit the use of nonobvious constants to a few well-commented const (§5.4)
or enumerator (§4.8) initializers.
4.5 Floating-Point Types [dcl.float]
The floating-point types represent floating-point numbers. Like integers, floating-point types come
in three sizes: float (single-precision), double (double-precision), and long double (extended-precision).
The exact meaning of single-, double-, and extended-precision is implementation-defined.
Choosing the right precision for a problem where the choice matters requires significant understanding of floating-point computation. If you don’t have that understanding, get advice, take the
time to learn, or use double and hope for the best.
4.5.1 Floating-Point Literals [dcl.fp.lit]
By default, a floating-point literal is of type double. Again, a compiler ought to warn about
floating-point literals that are too large to be represented. Here are some floating-point literals:
    1.23    .23    0.23    1.    1.0    1.2e10    1.23e-15
Note that a space cannot occur in the middle of a floating-point literal. For example, 65.43 e-21
is not a floating-point literal but rather four separate lexical tokens (causing a syntax error):
    65.43    e    -    21
If you want a floating-point literal of type float, you can define one using the suffix f or F:
    3.14159265f    2.0f    2.997925F
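The default type of a floating-point literal, and the effect of a suffix, can be observed through overloading. This is a sketch with hypothetical names (fp_name, check_fp_literals); the suffix L for long double is not mentioned in the text above and is included here as a known part of the language.

```cpp
#include <cassert>
#include <cstring>

// One overload per floating-point type; the literal's type selects the overload.
const char* fp_name(float)       { return "float"; }
const char* fp_name(double)      { return "double"; }
const char* fp_name(long double) { return "long double"; }

void check_fp_literals()
{
    assert(std::strcmp(fp_name(1.23),   "double") == 0);    // no suffix: double
    assert(std::strcmp(fp_name(1.),     "double") == 0);
    assert(std::strcmp(fp_name(.23),    "double") == 0);
    assert(std::strcmp(fp_name(1.2e10), "double") == 0);
    assert(std::strcmp(fp_name(2.0f),   "float") == 0);     // suffix f or F: float
    assert(std::strcmp(fp_name(2.997925F), "float") == 0);
    assert(std::strcmp(fp_name(1.23L),  "long double") == 0);
}
```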
4.6 Sizes [dcl.size]
Some of the aspects of C++’s fundamental types, such as the size of an int, are implementation-defined (§C.2). I point out these dependencies and often recommend avoiding them or taking steps
to minimize their impact. Why should you bother? People who program on a variety of systems or
use a variety of compilers care a lot because if they don’t, they are forced to waste time finding and
fixing obscure bugs. People who claim they don’t care about portability usually do so because they
use only a single system and feel they can afford the attitude that ‘‘the language is what my compiler implements.’’ This is a narrow and shortsighted view. If your program is a success, it is
likely to be ported, so someone will have to find and fix problems related to implementation-dependent features. In addition, programs often need to be compiled with other compilers for the
same system, and even a future release of your favorite compiler may do some things differently
from the current one. It is far easier to know and limit the impact of implementation dependencies
when a program is written than to try to untangle the mess afterwards.
It is relatively easy to limit the impact of implementation-dependent language features. Limiting the impact of system-dependent library facilities is far harder. Using standard library facilities
wherever feasible is one approach.
The reason for providing more than one integer type, more than one unsigned type, and more
than one floating-point type is to allow the programmer to take advantage of hardware characteristics. On many machines, there are significant differences in memory requirements, memory access
times, and computation speed between the different varieties of fundamental types. If you know a
machine, it is usually easy to choose, for example, the appropriate integer type for a particular variable. Writing truly portable low-level code is harder.
Sizes of C++ objects are expressed in terms of multiples of the size of a char, so by definition
the size of a char is 1. The size of an object or type can be obtained using the sizeof operator
(§6.2). This is what is guaranteed about sizes of fundamental types:
1 ≡ sizeof(char) ≤ sizeof(short) ≤ sizeof(int) ≤ sizeof(long)
1 ≤ sizeof(bool) ≤ sizeof(long)
sizeof(char) ≤ sizeof(wchar_t) ≤ sizeof(long)
sizeof(float) ≤ sizeof(double) ≤ sizeof(long double)
sizeof(N) ≡ sizeof(signed N) ≡ sizeof(unsigned N)
where N can be char, short int, int, or long int. In addition, it is guaranteed that a char has at least
8 bits, a short at least 16 bits, and a long at least 32 bits. A char can hold a character of the
machine’s character set.
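The guarantees listed above can be written down as assertions that hold on any conforming implementation. This is a minimal sketch; check_size_guarantees and bits_in are hypothetical names, and CHAR_BIT from <climits> gives the number of bits in a char.

```cpp
#include <cassert>
#include <climits>

void check_size_guarantees()
{
    assert(sizeof(char) == 1);
    assert(sizeof(char) <= sizeof(short));
    assert(sizeof(short) <= sizeof(int));
    assert(sizeof(int) <= sizeof(long));
    assert(sizeof(bool) <= sizeof(long));
    assert(sizeof(wchar_t) <= sizeof(long));
    assert(sizeof(float) <= sizeof(double));
    assert(sizeof(double) <= sizeof(long double));
    assert(sizeof(int) == sizeof(signed int));
    assert(sizeof(int) == sizeof(unsigned int));
}

// Number of bits occupied by an object of type T.
template<class T> int bits_in()
{
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

void check_minimum_widths()
{
    assert(bits_in<char>() >= 8);
    assert(bits_in<short>() >= 16);
    assert(bits_in<long>() >= 32);
}
```

Nothing here says how large an int actually is; the assertions only pin down the ordering and the minimum widths, which is all portable code may rely on.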
Here is a graphical representation of a plausible set of fundamental types and a sample string:
[Figure: one box per object, drawn to scale, showing a plausible size for each type]
    char:        'a'
    bool:        1
    short:       756
    int:         100000000
    int*:        &c1
    double:      1234567e34
    char[14]:    Hello, world!\0
On the same scale (.2 inch to a byte), a megabyte of memory would stretch about three miles (five
km) to the right.
The char type is supposed to be chosen by the implementation to be the most suitable type for
holding and manipulating characters on a given computer; it is typically an 8-bit byte. Similarly,
the int type is supposed to be chosen to be the most suitable for holding and manipulating integers
on a given computer; it is typically a 4-byte (32-bit) word. It is unwise to assume more. For example, there are machines with 32-bit chars.
When needed, implementation-dependent aspects of an implementation can be found in
<limits> (§22.2). For example:
    #include <iostream>
    #include <limits>
    int main()
    {
        std::cout << "largest float == " << std::numeric_limits<float>::max()
                  << ", char is signed == " << std::numeric_limits<char>::is_signed << '\n';
    }
The fundamental types can be mixed freely in assignments and expressions. Wherever possible,
values are converted so as not to lose information (§C.6).
If a value v can be represented exactly in a variable of type T, a conversion of v to T is value-preserving and no problem. The cases where conversions are not value-preserving are best avoided
(§C.6.2.6).
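Whether a particular conversion is value-preserving can be tested before it is performed. The sketch below uses two hypothetical helpers, fits_in_char and fits_in_int. It assumes that char is narrower than int and that the double passed to fits_in_int lies within the range of int (converting an out-of-range floating-point value to int is undefined).

```cpp
#include <cassert>
#include <limits>

// True if the int value v can be represented exactly in a plain char.
bool fits_in_char(int v)
{
    return std::numeric_limits<char>::min() <= v && v <= std::numeric_limits<char>::max();
}

// True if converting d to int loses nothing, that is, if d has no fractional part.
// Assumption: d is within the range of int.
bool fits_in_int(double d)
{
    int i = static_cast<int>(d);    // truncates toward zero
    return i == d;
}

void check_value_preservation()
{
    assert(fits_in_char(65));         // small values are preserved
    assert(!fits_in_char(100000));    // too large for a char on typical implementations
    assert(fits_in_int(2.0));
    assert(!fits_in_int(2.7));        // the .7 would be lost
}
```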
You need to understand implicit conversion in some detail in order to complete a major project
and especially to understand real code written by others. However, such understanding is not
required to read the following chapters.
4.7 Void [dcl.void]
The type void is syntactically a fundamental type. It can, however, be used only as part of a more
complicated type; there are no objects of type void. It is used either to specify that a function does
not return a value or as the base type for pointers to objects of unknown type. For example:
    void x;       // error: there are no void objects
    void f();     // function f does not return a value (§7.3)
    void* pv;     // pointer to object of unknown type (§5.6)
When declaring a function, you must specify the type of the value returned. Logically, you would
expect to be able to indicate that a function didn’t return a value by omitting the return type. However, that would make the grammar (Appendix A) less regular and clash with C usage. Consequently, void is used as a ‘‘pseudo return type’’ to indicate that a function doesn’t return a value.
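Both uses of void can be seen in a few lines. In this sketch the names call_count, record_call, read_int, and check_void_uses are hypothetical; the global counter exists only to make the effect of the void function observable.

```cpp
#include <cassert>

int call_count = 0;    // hypothetical counter, used only to observe record_call's effect

// A function that does not return a value: its only effect is the change to call_count.
void record_call()
{
    ++call_count;
}

// A void* can point to an object of any type, but it must be converted back before use.
int read_int(void* pv)
{
    int* pi = static_cast<int*>(pv);    // the caller must guarantee that pv points to an int
    return *pi;
}

void check_void_uses()
{
    int before = call_count;
    record_call();
    assert(call_count == before + 1);

    int n = 42;
    void* pv = &n;    // any object pointer converts implicitly to void*
    assert(read_int(pv) == 42);
}
```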
4.8 Enumerations [dcl.enum]
An enumeration is a type that can hold a set of values specified by the user. Once defined, an enumeration is used very much like an integer type.
Named integer constants can be defined as members of an enumeration. For example,
    enum { ASM, AUTO, BREAK };
defines three integer constants, called enumerators, and assigns values to them. By default, enumerator values are assigned increasing from 0, so ASM==0, AUTO==1, and BREAK==2. An enumeration can be named. For example:
    enum keyword { ASM, AUTO, BREAK };
Each enumeration is a distinct type. The type of an enumerator is its enumeration. For example,
AUTO is of type keyword.
Declaring a variable keyword instead of plain int can give both the user and the compiler a hint
as to the intended use. For example:
    void f(keyword key)
    {
        switch (key) {
        case ASM:
            // do something
            break;
        case BREAK:
            // do something
            break;
        }
    }
A compiler can issue a warning because only two out of three keyword values are handled.
An enumerator can be initialized by a constant-expression (§C.5) of integral type (§4.1.1). The
range of an enumeration holds all the enumeration’s enumerator values rounded up to the nearest
larger binary power minus 1. The range goes down to 0 if the smallest enumerator is non-negative
and to the nearest lesser negative binary power if the smallest enumerator is negative. This defines
the smallest bit-field capable of holding the enumerator values. For example:
    enum e1 { dark, light };                  // range 0:1
    enum e2 { a = 3, b = 9 };                 // range 0:15
    enum e3 { min = -10, max = 1000000 };     // range -1048576:1048575
A value of integral type may be explicitly converted to an enumeration type. The result of such a
conversion is undefined unless the value is within the range of the enumeration. For example:
    enum flag { x=1, y=2, z=4, e=8 };    // range 0:15
    flag f1 = 5;            // type error: 5 is not of type flag
    flag f2 = flag(5);      // ok: flag(5) is of type flag and within the range of flag
    flag f3 = flag(z|e);    // ok: flag(12) is of type flag and within the range of flag
    flag f4 = flag(99);     // undefined: 99 is not within the range of flag
The last assignment shows why there is no implicit conversion from an integer to an enumeration;
most integer values do not have a representation in a particular enumeration.
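The range rule can be worked through in code. The sketch below computes the range from the smallest and largest enumerator, assuming a two’s complement representation, so that a negative smallest enumerator makes the range run from -(high+1) to high, as in the e3 example. The names enum_range, magnitude, range_of, in_range, and check_enum_ranges are hypothetical, and the loop would overflow for values close to the largest long.

```cpp
#include <cassert>

struct enum_range { long low; long high; };    // hypothetical helper type

long magnitude(long v) { return v < 0 ? -v : v; }

// Computes the range of an enumeration from its smallest and largest enumerator.
// Assumption: two's complement, so a bit-field holding -n must also be able to hold n-1.
enum_range range_of(long smallest, long largest)
{
    long need = magnitude(largest);
    if (magnitude(smallest) - 1 > need) need = magnitude(smallest) - 1;

    long high = 0;
    while (high < need) high = 2 * high + 1;    // 0, 1, 3, 7, 15, ...: binary powers minus 1

    enum_range r;
    r.low = (smallest < 0) ? -high - 1 : 0;
    r.high = high;
    return r;
}

// True if v can safely be converted to an enumeration with range r.
bool in_range(enum_range r, long v) { return r.low <= v && v <= r.high; }

void check_enum_ranges()
{
    enum_range e2 = range_of(3, 9);
    assert(e2.low == 0 && e2.high == 15);

    enum_range e3 = range_of(-10, 1000000);
    assert(e3.low == -1048576 && e3.high == 1048575);

    enum_range flag_range = range_of(1, 8);
    assert(in_range(flag_range, 5));      // flag(5) is well defined
    assert(!in_range(flag_range, 99));    // flag(99) is not
}
```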
The notion of a range of values for an enumeration differs from the enumeration notion in the
Pascal family of languages. However, bit-manipulation examples that require values outside the set
of enumerators to be well-defined have a long history in C and C++.
The sizeof an enumeration is the sizeof some integral type that can hold its range and not larger
than sizeof(int), unless an enumerator cannot be represented as an int or as an unsigned int. For
example, sizeof(e1) could be 1 or maybe 4 but not 8 on a machine where sizeof(int)==4.
By default, enumerations are converted to integers for arithmetic operations (§6.2). An enumeration is a user-defined type, so users can define their own operations, such as ++ and << for an enumeration (§11.2.3).
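As a sketch of a user-defined operation on an enumeration, here is a prefix ++ that steps through the days of the week and wraps around. The enumeration Day and the function check_day_increment are hypothetical, chosen for illustration; the point is that d+1 is an int and must be converted back explicitly.

```cpp
#include <cassert>

enum Day { sun, mon, tue, wed, thu, fri, sat };    // hypothetical enumeration

// Prefix increment for Day that wraps around from sat to sun.
Day& operator++(Day& d)
{
    d = (d == sat) ? sun : Day(d + 1);    // d+1 is an int; Day(...) converts it back
    return d;
}

void check_day_increment()
{
    Day d = fri;
    ++d;
    assert(d == sat);
    ++d;
    assert(d == sun);    // wrapped around instead of leaving the set of enumerators

    int n = tue + 1;     // without a user-defined operator, arithmetic is done on ints
    assert(n == 3);
}
```

Without the wrap-around test, Day(d + 1) applied to sat would yield the value 7, which is inside the range of Day (0:7) but is not one of its enumerators.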
4.9 Declarations [dcl.dcl]
Before a name (identifier) can be used in a C++ program, it must be declared. That is, its type must
be specified to inform the compiler to what kind of entity the name refers. Here are some examples
illustrating the diversity of declarations:
    char ch;
    string s;
    int count = 1;
    const double pi = 3.1415926535897932385;
    extern int error_number;
    char* name = "Njal";
    char* season[] = { "spring", "summer", "fall", "winter" };
    struct Date { int d, m, y; };
    int day(Date* p) { return p->d; }
    double sqrt(double);
    template<class T> T abs(T a) { return a<0 ? -a : a; }
    typedef complex<short> Point;
    struct User;
    enum Beer { Carlsberg, Tuborg, Thor };
    namespace NS { int a; }
As can be seen from these examples, a declaration can do more than simply associate a type with a
name. Most of these declarations are also definitions; that is, they also define an entity for the
name to which they refer. For ch, that entity is the appropriate amount of memory to be used as a
variable; that memory will be allocated. For day, it is the specified function. For the constant pi,
it is the value 3.1415926535897932385. For Date, that entity is a new type. For Point, it is the
type complex<short> so that Point becomes a synonym for complex<short>. Of the declarations
above, only
    double sqrt(double);
    extern int error_number;
    struct User;
are not also definitions; that is, the entity they refer to must be defined elsewhere. The code (body)
for the function sqrt must be specified by some other declaration, the memory for the int variable
error_number must be allocated by some other declaration of error_number, and some other
declaration of the type User must define what that type looks like. For example:
    double sqrt(double d) { /* ... */ }
    int error_number = 1;
    struct User { /* ... */ };
There must always be exactly one definition for each name in a C++ program (for the effects of
#include, see §9.2.3). However, there can be many declarations. All declarations of an entity must
agree on the type of the entity referred to. So, this fragment has two errors:
    int count;
    int count;                    // error: redefinition
    extern int error_number;
    extern short error_number;    // error: type mismatch
and this has none (for the use of extern see §9.2):
    extern int error_number;
    extern int error_number;
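The distinction between many declarations and one definition can be tried out in a single file. In this sketch, error_number follows the text, while the function half and the checker check_definitions are hypothetical names added for illustration.

```cpp
#include <cassert>

extern int error_number;    // declaration: says that an int named error_number exists
extern int error_number;    // a repeated declaration is fine as long as the types agree
double half(double);        // declaration of a hypothetical function
double half(double);        // again a harmless repetition

int error_number = 1;                       // the one definition: memory is allocated here
double half(double d) { return d / 2; }     // the one definition: the body is given here

void check_definitions()
{
    assert(error_number == 1);
    assert(half(3.0) == 1.5);
}
```

Adding a second line `int error_number = 1;` or a second body for half would be rejected as a redefinition.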
Some definitions specify a ‘‘value’’ for the entities they define. For example:
    struct Date { int d, m, y; };
    typedef complex<short> Point;
    int day(Date* p) { return p->d; }
    const double pi = 3.1415926535897932385;
For types, templates, functions, and constants, the ‘‘value’’ is permanent. For nonconstant data
types, the initial value may be changed later. For example:
    void f()
    {
        int count = 1;
        char* name = "Bjarne";
        // ...
        count = 2;
        name = "Marian";
    }
Of the definitions, only
    char ch;
    string s;
do not specify values. See §4.9.5 and §10.4.2 for explanations of how and when a variable is
assigned a default value. Any declaration that specifies a value is a definition.
4.9.1 The Structure of a Declaration [dcl.parts]
A declaration consists of four parts: an optional ‘‘specifier,’’ a base type, a declarator, and an
optional initializer. Except for function and namespace definitions, a declaration is terminated by a
semicolon. For example:
    char* kings[] = { "Antigonus", "Seleucus", "Ptolemy" };
Here, the base type is char, the declarator is *kings[], and the initializer is ={...}.
A specifier is an initial keyword, such as virtual (§2.5.5, §12.2.6) and extern (§9.2), that specifies some non-type attribute of what is being declared.
A declarator is composed of a name and optionally some declarator operators. The most common declarator operators are (§A.7.1):
    *         pointer             prefix
    *const    constant pointer    prefix
    &         reference           prefix
    []        array               postfix
    ()        function            postfix
Their use would be simple if they were all either prefix or postfix. However, *, [], and () were
designed to mirror their use in expressions (§6.2). Thus, * is prefix and [] and () are postfix.
The postfix declarator operators bind tighter than the prefix ones. Consequently, *kings[] is a
vector of pointers to something, and we have to use parentheses to express types such as ‘‘pointer
to function;’’ see examples in §5.1. For full details, see the grammar in Appendix A.
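The binding rule can be illustrated with an array of pointers and a pointer to function. In this sketch the names twice, pf, fp, and check_declarators are hypothetical. The array is declared with const char* rather than char*, an assumption made because current compilers reject the conversion of a string literal to char*.

```cpp
#include <cassert>
#include <cstring>

// [] binds tighter than *, so kings is an array of pointers (here, to const char).
const char* kings[] = { "Antigonus", "Seleucus", "Ptolemy" };

int twice(int n) { return 2 * n; }    // hypothetical function to point at

int (*pf)(int) = &twice;    // parentheses needed: pf is a pointer to a function taking an int
int* fp(int);               // without parentheses: fp is a function returning an int*

void check_declarators()
{
    assert(sizeof(kings) / sizeof(kings[0]) == 3);    // three elements, each a pointer
    assert(std::strcmp(kings[1], "Seleucus") == 0);
    assert(pf(21) == 42);                             // calling through the pointer
}
```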
Note that the type cannot be left out of a declaration. For example:
    const c = 7;                                    // error: no type
    gt(int a, int b) { return (a>b) ? a : b; }      // error: no return type
    unsigned ui;                                    // ok: ‘unsigned’ is the type ‘unsigned int’
    long li;                                        // ok: ‘long’ is the type ‘long int’
In this, standard C++ differs from earlier versions of C and C++ that allowed the first two examples
by considering int to be the type when none were specified (§B.2). This ‘‘implicit int’’ rule was a
source of subtle errors and confusion.
4.9.2 Declaring Multiple Names [dcl.multi]
It is possible to declare several names in a single declaration. The declaration simply contains a list
of comma-separated declarators. For example, we can declare two integers like this:
    int x, y;    // int x; int y;
Note that operators apply to individual names only, and not to any subsequent names in the same
declaration. For example:
    int* p, y;          // int* p; int y; NOT int* y;
    int x, *q;          // int x; int* q;
    int v[10], *pv;     // int v[10]; int* pv;
Such constructs make a program less readable and should be avoided.
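That the * applies to a single name can be confirmed by letting overload resolution tell pointers from plain ints. The names is_pointer and check_multiple_names are hypothetical, introduced only for this sketch.

```cpp
#include <cassert>

// Overloads that report whether the argument is an int* or a plain int.
bool is_pointer(int*) { return true; }
bool is_pointer(int)  { return false; }

void check_multiple_names()
{
    int* p = 0, y = 0;    // p is an int*; y is a plain int, not a pointer
    int x = 0, *q = 0;    // x is an int; q is an int*

    assert(is_pointer(p));
    assert(!is_pointer(y));
    assert(!is_pointer(x));
    assert(is_pointer(q));
}
```

Writing one declaration per line (int* p; int y;) removes the ambiguity for the reader as well as for the compiler.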
4.9.3 Names [dcl.name]
A name (identifier) consists of a sequence of letters and digits. The first character must be a letter.
The underscore character _ is considered a letter. C++ imposes no limit on the number of characters in a name. However, some parts of an implementation are not under the control of the compiler writer (in particular, the linker), and those parts, unfortunately, sometimes do impose limits.
Some run-time environments also make it necessary to extend or restrict the set of characters
accepted in an identifier. Extensions (e.g., allowing the character $ in a name) yield nonportable
programs. A C++ keyword (Appendix A), such as new and int, cannot be used as a name of a
user-defined entity. Examples of names are:
    hello      this_is_a_most_unusually_long_name
    DEFINED    foO     bAr      u_name    HorseSense
    var0       var1    CLASS    _class    ___
Examples of character sequences that cannot be used as identifiers are:
    012        a fool     $sys     class    3var
    pay.due    foo~bar    .name    if
Names starting with an underscore are reserved for special facilities in the implementation and the
run-time environment, so such names should not be used in application programs.
When reading a program, the compiler always looks for the longest string of characters that
could make up a name. Hence, var10 is a single name, not the name var followed by the number
10. Also, elseif is a single name, not the keyword else followed by the keyword if.
Uppercase and lowercase letters are distinct, so Count and count are different names, but it is
unwise to choose names that differ only by capitalization. In general, it is best to avoid names that
differ only in subtle ways. For example, the uppercase o (O) and zero (0) can be hard to tell apart,
as can the lowercase L (l) and one (1). Consequently, l0, lO, l1, and ll are poor choices for identifier names.
Names from a large scope ought to be relatively long and reasonably obvious, such as
vector, Window_with_border, and Department_number. However, code is clearer if names used
only in a small scope are short and conventional, such as x, i, and p. Classes (Chapter 10) and
namespaces (§8.2) can be used to keep scopes small. It is often useful to keep frequently used
names relatively short and reserve really long names for infrequently used entities. Choose names
to reflect the meaning of an entity rather than its implementation. For example, phone_book is better than
number_list even if the phone numbers happen to be stored in a list (§3.7). Choosing good
names is an art.
Try to maintain a consistent naming style. For example, capitalize nonstandard-library user-defined types and start nontypes with a lowercase letter (for example, Shape and current_token).
Also, use all capitals for macros (if you must use macros; for example, HACK) and use underscores
to separate words in an identifier. However, consistency is hard to achieve because programs are
typically composed of fragments from different sources and several different reasonable styles are
in use. Be consistent in your use of abbreviations and acronyms.
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
4.9.4 Scope [dcl.scope]
A declaration introduces a name into a scope; that is, a name can be used only in a specific part of
the program text. For a name declared in a function (often called a local name), that scope extends
from its point of declaration to the end of the block in which its declaration occurs. A block is a
section of code delimited by a { } pair.
A name is called global if it is defined outside any function, class (Chapter 10), or namespace
(§8.2). The scope of a global name extends from the point of declaration to the end of the file in
which its declaration occurs. A declaration of a name in a block can hide a declaration in an
enclosing block or a global name. That is, a name can be redefined to refer to a different entity
within a block. After exit from the block, the name resumes its previous meaning. For example:
    int x;            // global x

    void f()
    {
        int x;        // local x hides global x
        x = 1;        // assign to local x
        {
            int x;    // hides first local x
            x = 2;    // assign to second local x
        }
        x = 3;        // assign to first local x
    }

    int* p = &x;      // take address of global x
Hiding names is unavoidable when writing large programs. However, a human reader can easily
fail to notice that a name has been hidden. Because such errors are relatively rare, they can be very
difficult to find. Consequently, name hiding should be minimized. Using names such as i and x for
global variables or for local variables in a large function is asking for trouble.
A hidden global name can be referred to using the scope resolution operator ::. For example:
    int x;

    void f2()
    {
        int x = 1;    // hide global x
        ::x = 2;      // assign to global x
        x = 2;        // assign to local x
        // ...
    }
There is no way to use a hidden local name.
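The hiding rules above can be checked with a small sketch. The function name hide_and_seek and the particular values are made up for illustration; only the scope rules themselves come from the text.

```cpp
#include <cassert>

int x = 1;              // global x

// Returns the value of the global x after the assignments below.
int hide_and_seek()
{
    int x = 10;         // local x hides global x
    ::x = 2;            // assign to global x through the scope resolution operator
    {
        int x = 20;     // hides the first local x, which cannot be named in this block
        x = 21;         // assign to the innermost x only
        (void)x;        // silence "unused" warnings; no other effect
    }
    assert(x == 10);    // after the inner block, x means the first local x again
    return ::x;
}
```

Calling hide_and_seek() leaves the global x with the value 2, while the two local variables never affect each other.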
The scope of a name starts at its point of declaration; that is, after the complete declarator and
before the initializer. This implies that a name can be used even to specify its own initial value.
For example:
    int x;

    void f3()
    {
        int x = x;    // perverse: initialize x with its own (uninitialized) value
    }
This is not illegal, just silly. A good compiler will warn if a variable is used before it has been set
(see also §5.9[9]).
It is possible to use a single name to refer to two different objects in a block without using the
:: operator. For example:
    int x = 11;

    void f4()            // perverse:
    {
        int y = x;       // use global x: y = 11
        int x = 22;
        y = x;           // use local x: y = 22
    }
Function argument names are considered declared in the outermost block of a function, so
    void f5(int x)
    {
        int x;    // error
    }
is an error because x is defined twice in the same scope. Having this be an error allows a not
uncommon, subtle mistake to be caught.
4.9.5 Initialization [dcl.init]
If an initializer is specified for an object, that initializer determines the initial value of an object. If
no initializer is specified, a global (§4.9.4), namespace (§8.2), or local static object (§7.1.2, §10.2.4)
(collectively called static objects) is initialized to 0 of the appropriate type. For example:
    int a;       // means ‘‘int a = 0;’’
    double d;    // means ‘‘double d = 0.0;’’
Local variables (sometimes called automatic objects) and objects created on the free store (sometimes called dynamic objects or heap objects) are not initialized by default. For example:
    void f()
    {
        int x;    // x does not have a well-defined value
        // ...
    }
Members of arrays and structures are default initialized or not depending on whether the array or
structure is static. User-defined types may have default initialization defined (§10.4.2).
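As a rough illustration of default initialization for static objects, consider the sketch below. All names in it (global_count, global_ratio, global_table, next_id, statics_start_at_zero) are invented for the example. It deliberately never reads an uninitialized local variable, because doing so has no well-defined result.

```cpp
#include <cassert>

int global_count;            // static object: initialized to 0
double global_ratio;         // static object: initialized to 0.0
int global_table[4];         // static array: every element is initialized to 0

int next_id()
{
    static int id;           // local static: initialized to 0, once only
    return ++id;
}

bool statics_start_at_zero()
{
    return global_count == 0 && global_ratio == 0.0
        && global_table[0] == 0 && global_table[3] == 0;
}
```

The first call of next_id() returns 1 and the second returns 2, which shows that id started at 0 and keeps its value between calls.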
More complicated objects require more than one value as an initializer. This is handled by initializer lists delimited by { and } for C-style initialization of arrays (§5.2.1) and structures (§5.7).
For user-defined types with constructors, function-style argument lists are used (§2.5.2, §10.2.3).
Note that an empty pair of parentheses () in a declaration always means ‘‘function’’ (§7.1).
For example:
    int a[] = { 1, 2 };    // array initializer
    Point z(1,2);          // function-style initializer (initialization by constructor)
    int f();               // function declaration
4.9.6 Objects and Lvalues [dcl.objects]
We can allocate and use ‘‘variables’’ that do not have names, and it is possible to assign to
strange-looking expressions (e.g., *p[a+10]=7). Consequently, there is a need for a name for
‘‘something in memory.’’ This is the simplest and most fundamental notion of an object. That is,
an object is a contiguous region of storage; an lvalue is an expression that refers to an object. The
word lvalue was originally coined to mean ‘‘something that can be on the left-hand side of an
assignment.’’ However, not every lvalue may be used on the left-hand side of an assignment; an
lvalue can refer to a constant (§5.5). An lvalue that has not been declared const is often called a
modifiable lvalue. This simple and low-level notion of an object should not be confused with the
notions of class object and object of polymorphic type (§15.4.3).
Unless the programmer specifies otherwise (§7.1.2, §10.4.8), an object declared in a function is
created when its definition is encountered and destroyed when its name goes out of scope (§10.4.4).
Such objects are called automatic objects. Objects declared in global or namespace scope and statics declared in functions or classes are created and initialized once (only) and ‘‘live’’ until the program terminates (§10.4.9). Such objects are called static objects. Array elements and nonstatic
structure or class members have their lifetimes determined by the object of which they are part.
Using the new and delete operators, you can create objects whose lifetimes are controlled
directly (§6.2.6).
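The difference between automatic objects and objects on the free store can be made visible by counting constructions and destructions. The Tracker type and the counter live below are hypothetical helpers introduced only for this sketch; constructors and destructors themselves are described in Chapter 10.

```cpp
#include <cassert>

int live = 0;                        // number of Tracker objects currently alive

struct Tracker {                     // hypothetical helper, not from the text
    Tracker()  { ++live; }
    Tracker(const Tracker&) { ++live; }
    ~Tracker() { --live; }
};

int lifetime_demo()
{
    assert(live == 0);
    {
        Tracker a;                   // automatic: created when its definition is encountered
        assert(live == 1);
    }                                // destroyed when its name goes out of scope
    assert(live == 0);

    Tracker* p = new Tracker;        // free store: lives until explicitly deleted
    assert(live == 1);
    delete p;                        // lifetime ended under direct control
    return live;
}
```

After lifetime_demo() returns, no Tracker remains, so the function yields 0.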
4.9.7 Typedef [dcl.typedef]
A declaration prefixed by the keyword typedef declares a new name for the type rather than a new
variable of the given type. For example:
    typedef char* Pchar;
    Pchar p1, p2;     // p1 and p2 are char*s
    char* p3 = p1;
A name defined like this, usually called a ‘‘typedef,’’ can be a convenient shorthand for a type
with an unwieldy name. For example, unsigned char is too long for really frequent use, so we
could define a synonym, uchar:
    typedef unsigned char uchar;
Another use of a typedef is to limit the direct reference to a type to one place. For example:
    typedef int int32;
    typedef short int16;
If we now use int32 wherever we need a potentially large integer, we can port our program to a
machine on which sizeof(int) is 2 by redefining the single occurrence of int in our code:
    typedef long int32;
For good and bad, typedefs are synonyms for other types rather than distinct types. Consequently,
typedefs mix freely with the types for which they are synonyms. People who would like to have
distinct types with identical semantics or identical representation should look at enumerations
(§4.8) or classes (Chapter 10).
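One way to see that a typedef is only a synonym is to look at overload resolution. In the following sketch, the function which() and the enumeration Celsius are invented for illustration; uchar is the synonym from the text.

```cpp
#include <cassert>

typedef unsigned char uchar;         // synonym, as in the text

int which(unsigned char) { return 1; }
int which(int)           { return 2; }
// int which(uchar) { return 3; }    // error if uncommented: same function as which(unsigned char)

enum Celsius { freezing = 0 };       // an enumeration is a distinct type

int which(Celsius)       { return 4; }
```

A uchar argument selects which(unsigned char), because the two names denote the same type, whereas a Celsius argument selects its own overload even though the value is represented as an integer.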
4.10 Advice [dcl.advice]
[1] Keep scopes small; §4.9.4.
[2] Don’t use the same name in both a scope and an enclosing scope; §4.9.4.
[3] Declare one name (only) per declaration; §4.9.2.
[4] Keep common and local names short, and keep uncommon and nonlocal names longer; §4.9.3.
[5] Avoid similar-looking names; §4.9.3.
[6] Maintain a consistent naming style; §4.9.3.
[7] Choose names carefully to reflect meaning rather than implementation; §4.9.3.
[8] Use a typedef to define a meaningful name for a built-in type in cases in which the built-in type used to represent a value might change; §4.9.7.
[9] Use typedefs to define synonyms for types; use enumerations and classes to define new types; §4.9.7.
[10] Remember that every declaration must specify a type (there is no ‘‘implicit int’’); §4.9.1.
[11] Avoid unnecessary assumptions about the numeric value of characters; §4.3.1, §C.6.2.1.
[12] Avoid unnecessary assumptions about the size of integers; §4.6.
[13] Avoid unnecessary assumptions about the range of floating-point types; §4.6.
[14] Prefer a plain int over a short int or a long int; §4.6.
[15] Prefer a double over a float or a long double; §4.5.
[16] Prefer plain char over signed char and unsigned char; §C.3.4.
[17] Avoid making unnecessary assumptions about the sizes of objects; §4.6.
[18] Avoid unsigned arithmetic; §4.4.
[19] View signed to unsigned and unsigned to signed conversions with suspicion; §C.6.2.6.
[20] View floating-point to integer conversions with suspicion; §C.6.2.6.
[21] View conversions to a smaller type, such as int to char, with suspicion; §C.6.2.6.
4.11 Exercises [dcl.exercises]
1. (∗2) Get the ‘‘Hello, world!’’ program (§3.2) to run. If that program doesn’t compile as written, look at §B.3.1.
2. (∗1) For each declaration in §4.9, do the following: If the declaration is not a definition, write a
definition for it. If the declaration is a definition, write a declaration for it that is not also a definition.
3. (∗1.5) Write a program that prints the sizes of the fundamental types, a few pointer types, and a
few enumerations of your choice. Use the sizeof operator.
4. (∗1.5) Write a program that prints out the letters 'a'..'z' and the digits '0'..'9' and their
integer values. Do the same for other printable characters. Do the same again but use hexadecimal notation.
5. (∗2) What, on your system, are the largest and the smallest values of the following types: char,
short, int, long, float, double, long double, and unsigned.
6. (∗1) What is the longest local name you can use in a C++ program on your system? What is the
longest external name you can use in a C++ program on your system? Are there any restrictions
on the characters you can use in a name?
7. (∗2) Draw a graph of the integer and fundamental types where a type points to another type if
all values of the first can be represented as values of the second on every standards-conforming
implementation. Draw the same graph for the types on your favorite implementation.
5 Pointers, Arrays, and Structures
The sublime and the ridiculous
are often so nearly related that
it is difficult to class them separately.
– Tom Paine
Pointers; zero; arrays; string literals; pointers into arrays; constants; pointers and constants; references; void*; data structures; advice; exercises.
5.1 Pointers [ptr.ptr]
For a type T, T* is the type ‘‘pointer to T.’’ That is, a variable of type T* can hold the address of
an object of type T. For example:
    char c = 'a';
    char* p = &c;    // p holds the address of c
or graphically:
    p: [ &c ] ------> c: [ 'a' ]
Unfortunately, pointers to arrays and pointers to functions need a more complicated notation:
    int* pi;             // pointer to int
    char** ppc;          // pointer to pointer to char
    int* ap[15];         // array of 15 pointers to ints
    int (*fp)(char*);    // pointer to function taking a char* argument; returns an int
    int* f(char*);       // function taking a char* argument; returns a pointer to int
See §4.9.1 for an explanation of the declaration syntax and Appendix A for the complete grammar.
The fundamental operation on a pointer is dereferencing, that is, referring to the object pointed
to by the pointer. This operation is also called indirection. The dereferencing operator is (prefix)
unary *. For example:
    char c = 'a';
    char* p = &c;    // p holds the address of c
    char c2 = *p;    // c2 == 'a'
The variable pointed to by p is c, and the value stored in c is 'a', so the value of *p assigned to c2
is 'a'.
It is possible to perform some arithmetic operations on pointers to array elements (§5.3). Pointers to functions can be extremely useful; they are discussed in §7.7.
The implementation of pointers is intended to map directly to the addressing mechanisms of the
machine on which the program runs. Most machines can address a byte. Those that can’t tend to
have hardware to extract bytes from words. On the other hand, few machines can directly address
an individual bit. Consequently, the smallest object that can be independently allocated and
pointed to using a built-in pointer type is a char. Note that a bool occupies at least as much space
as a char (§4.6). To store smaller values more compactly, you can use logical operations (§6.2.4)
or bit fields in structures (§C.8.1).
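To give an idea of what storing values compactly with logical operations can look like, here is a minimal sketch that keeps several yes/no values in one unsigned char. The flag names and the three helper functions are hypothetical; they are not taken from the text or from any library.

```cpp
#include <cassert>

// Hypothetical flag names; each occupies one bit of an unsigned char.
const unsigned char readable   = 1;      // 0000 0001
const unsigned char writable   = 2;      // 0000 0010
const unsigned char executable = 4;      // 0000 0100

// Turn on the bit(s) of f in flags.
unsigned char set_flag(unsigned char flags, unsigned char f)
{
    return static_cast<unsigned char>(flags | f);
}

// Turn off the bit(s) of f in flags.
unsigned char clear_flag(unsigned char flags, unsigned char f)
{
    return static_cast<unsigned char>(flags & ~f);
}

// Is any bit of f set in flags?
bool has_flag(unsigned char flags, unsigned char f)
{
    return (flags & f) != 0;
}
```

Up to eight such flags fit in the space of a single char on an implementation with 8-bit chars, whereas eight bool variables would need at least eight chars. The individual bits cannot be pointed to, however.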
5.1.1 Zero [ptr.zero]
Zero (0) is an int. Because of standard conversions (§C.6.2.3), 0 can be used as a constant of any
integral (§4.1.1), floating-point, pointer, or pointer-to-member type. The type of zero will be determined by context. Zero will typically (but not necessarily) be represented by the bit pattern all-zeros of the appropriate size.
No object is allocated with the address 0. Consequently, 0 acts as a pointer literal, indicating
that a pointer doesn’t refer to an object.
In C, it has been popular to define a macro NULL to represent the zero pointer. Because of
C++’s tighter type checking, the use of plain 0, rather than any suggested NULL macro, leads to
fewer problems. If you feel you must define NULL, use
    const int NULL = 0;
The const qualifier (§5.4) prevents accidental redefinition of NULL and ensures that NULL can be
used where a constant is required.
5.2 Arrays [ptr.array]
For a type T, T[size] is the type ‘‘array of size elements of type T.’’ The elements are indexed
from 0 to size-1. For example:
    float v[3];     // an array of three floats: v[0], v[1], v[2]
    char* a[32];    // an array of 32 pointers to char: a[0] .. a[31]
The number of elements of the array, the array bound, must be a constant expression (§C.5). If you
need variable bounds, use a vector (§3.7.1, §16.3). For example:
    void f(int i)
    {
        int v1[i];            // error: array size not a constant expression
        vector<int> v2(i);    // ok
    }
Multidimensional arrays are represented as arrays of arrays. For example:
    int d2[10][20];    // d2 is an array of 10 arrays of 20 integers
Using comma notation as used for array bounds in some other languages gives compile-time errors
because comma (,) is a sequencing operator (§6.2.2) and is not allowed in constant expressions
(§C.5). For example, try this:
    int bad[5,2];    // error: comma not allowed in a constant expression
Multidimensional arrays are described in §C.7. They are best avoided outside low-level code.
5.2.1 Array Initializers [ptr.array.init]
An array can be initialized by a list of values. For example:
    int v1[] = { 1, 2, 3, 4 };
    char v2[] = { 'a', 'b', 'c', 0 };
When an array is declared without a specific size, but with an initializer list, the size is calculated
by counting the elements of the initializer list. Consequently, v1 and v2 are of type int[4] and
char[4], respectively. If a size is explicitly specified, it is an error to give surplus elements in an
initializer list. For example:
    char v3[2] = { 'a', 'b', 0 };    // error: too many initializers
    char v4[3] = { 'a', 'b', 0 };    // ok
If the initializer supplies too few elements, 0 is assumed for the remaining array elements. For
example:
    int v5[8] = { 1, 2, 3, 4 };
is equivalent to
    int v5[] = { 1, 2, 3, 4, 0, 0, 0, 0 };
Note that there is no array assignment to match the initialization:
    void f()
    {
        v4 = { 'c', 'd', 0 };    // error: no array assignment
    }
When you need such assignments, use a vector (§16.3) or a valarray (§22.4) instead.
An array of characters can be conveniently initialized by a string literal (§5.2.2).
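The rules for deduced sizes and for missing initializers can be checked with sizeof. In this sketch, the arrays repeat the examples above, and the names v1_count and v5_count are additions for illustration. The sizeof(v)/sizeof(v[0]) idiom works only where the array itself, not a pointer to it, is visible.

```cpp
#include <cassert>
#include <cstddef>

int v1[] = { 1, 2, 3, 4 };           // size deduced from the initializer: int[4]
char v2[] = { 'a', 'b', 'c', 0 };    // char[4]
int v5[8] = { 1, 2, 3, 4 };          // the remaining four elements are 0

// Number of elements, computed from the size of the array and of one element.
const std::size_t v1_count = sizeof(v1) / sizeof(v1[0]);
const std::size_t v5_count = sizeof(v5) / sizeof(v5[0]);
```

Here v1_count is 4 and v5_count is 8, and v5[4] through v5[7] all hold 0.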
5.2.2 String Literals [ptr.string.literal]
A string literal is a character sequence enclosed within double quotes:
    "this is a string"
A string literal contains one more character than it appears to have; it is terminated by the null character '\0', with the value 0. For example:
    sizeof("Bohr")==5
The type of a string literal is ‘‘array of the appropriate number of const characters,’’ so "Bohr" is
of type const char[5].
A string literal can be assigned to a char*. This is allowed because in previous definitions of C
and C++, the type of a string literal was char*. Allowing the assignment of a string literal to a
char* ensures that millions of lines of C and C++ remain valid. It is, however, an error to try to
modify a string literal through such a pointer:
    void f()
    {
        char* p = "Plato";
        p[4] = 'e';    // error: assignment to const; result is undefined
    }
This kind of error cannot in general be caught until run-time, and implementations differ in their
enforcement of this rule. Having string literals constant not only is obvious, but also allows implementations to do significant optimizations in the way string literals are stored and accessed.
If we want a string that we are guaranteed to be able to modify, we must copy the characters
into an array:
    void f()
    {
        char p[] = "Zeno";    // p is an array of 5 char
        p[0] = 'R';           // ok
    }
A string literal is statically allocated so that it is safe to return one from a function. For example:
    const char* error_message(int i)
    {
        // ...
        return "range error";
    }
The memory holding range error will not go away after a call of error_message().
Whether two identical string literals are allocated as one is implementation-defined (§C.1).
For example:
    const char* p = "Heraclitus";
    const char* q = "Heraclitus";

    void g()
    {
        if (p == q) cout << "one!\n";    // result is implementation-defined
        // ...
    }
Note that == compares addresses (pointer values) when applied to pointers, and not the values
pointed to.
The empty string is written as a pair of adjacent double quotes, "", (and has the type const
char[1]).
The backslash convention for representing nongraphic characters (§C.3.2) can also be used
within a string. This makes it possible to represent the double quote (") and the escape character
backslash (\) within a string. The most common such character by far is the newline character,
'\n'. For example:
    cout<<"beep at end of message\a\n";
The escape character '\a' is the ASCII character BEL (also known as alert), which causes some
kind of sound to be emitted.
It is not possible to have a ‘‘real’’ newline in a string:
    "this is not a string
    but a syntax error"
Long strings can be broken by whitespace to make the program text neater. For example:
    char alpha[] = "abcdefghijklmnopqrstuvwxyz"
                   "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
The compiler will concatenate adjacent strings, so alpha could equivalently have been initialized
by the single string:
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
It is possible to have the null character in a string, but most programs will not suspect that there
are characters after it. For example, the string "Jens\000Munk" will be treated as "Jens" by standard library functions such as strcpy() and strlen(); see §20.4.1.
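The difference between what is stored and what C-style functions see can be observed by comparing sizeof with strlen(). The array name and the two helper functions below are invented for this sketch; std::strlen() is the standard library function from <cstring>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// 4 characters, an embedded '\0', 4 more characters, and the terminating '\0'.
const char name[] = "Jens\000Munk";

std::size_t stored_size()  { return sizeof(name); }        // counts every char in the array
std::size_t visible_size() { return std::strlen(name); }   // stops at the first '\0'
```

With this definition, stored_size() yields 10 while visible_size() yields only 4, so "Munk" is present in memory but is not seen by strlen().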
A string with the prefix L, such as L"angst", is a string of wide characters (§4.3, §C.3.3). Its
type is const wchar_t[].
5.3 Pointers into Arrays [ptr.into]
In C++, pointers and arrays are closely related. The name of an array can be used as a pointer to its
initial element. For example:
    int v[] = { 1, 2, 3, 4 };
    int* p1 = v;        // pointer to initial element (implicit conversion)
    int* p2 = &v[0];    // pointer to initial element
    int* p3 = &v[4];    // pointer to one beyond last element
or graphically:
    p1 --+
         v
    v: [ 1 | 2 | 3 | 4 ]
         ^               ^
    p2 --+          p3 --+
Taking a pointer to the element one beyond the end of an array is guaranteed to work. This is
important for many algorithms (§2.7.2, §18.3). However, since such a pointer does not in fact point
to an element of the array, it may not be used for reading or writing. The result of taking the
address of the element before the initial element is undefined and should be avoided. On some
machine architectures, arrays are often allocated on machine addressing boundaries, so ‘‘one before
the initial element’’ simply doesn’t make sense.
The implicit conversion of an array name to a pointer to the initial element of the array is extensively used in function calls in C-style code. For example:
    extern "C" int strlen(const char*);    // from <string.h>

    void f()
    {
        char v[] = "Annemarie";
        char* p = v;     // implicit conversion of char[] to char*
        strlen(p);
        strlen(v);       // implicit conversion of char[] to char*
        v = p;           // error: cannot assign to array
    }
The same value is passed to the standard library function strlen() in both calls. The snag is that it
is impossible to avoid the implicit conversion. In other words, there is no way of declaring a function so that the array v is copied when the function is called. Fortunately, there is no implicit or
explicit conversion from a pointer to an array.
The implicit conversion of the array argument to a pointer means that the size of the array is lost
to the called function. However, the called function must somehow determine the size to perform a
meaningful operation. Like other C standard library functions taking pointers to characters,
strlen() relies on zero to indicate end-of-string; strlen(p) returns the number of characters up to
and not including the terminating 0. This is all pretty low-level. The standard library vector
(§16.3) and string (Chapter 20) don’t suffer from this problem.
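The loss of size information can be sketched as follows. The three function names are hypothetical; std::strlen() and std::string::size() are the standard library facilities referred to above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// The callee receives only a pointer, so it must scan for the terminating 0.
std::size_t length_by_terminator(const char* p) { return std::strlen(p); }

// A string carries its size with it; no scan for a terminator is needed.
std::size_t length_by_object(const std::string& s) { return s.size(); }

// Where the array itself is visible, its full size (including the 0) is known.
std::size_t size_where_declared()
{
    char v[] = "Annemarie";
    return sizeof(v);
}
```

For "Annemarie", the array occupies 10 chars where it is declared, but a function given only a pointer can recover the length 9 solely by searching for the terminator.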
5.3.1 Navigating Arrays [ptr.navigate]
Efficient and elegant access to arrays (and similar data structures) is the key to many algorithms
(see §3.8, Chapter 18). Access can be achieved either through a pointer to an array plus an index or
through a pointer to an element. For example, traversing a character string using an index,
    void fi(char v[])
    {
        for (int i = 0; v[i]!=0; i++) use(v[i]);
    }
is equivalent to a traversal using a pointer:
    void fp(char v[])
    {
        for (char* p = v; *p!=0; p++) use(*p);
    }
The prefix * operator dereferences a pointer so that *p is the character pointed to by p, and ++
increments the pointer so that it refers to the next element of the array.
There is no inherent reason why one version should be faster than the other. With modern compilers, identical code should be generated for both examples (see §5.9[8]). Programmers can
choose between the versions on logical and aesthetic grounds.
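Since use() is left unspecified in the text, the equivalence of the two traversal styles is easier to try out with a concrete operation. The sketch below counts occurrences of a character; the function names count_by_index and count_by_pointer are made up for the example.

```cpp
#include <cassert>

// Count occurrences of c in the zero-terminated string v, using an index.
int count_by_index(const char v[], char c)
{
    int n = 0;
    for (int i = 0; v[i] != 0; i++)
        if (v[i] == c) n++;
    return n;
}

// The same traversal, using a pointer to the current element.
int count_by_pointer(const char v[], char c)
{
    int n = 0;
    for (const char* p = v; *p != 0; p++)
        if (*p == c) n++;
    return n;
}
```

Both functions visit exactly the same elements in the same order, so they give the same answer for every zero-terminated string.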
The result of applying the arithmetic operators +, -, ++, or -- to pointers depends on the type
of the object pointed to. When an arithmetic operator is applied to a pointer p of type T*, p is
assumed to point to an element of an array of objects of type T; p+1 points to the next element of
that array, and p-1 points to the previous element. This implies that the integer value of p+1 will
be sizeof(T) larger than the integer value of p. For example, executing
    #include <iostream>

    int main()
    {
        int vi[10];
        short vs[10];

        std::cout << &vi[0] << ' ' << &vi[1] << '\n';
        std::cout << &vs[0] << ' ' << &vs[1] << '\n';
    }
produced
    0x7fffaef0 0x7fffaef4
    0x7fffaedc 0x7fffaede
using a default hexadecimal notation for pointer values. This shows that on my implementation,
sizeof(short) is 2 and sizeof(int) is 4.
Subtraction of pointers is defined only when both pointers point to elements of the same array
(although the language has no fast way of ensuring that is the case). When subtracting one pointer
from another, the result is the number of array elements between the two pointers (an integer). One
can add an integer to a pointer or subtract an integer from a pointer; in both cases, the result is a
pointer value. If that value does not point to an element of the same array as the original pointer or
one beyond, the result of using that value is undefined. For example:
    void f()
    {
        int v1[10];
        int v2[10];

        int i1 = &v1[5]-&v1[3];    // i1 = 2
        int i2 = &v1[5]-&v2[3];    // result undefined

        int* p1 = v2+2;            // p1 = &v2[2]
        int* p2 = v2-2;            // *p2 undefined
    }
Complicated pointer arithmetic is usually unnecessary and often best avoided. Addition of pointers
makes no sense and is not allowed.
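The well-defined cases of pointer arithmetic can be checked without printing addresses. In this sketch, the array v and the two function names are introduced for illustration. The byte distance is measured by viewing the array through const char*, which is an assumption of the example rather than a technique taken from the text.

```cpp
#include <cassert>
#include <cstddef>

int v[10];

// Distance, in elements, between two elements of the same array.
std::ptrdiff_t element_distance() { return &v[5] - &v[3]; }

// Distance, in chars, between two adjacent elements.
std::ptrdiff_t byte_distance()
{
    const char* a = reinterpret_cast<const char*>(&v[0]);
    const char* b = reinterpret_cast<const char*>(&v[1]);
    return b - a;
}
```

Subtraction of the int* values counts elements, giving 2, whereas the char* view of adjacent elements differs by sizeof(int), whatever that is on a given implementation.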
Arrays are not self-describing because the number of elements of an array is not guaranteed to
be stored with the array. This implies that to traverse an array that does not contain a terminator the
way character strings do, we must somehow supply the number of elements. For example:
    void fp(char v[], unsigned int size)
    {
        for (int i=0; i<size; i++) use(v[i]);

        const int N = 7;
        char v2[N];
        for (int i=0; i<N; i++) use(v2[i]);
    }
Note that most C++ implementations offer no range checking for arrays. This array concept is
inherently low-level. A more advanced notion of arrays can be provided through the use of classes;
see §3.7.1.
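As one possible illustration of range checking provided by a class, the standard library vector offers at(), which throws std::out_of_range for an invalid index, whereas its operator[] is not required to check. The helper function access_is_rejected below is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reading element i of v is rejected as out of range.
bool access_is_rejected(const std::vector<int>& v, std::size_t i)
{
    try {
        int value = v.at(i);     // at() checks the index against v.size()
        (void)value;             // the value itself is not needed here
        return false;
    }
    catch (const std::out_of_range&) {
        return true;
    }
}
```

For a vector of 7 elements, index 6 is accepted and index 7 is rejected. With a built-in array, the corresponding out-of-range access would simply have an undefined result.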
5.4 Constants [ptr.const]
C++ offers the concept of a user-defined constant, a const, to express the notion that a value doesn’t
change directly. This is useful in several contexts. For example, many objects don’t actually have
their values changed after initialization, symbolic constants lead to more maintainable code than do
literals embedded directly in code, pointers are often read through but never written through, and
most function parameters are read but not written to.
The keyword const can be added to the declaration of an object to make the object declared a
constant. Because it cannot be assigned to, a constant must be initialized. For example:
    const int model = 90;              // model is a const
    const int v[] = { 1, 2, 3, 4 };    // v[i] is a const
    const int x;                       // error: no initializer
Declaring something const ensures that its value will not change within its scope:
    void f()
    {
        model = 200;    // error
        v[2]++;         // error
    }
Note that const modifies a type; that is, it restricts the ways in which an object can be used, rather
than specifying how the constant is to be allocated. For example:
    void g(const X* p)
    {
        // can’t modify *p here
    }

    void h()
    {
        X val;    // val can be modified
        g(&val);
        // ...
    }
Depending on how smart it is, a compiler can take advantage of an object being a constant in several ways. For example, the initializer for a constant is often (but not always) a constant expression
(§C.5); if it is, it can be evaluated at compile time. Further, if the compiler knows every use of the
const, it need not allocate space to hold it. For example:
    const int c1 = 1;
    const int c2 = 2;
    const int c3 = my_f(3);    // don’t know the value of c3 at compile time
    extern const int c4;       // don’t know the value of c4 at compile time
    const int* p = &c2;        // need to allocate space for c2
Given this, the compiler knows the values of c1 and c2 so that they can be used in constant expressions. Because the values of c3 and c4 are not known at compile time (using only the information
available in this compilation unit; see §9.1), storage must be allocated for c3 and c4. Because the
address of c2 is taken (and presumably used somewhere), storage must be allocated for c2. The
simple and common case is the one in which the value of the constant is known at compile time and
no storage needs to be allocated; c1 is an example of that. The keyword extern indicates that c4 is
defined elsewhere (§9.2).
It is typically necessary to allocate store for an array of constants because the compiler cannot,
in general, figure out which elements of the array are referred to in expressions. On many
machines, however, efficiency improvements can be achieved even in this case by placing arrays of
constants in read-only storage.
Common uses for consts are as array bounds and case labels. For example:
    const int a = 42;
    const int b = 99;
    const int max = 128;

    int v[max];

    void f(int i)
    {
        switch (i) {
        case a:
            // ...
        case b:
            // ...
        }
    }
Enumerators (§4.8) are often an alternative to consts in such cases.
The way const can be used with class member functions is discussed in §10.2.6 and §10.2.7.
Symbolic constants should be used systematically to avoid ‘‘magic numbers’’ in code. If a
numeric constant, such as an array bound, is repeated in code, it becomes hard to revise that code
because every occurrence of that constant must be changed to make a correct update. Using a symbolic constant instead localizes information. Usually, a numeric constant represents an assumption
about the program. For example, 4 may represent the number of bytes in an integer, 128 the number of characters needed to buffer input, and 6.24 the exchange factor between Danish kroner and
U.S. dollars. Left as numeric constants in the code, these values are hard for a maintainer to spot
and understand. Often, such numeric values go unnoticed and become errors when a program is
ported or when some other change violates the assumptions they represent. Representing assumptions as well-commented symbolic constants minimizes such maintenance problems.
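As a minimal sketch of this advice, the three assumptions mentioned above can each be given a name and a comment. The identifiers below (bytes_per_int, input_buffer_size, dkk_per_usd, buffer_at, usd_to_dkk, ints_per_buffer) are illustrative names chosen for this sketch, not names used elsewhere in the text; the values are simply the ones quoted above.

```cpp
#include <cassert>

// Illustrative names (assumptions of this sketch); the values are the ones quoted in the text.
const int bytes_per_int = 4;          // assumed number of bytes in an integer
const int input_buffer_size = 128;    // characters needed to buffer input
const double dkk_per_usd = 6.24;      // exchange factor between Danish kroner and U.S. dollars

char input_buffer[input_buffer_size]; // the bound is stated once, by name

// Checked access: the same named bound is used for the declaration and the check.
char& buffer_at(int i)
{
    assert(0 <= i && i < input_buffer_size);
    return input_buffer[i];
}

// Convert an amount in U.S. dollars to Danish kroner.
double usd_to_dkk(double usd)
{
    return usd * dkk_per_usd;
}

// Number of integers that fit in the input buffer, under the size assumption above.
int ints_per_buffer()
{
    return input_buffer_size / bytes_per_int;
}
```

If one of the assumptions changes, only the corresponding definition needs to be revised; every use follows automatically.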
5.4.1 Pointers and Constants [ptr.pc]
When using a pointer, two objects are involved: the pointer itself and the object pointed to. ‘‘Prefixing’’ a declaration of a pointer with const makes the object, but not the pointer, a constant. To
declare a pointer itself, rather than the object pointed to, to be a constant, we use the declarator
operator *const instead of plain *. For example:
    void f1(char* p)
    {
        char s[] = "Gorm";

        const char* pc = s;          // pointer to constant
        pc[3] = 'g';                 // error: pc points to constant
        pc = p;                      // ok

        char *const cp = s;          // constant pointer
        cp[3] = 'a';                 // ok
        cp = p;                      // error: cp is constant

        const char *const cpc = s;   // const pointer to const
        cpc[3] = 'a';                // error: cpc points to constant
        cpc = p;                     // error: cpc is constant
    }
The declarator operator that makes a pointer constant is *const. There is no const* declarator
operator, so a const appearing before the * is taken to be part of the base type. For example:
    char *const cp;     // const pointer to char
    char const* pc;     // pointer to const char
    const char* pc2;    // pointer to const char
Some people find it helpful to read such declarations right-to-left. For example, ‘‘cp is a const
pointer to a char’’ and ‘‘pc2 is a pointer to a char const.’’
An object that is a constant when accessed through one pointer may be variable when accessed
in other ways. This is particularly useful for function arguments. By declaring a pointer argument
const, the function is prohibited from modifying the object pointed to. For example:
    char* strcpy(char* p, const char* q); // cannot modify *q
You can assign the address of a variable to a pointer to constant because no harm can come from
that. However, the address of a constant cannot be assigned to an unrestricted pointer because this
would allow the object’s value to be changed. For example:
    void f4()
    {
        int a = 1;
        const int c = 2;
        const int* p1 = &c;   // ok
        const int* p2 = &a;   // ok
        int* p3 = &c;         // error: initialization of int* with const int*
        *p3 = 7;              // try to change the value of c
    }
It is possible to explicitly remove the restrictions on a pointer to const by explicit type conversion
(§10.2.7.1 and §15.4.2.1).
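The use of a pointer to const as a function argument can be sketched as follows. The functions count_char and count_a_after_edit are hypothetical names invented for this illustration; the point is only that an array that is modifiable in the caller can be passed to a function that promises, through its argument type, not to modify it.

```cpp
#include <cassert>

// Count occurrences of c in the zero-terminated string s.
// The argument type const char* promises that the function reads *s but never writes it.
int count_char(const char* s, char c)
{
    assert(s != 0);                          // a null pointer is not a valid argument
    int n = 0;
    for (const char* p = s; *p != 0; ++p)    // p itself may change; *p may not
        if (*p == c) ++n;
    return n;
}

// The caller's array is not const, yet it can be passed where a const char* is expected.
int count_a_after_edit()
{
    char word[] = "banana";
    word[0] = 'c';                           // fine: word is modifiable here
    return count_char(word, 'a');            // char* converts implicitly to const char*
}
```

Within count_char, an attempt such as *p = 'x' would be rejected by the compiler, exactly as pc[3] = 'g' is rejected in f1() above.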
5.5 References [ptr.ref]
A reference is an alternative name for an object. The main use of references is for specifying arguments and return values for functions in general and for overloaded operators (Chapter 11) in particular. The notation X& means reference to X. For example:
    void f()
    {
        int i = 1;
        int& r = i;    // r and i now refer to the same int
        int x = r;     // x = 1
        r = 2;         // i = 2
    }
To ensure that a reference is a name for something (that is, bound to an object), we must initialize
the reference. For example:
    int i = 1;
    int& r1 = i;       // ok: r1 initialized
    int& r2;           // error: initializer missing
    extern int& r3;    // ok: r3 initialized elsewhere
Initialization of a reference is something quite different from assignment to it. Despite appearances, no operator operates on a reference. For example:
    void g()
    {
        int ii = 0;
        int& rr = ii;
        rr++;             // ii is incremented to 1
        int* pp = &rr;    // pp points to ii
    }
This is legal, but rr++ does not increment the reference rr; rather, ++ is applied to an int that happens to be ii. Consequently, the value of a reference cannot be changed after initialization; it
always refers to the object it was initialized to denote. To get a pointer to the object denoted by a
reference rr, we can write &rr.
The obvious implementation of a reference is as a (constant) pointer that is dereferenced each
time it is used. It doesn’t do much harm thinking about references that way, as long as one remembers that a reference isn’t an object that can be manipulated the way a pointer is:
    [Figure: pp holds &ii, the address of ii; rr is another name for ii; ii holds the value 1.]
In some cases, the compiler can optimize away a reference so that there is no object representing
that reference at run-time.
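The difference between a reference and a pointer can be made concrete with a small sketch. The two function names below are hypothetical, chosen only for this illustration: the first shows that assignment through a reference changes the object referred to and never rebinds the reference; the second shows that a pointer, by contrast, can be made to point elsewhere.

```cpp
#include <cassert>

// Returns true if, after "r = b", the reference r still denotes a.
bool reference_stays_bound()
{
    int a = 1;
    int b = 2;
    int& r = a;     // r is bound to a at initialization
    r = b;          // assigns the value of b to a; does not rebind r
    b = 7;          // changing b afterwards has no effect on r or a
    assert(a == 2);
    return &r == &a && r == 2;
}

// By contrast, a pointer can be made to point to a different object.
bool pointer_can_be_reseated()
{
    int a = 1;
    int b = 2;
    int* p = &a;
    p = &b;         // p now points to b; a is unchanged
    return p == &b && a == 1;
}
```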
Initialization of a reference is trivial when the initializer is an lvalue (an object whose address
you can take; see §4.9.6). The initializer for a ‘‘plain’’ T& must be an lvalue of type T.
The initializer for a const T& need not be an lvalue or even of type T. In such cases,
[1] first, implicit type conversion to T is applied if necessary (see §C.6);
[2] then, the resulting value is placed in a temporary variable of type T; and
[3] finally, this temporary variable is used as the value of the initializer.
Consider:
    double& dr = 1;           // error: lvalue needed
    const double& cdr = 1;    // ok
The interpretation of this last initialization might be:
    double temp = double(1);     // first create a temporary with the right value
    const double& cdr = temp;    // then use the temporary as the initializer for cdr
A temporary created to hold a reference initializer persists until the end of its reference’s scope.
References to variables and references to constants are distinguished because the introduction of
a temporary in the case of the variable is highly error-prone; an assignment to the variable would
become an assignment to the – soon to disappear – temporary. No such problem exists for references to constants, and references to constants are often important as function arguments (§11.6).
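The distinction matters most for function arguments. The following is a sketch under the rules just described; half, halve, and demo are hypothetical names. A function taking a const double& accepts literals and values of other arithmetic types through a temporary, whereas a function taking a plain double& accepts only a double lvalue.

```cpp
#include <cassert>

// Takes its argument by reference to const, so it accepts lvalues,
// literals, and values of other types that convert to double.
double half(const double& d)
{
    return d / 2;
}

// Takes a plain reference; only a double lvalue is accepted.
void halve(double& d)
{
    d /= 2;
}

double demo()
{
    int n = 3;
    double x = 8;
    double sum = half(1)     // literal: a temporary double holds the value 1
               + half(n);    // int lvalue: its value is converted into a temporary double
    halve(x);                // ok: x is a double lvalue, so x becomes 4
    // halve(1);             // error: no lvalue to bind to
    // halve(n);             // error: would require a temporary, so n could not be changed
    assert(n == 3);          // n was only read, never modified
    return sum + x;          // 0.5 + 1.5 + 4
}
```

Had halve(n) been allowed, the division would have been applied to the temporary, and n would silently have kept its old value; this is the error-prone case the rule excludes.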
A reference can be used to specify a function argument so that the function can change the
value of an object passed to it. For example:
    void increment(int& aa) { aa++; }

    void f()
    {
        int x = 1;
        increment(x);    // x = 2
    }
The semantics of argument passing are defined to be those of initialization, so when called,
increment’s argument aa became another name for x. To keep a program readable, it is often best
to avoid functions that modify their arguments. Instead, you can return a value from the function
explicitly or require a pointer argument:
    int next(int p) { return p+1; }

    void incr(int* p) { (*p)++; }

    void g()
    {
        int x = 1;
        increment(x);    // x = 2
        x = next(x);     // x = 3
        incr(&x);        // x = 4
    }
The increment(x) notation doesn’t give a clue to the reader that x’s value is being modified, the
way x=next(x) and incr(&x) do. Consequently ‘‘plain’’ reference arguments should be used
only where the name of the function gives a strong hint that the reference argument is modified.
References can also be used to define functions that can be used on both the left-hand and
right-hand sides of an assignment. Again, many of the most interesting uses of this are found in the
design of nontrivial user-defined types. As an example, let us define a simple associative array.
First, we define struct Pair like this:
    struct Pair {
        string name;
        double val;
    };
The basic idea is that a string has a floating-point value associated with it. It is easy to define a
function, value(), that maintains a data structure consisting of one Pair for each different string
that has been presented to it. To shorten the presentation, a very simple (and inefficient) implementation is used:
    vector<Pair> pairs;

    double& value(const string& s)
    /*
        maintain a set of Pairs:
        search for s, return its value if found; otherwise make a new Pair and return the default value 0
    */
    {
        for (int i = 0; i < pairs.size(); i++)
            if (s == pairs[i].name) return pairs[i].val;

        Pair p = { s, 0 };
        pairs.push_back(p); // add Pair at end (§3.7.3)

        return pairs[pairs.size()-1].val;
    }
This function can be understood as an array of floating-point values indexed by character strings.
For a given argument string, value() finds the corresponding floating-point object (not the value
of the corresponding floating-point object); it then returns a reference to it. For example:
    int main() // count the number of occurrences of each word on input
    {
        string buf;

        while (cin>>buf) value(buf)++;

        for (vector<Pair>::const_iterator p = pairs.begin(); p!=pairs.end(); ++p)
            cout << p->name << ": " << p->val << '\n';
    }
Each time around, the while-loop reads one word from the standard input stream cin into the string
buf (§3.6) and then updates the counter associated with it. Finally, the resulting table of different
words in the input, each with its number of occurrences, is printed. For example, given the input
    aa bb bb aa aa bb aa aa
this program will produce:
    aa: 5
    bb: 3
It is easy to refine this into a proper associative array type by using a template class with the selection operator [] overloaded (§11.8). It is even easier just to use the standard library map (§17.4.1).
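For comparison, here is a sketch of the same word count written with the standard library map. The function names count_words and occurrences are hypothetical; the input stream is passed as an argument rather than read from cin so that the function can be exercised on any stream.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Count the number of occurrences of each word read from in.
std::map<std::string, int> count_words(std::istream& in)
{
    std::map<std::string, int> counts;
    std::string word;
    while (in >> word) ++counts[word];   // operator[] inserts a zero count for a new word
    return counts;
}

// Look up a count without inserting a new entry (operator[] would insert one).
int occurrences(const std::map<std::string, int>& counts, const std::string& word)
{
    std::map<std::string, int>::const_iterator p = counts.find(word);
    if (p == counts.end()) return 0;
    assert(p->second > 0);               // every stored word was seen at least once
    return p->second;
}
```

Like value(), the map’s operator[] returns a reference to the stored value, which is what allows ++counts[word] to update the counter in place.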
5.6 Pointer to Void [ptr.ptrtovoid]
A pointer to any type of object can be assigned to a variable of type void*, a void* can be assigned
to another void*, void*s can be compared for equality and inequality, and a void* can be explicitly
converted to another type. Other operations would be unsafe because the compiler cannot know
what kind of object is really pointed to. Consequently, other operations result in compile-time
errors. To use a void*, we must explicitly convert it to a pointer to a specific type. For example:
    void f(int* pi)
    {
        void* pv = pi;    // ok: implicit conversion of int* to void*
        *pv;              // error: can’t dereference void*
        pv++;             // error: can’t increment void* (the size of the object pointed to is unknown)

        int* pi2 = static_cast<int*>(pv);          // explicit conversion back to int*

        double* pd1 = pv;                          // error
        double* pd2 = pi;                          // error
        double* pd3 = static_cast<double*>(pv);    // unsafe
    }
In general, it is not safe to use a pointer that has been converted (‘‘cast’’) to a type that differs from
the type of the object pointed to. For example, a machine may assume that every double is allocated
on an 8-byte boundary. If so, strange behavior could arise if pi pointed to an int that wasn’t allocated that way. This form of explicit type conversion is inherently unsafe and ugly. Consequently,
the notation used, static_cast, was designed to be ugly.
The primary use for void* is for passing pointers to functions that are not allowed to make
assumptions about the type of the object and for returning untyped objects from functions. To use
such an object, we must use explicit type conversion.
Functions using void* pointers typically exist at the very lowest level of the system, where real
hardware resources are manipulated. For example:
    void* my_alloc(size_t n); // allocate n bytes from my special heap
Occurrences of void*s at higher levels of the system should be viewed with suspicion because they
are likely indicators of design errors. Where used for optimization, void* can be hidden behind a
type-safe interface (§13.5, §24.4.2).
Pointers to functions (§7.7) and pointers to members (§15.5) cannot be assigned to void*s.
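Hiding void* behind a type-safe interface can be sketched as follows. This is an illustration only, not the technique of §13.5 or §24.4.2: raw_alloc, raw_free, alloc_array, and free_array are hypothetical names, and std::malloc and std::free stand in for a special heap. The memory returned is uninitialized, so the sketch is suitable only for types without constructors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Low-level layer: deals only in untyped memory.
// std::malloc and std::free stand in for a special heap (an assumption of this sketch).
void* raw_alloc(std::size_t n) { return std::malloc(n); }
void raw_free(void* p) { std::free(p); }

// Type-safe layer: the only place where a void* is converted to a typed pointer.
template<class T> T* alloc_array(std::size_t count)
{
    assert(count > 0);
    return static_cast<T*>(raw_alloc(count * sizeof(T)));
}

template<class T> void free_array(T* p)
{
    raw_free(p);    // T* converts implicitly to void*
}
```

A user writes alloc_array<int>(3) and never sees a void* or a cast; the single static_cast is confined to one small function that can be inspected with care.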
5.7 Structures [ptr.struct]
An array is an aggregate of elements of the same type. A struct is an aggregate of elements of
(nearly) arbitrary types. For example:
    struct address {
        char* name;         // "Jim Dandy"
        long int number;    // 61
        char* street;       // "South St"
        char* town;         // "New Providence"
        char state[2];      // 'N' 'J'
        long zip;           // 7974
    };
This defines a new type called address consisting of the items you need in order to send mail to
someone. Note the semicolon at the end. This is one of very few places in C++ where it is necessary to have a semicolon after a curly brace, so people are prone to forget it.
Variables of type address can be declared exactly as other variables, and the individual
members can be accessed using the . (dot) operator. For example:
    void f()
    {
        address jd;
        jd.name = "Jim Dandy";
        jd.number = 61;
    }
The notation used for initializing arrays can also be used for initializing variables of structure types.
For example:
    address jd = {
        "Jim Dandy",
        61, "South St",
        "New Providence", {'N','J'}, 7974
    };
Using a constructor (§10.2.3) is usually better, however. Note that jd.state could not be initialized
by the string "NJ". Strings are terminated by the character '\0'. Hence, "NJ" has three characters
– one more than will fit into jd.state.
Structure objects are often accessed through pointers using the -> (structure pointer dereference) operator. For example:
    void print_addr(address* p)
    {
        cout << p->name << '\n'
             << p->number << ' ' << p->street << '\n'
             << p->town << '\n'
             << p->state[0] << p->state[1] << ' ' << p->zip << '\n';
    }
When p is a pointer, p->m is equivalent to (*p).m.
Objects of structure types can be assigned, passed as function arguments, and returned as the
result from a function. For example:
    address current;

    address set_current(address next)
    {
        address prev = current;
        current = next;
        return prev;
    }
Other plausible operations, such as comparison (== and !=), are not defined. However, the user
can define such operators (Chapter 11).
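As a foretaste of Chapter 11, comparison for a structure might be defined like this. Point is a hypothetical structure used only for this sketch, and memberwise comparison is an assumption about what equality should mean for it; other structures may need a different definition.

```cpp
#include <cassert>

// Point is a hypothetical structure used only for this sketch.
struct Point {
    int x;
    int y;
};

// Two Points are considered equal if all their members are equal.
bool operator==(const Point& a, const Point& b)
{
    return a.x == b.x && a.y == b.y;
}

bool operator!=(const Point& a, const Point& b)
{
    return !(a == b);
}

// Copying is built in; comparison works only because it was defined above.
bool copy_compares_equal()
{
    Point a = { 1, 2 };
    Point b = a;         // built-in memberwise copy
    assert(a == b);
    b.y = 3;             // b now differs from a in one member
    return a != b;
}
```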
The size of an object of a structure type is not necessarily the sum of the sizes of its members.
This is because many machines require objects of certain types to be allocated on architecture-dependent boundaries or handle such objects much more efficiently if they are. For example, integers are often allocated on word boundaries. On such machines, objects are said to have to be
aligned properly. This leads to ‘‘holes’’ in the structures. For example, on many machines,
sizeof(address) is 24, and not 22 as might be expected. You can minimize wasted space by simply ordering members by size (largest member first). However, it is usually best to order members
for readability and sort them by size only if there is a demonstrated need to optimize.
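The effect of member order can be observed with sizeof. In the sketch below, Loose, Tight, and padding_in are hypothetical names; the actual sizes are implementation-dependent, so the code makes no claim about specific numbers and only computes how many bytes are not accounted for by the members.

```cpp
#include <cassert>
#include <cstddef>

// Two hypothetical structures with the same members in different orders.
struct Loose {
    char   a;
    double d;    // typically preceded by a hole so that d is properly aligned
    char   b;    // typically followed by a hole so that arrays of Loose stay aligned
};

struct Tight {
    double d;    // largest member first
    char   a;
    char   b;
};

// Number of bytes in an object of type T not accounted for by the three members.
template<class T> std::size_t padding_in()
{
    const std::size_t members = sizeof(double) + 2*sizeof(char);
    assert(sizeof(T) >= members);
    return sizeof(T) - members;
}
```

Comparing padding_in<Loose>() with padding_in<Tight>() on a given machine shows how much space, if any, the reordering saves there.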
The name of a type becomes available for use immediately after it has been encountered and not
just after the complete declaration has been seen. For example:
    struct Link {
        Link* previous;
        Link* successor;
    };
It is not possible to declare new objects of a structure type until the complete declaration has been
seen. For example:
    struct No_good {
        No_good member;    // error: recursive definition
    };
This is an error because the compiler is not able to determine the size of No_good. To allow two
(or more) structure types to refer to each other, we can declare a name to be the name of a structure
type. For example:
    struct List;           // to be defined later

    struct Link {
        Link* pre;
        Link* suc;
        List* member_of;
    };

    struct List {
        Link* head;
    };
Without the first declaration of List, use of List in the declaration of Link would have caused a syntax error.
The name of a structure type can be used before the type is defined as long as that use does not
require the name of a member or the size of the structure to be known. For example:
    class S;         // ‘S’ is the name of some type

    extern S a;
    S f();
    void g(S);
    S* h(S*);
However, many such declarations cannot be used unless the type S is defined:
    void k(S* p)
    {
        S a;             // error: S not defined; size needed to allocate

        f();             // error: S not defined; size needed to return value
        g(a);            // error: S not defined; size needed to pass argument
        p->m = 7;        // error: S not defined; member name not known

        S* q = h(p);     // ok: pointers can be allocated and passed
        q->m = 7;        // error: S not defined; member name not known
    }
A struct is a simple form of a class (Chapter 10).
For reasons that reach into the pre-history of C, it is possible to declare a struct and a non-structure with the same name in the same scope. For example:
    struct stat { /* ... */ };
    int stat(char* name, struct stat* buf);
In that case, the plain name (stat) is the name of the non-structure, and the structure must be
referred to with the prefix struct. Similarly, the keywords class, union (§C.8.2), and enum (§4.8)
can be used as prefixes for disambiguation. However, it is best not to overload names to make that
necessary.
5.7.1 Type Equivalence [ptr.equiv]
Two structures are different types even when they have the same members. For example,
    struct S1 { int a; };
    struct S2 { int a; };
are two different types, so
    S1 x;
    S2 y = x; // error: type mismatch
Structure types are also different from fundamental types, so
    S1 x;
    int i = x; // error: type mismatch
Every struct must have a unique definition in a program (§9.2.3).
5.8 Advice [ptr.advice]
[1] Avoid nontrivial pointer arithmetic; §5.3.
[2] Take care not to write beyond the bounds of an array; §5.3.1.
[3] Use 0 rather than NULL; §5.1.1.
[4] Use vector and valarray rather than built-in (C-style) arrays; §5.3.1.
[5] Use string rather than zero-terminated arrays of char; §5.3.
[6] Minimize use of plain reference arguments; §5.5.
[7] Avoid void* except in low-level code; §5.6.
[8] Avoid nontrivial literals (‘‘magic numbers’’) in code. Instead, define and use symbolic constants; §4.8, §5.4.
5.9 Exercises [ptr.exercises]
1. (∗1) Write declarations for the following: a pointer to a character, an array of 10 integers, a reference to an array of 10 integers, a pointer to an array of character strings, a pointer to a pointer
to a character, a constant integer, a pointer to a constant integer, and a constant pointer to an
integer. Initialize each one.
2. (∗1.5) What, on your system, are the restrictions on the pointer types char*, int*, and void*?
For example, may an int* have an odd value? Hint: alignment.
3. (∗1) Use typedef to define the types unsigned char, const unsigned char, pointer to integer,
pointer to pointer to char, pointer to arrays of char, array of 7 pointers to int, pointer to an array
of 7 pointers to int, and array of 8 arrays of 7 pointers to int.
4. (∗1) Write a function that swaps (exchanges the values of) two integers. Use int* as the argument type. Write another swap function using int& as the argument type.
5. (∗1.5) What is the size of the array str in the following example:
    char str[] = "a short string";
What is the length of the string "a short string"?
6. (∗1) Define functions f(char), g(char&), and h(const char&). Call them with the arguments
'a', 49, 3300, c, uc, and sc, where c is a char, uc is an unsigned char, and sc is a signed
char. Which calls are legal? Which calls cause the compiler to introduce a temporary variable?
7. (∗1.5) Define a table of the names of months of the year and the number of days in each month.
Write out that table. Do this twice; once using an array of char for the names and an array for
the number of days and once using an array of structures, with each structure holding the name
of a month and the number of days in it.
8. (∗2) Run some tests to see if your compiler really generates equivalent code for iteration using
pointers and iteration using indexing (§5.3.1). If different degrees of optimization can be
requested, see if and how that affects the quality of the generated code.
9. (∗1.5) Find an example where it would make sense to use a name in its own initializer.
10. (∗1) Define an array of strings in which the strings contain the names of the months. Print those
strings. Pass the array to a function that prints those strings.
11. (∗2) Read a sequence of words from input. Use Quit as a word that terminates the input. Print
the words in the order they were entered. Don’t print a word twice. Modify the program to sort
the words before printing them.
12. (∗2) Write a function that counts the number of occurrences of a pair of letters in a string and
another that does the same in a zero-terminated array of char (a C-style string). For example,
the pair "ab" appears twice in "xabaacbaxabb".
13. (∗1.5) Define a struct Date to keep track of dates. Provide functions that read Dates from
input, write Dates to output, and initialize a Date with a date.
6
Expressions and Statements
Premature optimization
is the root of all evil.
– D. Knuth
On the other hand,
we cannot ignore efficiency.
– Jon Bentley
Desk calculator example — input — command line arguments — expression summary
— logical and relational operators — increment and decrement — free store — explicit
type conversion — statement summary — declarations — selection statements — declarations in conditions — iteration statements — the infamous goto — comments and
indentation — advice — exercises.
6.1 A Desk Calculator [expr.calculator]
Statements and expressions are introduced by presenting a desk calculator program that provides
the four standard arithmetic operations as infix operators on floating-point numbers. The user can
also define variables. For example, given the input
    r = 2.5
    area = pi * r * r
(pi is predefined) the calculator program will write
    2.5
    19.635
where 2.5 is the result of the first line of input and 19.635 is the result of the second.
The calculator consists of four main parts: a parser, an input function, a symbol table, and a
driver. Actually, it is a miniature compiler in which the parser does the syntactic analysis, the input
function handles input and lexical analysis, the symbol table holds permanent information, and the
driver handles initialization, output, and errors. We could add many features to this calculator to
make it more useful (§6.6[20]), but the code is long enough as it is, and most features would just
add code without providing additional insight into the use of C++.
6.1.1 The Parser [expr.parser]
Here is a grammar for the language accepted by the calculator:
    program:
        END                           // END is end-of-input
        expr_list END

    expr_list:
        expression PRINT              // PRINT is semicolon
        expression PRINT expr_list

    expression:
        expression + term
        expression - term
        term

    term:
        term / primary
        term * primary
        primary

    primary:
        NUMBER
        NAME
        NAME = expression
        - primary
        ( expression )
In other words, a program is a sequence of expressions separated by semicolons. The basic units of
an expression are numbers, names, and the operators *, /, +, - (both unary and binary), and =.
Names need not be declared before use.
The style of syntax analysis used is usually called recursive descent; it is a popular and straightforward top-down technique. In a language such as C++, in which function calls are relatively
cheap, it is also efficient. For each production in the grammar, there is a function that calls other
functions. Terminal symbols (for example, END, NUMBER, +, and -) are recognized by the lexical analyzer, get_token(); and nonterminal symbols are recognized by the syntax analyzer functions, expr(), term(), and prim(). As soon as both operands of a (sub)expression are known, the
expression is evaluated; in a real compiler, code could be generated at this point.
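To show the shape of the technique in isolation, here is a self-contained sketch of a recursive-descent evaluator for numbers, the four arithmetic operators, unary minus, and parentheses. It is not the calculator developed in this chapter: it reads from a character string instead of using get_token(), it has no names or assignment, it does no error handling, and all of its identifiers (Parser, skip_spaces, parse_primary, parse_term, parse_expression, evaluate) are hypothetical. Each function corresponds to one production of the grammar.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>

// Parsing state: a pointer to the next unread character of the input.
struct Parser {
    const char* p;
};

double parse_expression(Parser& ps);   // needed by parse_primary (parenthesized expressions)

void skip_spaces(Parser& ps)
{
    while (std::isspace(static_cast<unsigned char>(*ps.p))) ++ps.p;
}

// primary: NUMBER | - primary | ( expression )
double parse_primary(Parser& ps)
{
    skip_spaces(ps);
    if (*ps.p == '(') {
        ++ps.p;                             // eat '('
        double v = parse_expression(ps);
        skip_spaces(ps);
        if (*ps.p == ')') ++ps.p;           // eat ')'; a missing ')' is silently ignored
        return v;
    }
    if (*ps.p == '-') {
        ++ps.p;                             // unary minus
        return -parse_primary(ps);
    }
    char* end = 0;
    double v = std::strtod(ps.p, &end);     // read a floating-point literal
    ps.p = end;
    return v;
}

// term: term * primary | term / primary | primary
double parse_term(Parser& ps)
{
    double left = parse_primary(ps);
    for (;;) {
        skip_spaces(ps);
        if (*ps.p == '*') { ++ps.p; left *= parse_primary(ps); }
        else if (*ps.p == '/') { ++ps.p; left /= parse_primary(ps); }
        else return left;
    }
}

// expression: expression + term | expression - term | term
double parse_expression(Parser& ps)
{
    double left = parse_term(ps);
    for (;;) {
        skip_spaces(ps);
        if (*ps.p == '+') { ++ps.p; left += parse_term(ps); }
        else if (*ps.p == '-') { ++ps.p; left -= parse_term(ps); }
        else return left;
    }
}

double evaluate(const char* text)
{
    assert(text != 0);
    Parser ps = { text };
    return parse_expression(ps);
}
```

The left-recursive productions of the grammar become loops, so each operand is combined with the result so far as soon as it is known; this is what makes the operators group from left to right.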
The parser uses a function get_token() to get input. The value of the most recent call of
get_token() can be found in the global variable curr_tok. The type of curr_tok is the enumeration Token_value:
    enum Token_value {
        NAME,         NUMBER,        END,
        PLUS='+',     MINUS='-',     MUL='*',     DIV='/',
        PRINT=';',    ASSIGN='=',    LP='(',      RP=')'
    };

    Token_value curr_tok = PRINT;
Representing each token by the integer value of its character is convenient and efficient and can be
a help to people using debuggers. This works as long as no character used as input has a value used
as an enumerator – and no character set I know of has a printing character with a single-digit integer value. I chose PRINT as the initial value for curr_tok because that is the value it will have
after the calculator has evaluated an expression and displayed its value. Thus, I ‘‘start the system’’
in a normal state to minimize the chance of errors and the need for special startup code.
Each parser function takes a bool (§4.2) argument indicating whether the function needs to call
get_token() to get the next token. Each parser function evaluates ‘‘its’’ expression and returns the
value. The function expr() handles addition and subtraction. It consists of a single loop that looks
for terms to add or subtract:
    double expr(bool get)        // add and subtract
    {
        double left = term(get);

        for (;;)                 // ‘‘forever’’
            switch (curr_tok) {
            case PLUS:
                left += term(true);
                break;
            case MINUS:
                left -= term(true);
                break;
            default:
                return left;
            }
    }
This function really does not do much itself. In a manner typical of higher-level functions in a
large program, it calls other functions to do the work.
The switch-statement tests the value of its condition, which is supplied in parentheses after the
switch keyword, against a set of constants. The break-statements are used to exit the switch-statement. The constants following the case labels must be distinct. If the value tested does not
match any case label, the default is chosen. The programmer need not provide a default.
Note that an expression such as 2-3+4 is evaluated as (2-3)+4, as specified in the grammar.
The curious notation for(;;) is the standard way to specify an infinite loop; you could pronounce it ‘‘forever.’’ It is a degenerate form of a for-statement (§6.3.3); while(true) is an alternative. The switch-statement is executed repeatedly until something different from + and - is found,
and then the return-statement in the default case is executed.
The operators += and -= are used to handle the addition and subtraction; left=left+term() and
left=left-term() could have been used without changing the meaning of the program. However,
left+=term() and left-=term() not only are shorter but also express the intended operation
directly. Each assignment operator is a separate lexical token, so a + = 1; is a syntax error because
of the space between the + and the =.
Assignment operators are provided for the binary operators

    +    -    *    /    %    &    |    ^    <<    >>

so that the following assignment operators are possible

    =    +=    -=    *=    /=    %=    &=    |=    ^=    <<=    >>=

The % is the modulo, or remainder, operator; &, |, and ^ are the bitwise logical operators AND,
OR, and exclusive OR; << and >> are the left shift and right shift operators; §6.2 summarizes the
operators and their meanings. For a binary operator @ applied to operands of built-in types, an
expression x@=y means x=x@y, except that x is evaluated once only.
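The ‘‘evaluated once only’’ rule matters when the left operand has a side effect. The following is a minimal sketch, not part of the calculator; the names calls, next_index(), compound(), and expanded() are hypothetical and exist only to count how often the operand expression runs:

```cpp
int calls = 0;              // illustrative counter, not part of the calculator

int next_index()            // hypothetical helper with a visible side effect
{
    return calls++;
}

// With +=, the subscript expression is evaluated exactly once.
int compound(int* a)
{
    calls = 0;
    a[next_index()] += 10;
    return calls;           // number of calls made to next_index()
}

// Written out by hand, the operand expression appears, and runs, twice.
// The array must have at least two elements; which element ends up
// modified depends on the order in which the two calls are evaluated.
int expanded(int* a)
{
    calls = 0;
    a[next_index()] = a[next_index()] + 10;
    return calls;
}
```

For an array of two ints, compound() reports one call and adds 10 to the first element, whereas expanded() reports two calls.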
Chapter 8 and Chapter 9 discuss how to organize a program as a set of modules. With one
exception, the declarations for this calculator example can be ordered so that everything is declared
exactly once and before it is used. The exception is expr(), which calls term(), which calls
prim(), which in turn calls expr(). This loop must be broken somehow. A declaration

    double expr(bool);

before the definition of prim() will do nicely.
Function term() handles multiplication and division in the same way expr() handles addition
and subtraction:

    double term(bool get)       // multiply and divide
    {
        double left = prim(get);

        for (;;)
            switch (curr_tok) {
            case MUL:
                left *= prim(true);
                break;
            case DIV:
                if (double d = prim(true)) {
                    left /= d;
                    break;
                }
                return error("divide by 0");
            default:
                return left;
            }
    }

The result of dividing by zero is undefined and usually disastrous. We therefore test for 0 before
dividing and call error() if we detect a zero divisor. The function error() is described in §6.1.4.
The variable d is introduced into the program exactly where it is needed and initialized immediately. The scope of a name introduced in a condition is the statement controlled by that condition,
and the resulting value is the value of the condition (§6.3.2.1). Consequently, the division and
assignment left/=d is done if and only if d is nonzero.
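The same idiom can be shown in isolation. This is a minimal sketch under the assumption that the caller wants a success flag rather than a call to error(); safe_divide() is a hypothetical helper, not one of the calculator's functions:

```cpp
// Hypothetical helper: divide only when the divisor is nonzero.
// Returns true and stores the quotient in result on success;
// leaves result untouched otherwise.
bool safe_divide(double left, double divisor, double& result)
{
    if (double d = divisor) {   // d is declared, initialized, and tested here
        result = left / d;      // reached only when d is nonzero
        return true;
    }
    // after the if-statement, d is no longer in scope
    return false;
}
```

A call such as safe_divide(6,3,r) stores 2 in r, and a call with a zero divisor returns false without changing r.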
The function prim() handling a primary is much like expr() and term(), except that because
we are getting lower in the call hierarchy a bit of real work is being done and no loop is necessary:

    double number_value;
    string string_value;

    double prim(bool get)       // handle primaries
    {
        if (get) get_token();

        switch (curr_tok) {
        case NUMBER:            // floating-point constant
        {
            double v = number_value;
            get_token();
            return v;
        }
        case NAME:
        {
            double& v = table[string_value];
            if (get_token() == ASSIGN) v = expr(true);
            return v;
        }
        case MINUS:             // unary minus
            return -prim(true);
        case LP:
        {
            double e = expr(true);
            if (curr_tok != RP) return error(") expected");
            get_token();        // eat ')'
            return e;
        }
        default:
            return error("primary expected");
        }
    }

When a NUMBER (that is, an integer or floating-point literal) is seen, its value is returned. The
input routine get_token() places the value in the global variable number_value. Use of a global
variable in a program often indicates that the structure is not quite clean – that some sort of optimization has been applied. So it is here. Ideally, a lexical token consists of two parts: a value specifying the kind of token (a Token_value in this program) and (when needed) the value of the token.
Here, there is only a single, simple variable, curr_tok, so the global variable number_value is
needed to hold the value of the last NUMBER read. Eliminating this spurious global variable is left
as an exercise (§6.6[21]). Saving the value of number_value in the local variable v before calling
get_token() is not really necessary. For every legal input, the calculator always uses one number
in the computation before reading another from input. However, saving the value and displaying it
correctly after an error helps the user.
In the same way that the value of the last NUMBER is kept in number_value, the character
string representation of the last NAME seen is kept in string_value. Before doing anything to a
name, the calculator must first look ahead to see if it is being assigned to or simply read. In both
cases, the symbol table is consulted. The symbol table is a map (§3.7.4, §17.4.1):

    map<string,double> table;

That is, when table is indexed by a string, the resulting value is the double corresponding to the
string. For example, if the user enters

    radius = 6378.388;

the calculator will execute

    double& v = table["radius"];
    // ... expr() calculates the value to be assigned ...
    v = 6378.388;

The reference v is used to hold on to the double associated with radius while expr() calculates the
value 6378.388 from the input characters.
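The effect of holding a reference into the map can be sketched on its own. Here assign() is a hypothetical helper that stands in for the NAME case of prim(); it relies on the standard behavior that indexing a map with a key that is not yet present inserts that key with a default (zero) value:

```cpp
#include <map>
#include <string>

std::map<std::string, double> table;    // symbol table, as in the text

// Hypothetical helper: bind name to value through a reference,
// the way prim() does for an input of the form NAME = expression.
double assign(const std::string& name, double value)
{
    double& v = table[name];    // inserts name with value 0.0 if absent
    v = value;                  // writes through the reference into the map
    return v;
}
```

After assign("radius",6378.388), looking up table["radius"] yields 6378.388; a name that has merely been read, never assigned, yields 0.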
6.1.2 The Input Function [expr.input]
Reading input is often the messiest part of a program. Because a program must communicate with a person, it must cope with that person’s whims, conventions, and seemingly random
errors. Trying to force the person to behave in a manner more suitable for the machine is often
(rightly) considered offensive. The task of a low-level input routine is to read characters and compose higher-level tokens from them. These tokens are then the units of input for higher-level routines. Here, low-level input is done by get_token(). Writing a low-level input routine need not be
an everyday task. Many systems provide standard functions for this.
I build get_token() in two stages. First, I provide a deceptively simple version that imposes a
burden on the user. Next, I modify it into a slightly less elegant, but much easier to use, version.
The idea is to read a character, use that character to decide what kind of token needs to be composed, and then return the Token_value representing the token read.
The initial statements read the first non-whitespace character into ch and check that the read
operation succeeded:

    Token_value get_token()
    {
        char ch = 0;
        cin>>ch;

        switch (ch) {
        case 0:
            return curr_tok=END;    // assign and return

By default, operator >> skips whitespace (that is, spaces, tabs, newlines, etc.) and leaves the value
of ch unchanged if the input operation failed. Consequently, ch==0 indicates end of input.
Assignment is an operator, and the result of the assignment is the value of the variable assigned
to. This allows me to assign the value END to curr_tok and return it in the same statement. Having a single statement rather than two is useful in maintenance. If the assignment and the return
became separated in the code, a programmer might update the one and forget to update the other.
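The end-of-input convention can be tried without a terminal by reading from a string. This is a minimal sketch; first_char() is a hypothetical helper, and the use of an istringstream in place of cin is an assumption made only so that the example is self-contained:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: return the first non-whitespace character of text,
// or 0 if there is none, mirroring the opening statements of get_token().
char first_char(const std::string& text)
{
    std::istringstream in(text);
    char ch = 0;
    in >> ch;       // skips leading whitespace; leaves ch alone on failure
    return ch;
}
```

For input consisting only of whitespace, or for empty input, the read fails and the initial value 0 is returned.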
Let us look at some of the cases separately before considering the complete function. The
expression terminator ';', the parentheses, and the operators are handled simply by returning their
values:

    case ';':
    case '*':
    case '/':
    case '+':
    case '-':
    case '(':
    case ')':
    case '=':
        return curr_tok=Token_value(ch);

Numbers are handled like this:

    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
    case '.':
        cin.putback(ch);
        cin >> number_value;
        return curr_tok=NUMBER;

Stacking case labels horizontally rather than vertically is generally not a good idea because this
arrangement is harder to read. However, having one line for each digit is tedious. Because operator >> is already defined for reading floating-point constants into a double, the code is trivial. First
the initial character (a digit or a dot) is put back into cin. Then the constant can be read into
number_value.
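The put-back-and-reread step can be sketched separately from the calculator. The helpers read_number() and parse_leading_number() are hypothetical, and an istringstream stands in for cin so that the behavior can be checked on fixed input:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read a number whose first character has already
// been consumed, as get_token() does in its digit cases.
double read_number(std::istream& in, char first)
{
    in.putback(first);      // return the digit or dot to the stream
    double value = 0;
    in >> value;            // operator >> reads the whole constant
    return value;
}

// Convenience wrapper for trying the helper on a string.
double parse_leading_number(const std::string& text)
{
    std::istringstream in(text);
    char ch = 0;
    in >> ch;               // consume the first non-whitespace character
    return read_number(in, ch);
}
```

Given "150/1.1934", the sketch yields 150 and leaves the / in the stream for the next read.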
A name is handled similarly:

    default:                    // NAME, NAME =, or error
        if (isalpha(ch)) {
            cin.putback(ch);
            cin>>string_value;
            return curr_tok=NAME;
        }
        error("bad token");
        return curr_tok=PRINT;

The standard library function isalpha() (§20.4.2) is used to avoid listing every character as a separate case label. Operator >> applied to a string (in this case, string_value) reads until it hits whitespace. Consequently, a user must terminate a name by a space before an operator using the name as
an operand. This is less than ideal, so we will return to this problem in §6.1.3.
Here, finally, is the complete input function:

    Token_value get_token()
    {
        char ch = 0;
        cin>>ch;

        switch (ch) {
        case 0:
            return curr_tok=END;
        case ';':
        case '*':
        case '/':
        case '+':
        case '-':
        case '(':
        case ')':
        case '=':
            return curr_tok=Token_value(ch);
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
        case '.':
            cin.putback(ch);
            cin >> number_value;
            return curr_tok=NUMBER;
        default:                    // NAME, NAME =, or error
            if (isalpha(ch)) {
                cin.putback(ch);
                cin>>string_value;
                return curr_tok=NAME;
            }
            error("bad token");
            return curr_tok=PRINT;
        }
    }

The conversion of an operator to its token value is trivial because the Token_value of an operator
was defined as the integer value of the operator (§4.8).
6.1.3 Low-level Input [expr.low]
Using the calculator as defined so far reveals a few inconveniences. It is tedious to remember to
add a semicolon after an expression in order to get its value printed, and having a name terminated
by whitespace only is a real nuisance. For example, x=7 is an identifier – rather than the identifier
x followed by the operator = and the number 7. Both problems are solved by replacing the type-oriented default input operations in get_token() with code that reads individual characters.
First, we’ll make a newline equivalent to the semicolon used to mark the end of expression:

    Token_value get_token()
    {
        char ch;

        do {    // skip whitespace except '\n'
            if(!cin.get(ch)) return curr_tok = END;
        } while (ch!='\n' && isspace(ch));

        switch (ch) {
        case ';':
        case '\n':
            return curr_tok=PRINT;

A do-statement is used; it is equivalent to a while-statement except that the controlled statement is
always executed at least once. The call cin.get(ch) reads a single character from the standard
input stream into ch. By default, get() does not skip whitespace the way operator >> does. The
test if (!cin.get(ch)) fails if no character can be read from cin; in this case, END is returned to
terminate the calculator session. The operator ! (NOT) is used because get() returns true in case
of success.
The standard library function isspace() provides the standard test for whitespace (§20.4.2);
isspace(c) returns a nonzero value if c is a whitespace character and zero otherwise. The test is
implemented as a table lookup, so using isspace() is much faster than testing for the individual
whitespace characters. Similar functions test if a character is a digit – isdigit() – a letter – isalpha() – or a digit or letter – isalnum().
After whitespace has been skipped, the next character is used to determine what kind of lexical
token is coming.
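The whitespace-skipping loop can be exercised on its own. In this minimal sketch, next_significant() and next_significant_in() are hypothetical helpers, an istringstream replaces cin, and 0 is returned at end of input where get_token() would return END. The cast to unsigned char is an addition not found in the calculator; it keeps isspace() well defined for characters with negative values:

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: return the next character that is either '\n'
// or not whitespace; return 0 when the input is exhausted.
char next_significant(std::istream& in)
{
    char ch = 0;
    do {    // skip whitespace except '\n'
        if (!in.get(ch)) return 0;
    } while (ch != '\n' && std::isspace(static_cast<unsigned char>(ch)));
    return ch;
}

// Convenience wrapper for trying the helper on a string.
char next_significant_in(const std::string& text)
{
    std::istringstream in(text);
    return next_significant(in);
}
```

Spaces and tabs are skipped, but a newline is reported to the caller, which is what lets a newline act as an expression terminator.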
The problem caused by >> reading into a string until whitespace is encountered is solved by
reading one character at a time until a character that is not a letter or a digit is found:

    default:                    // NAME, NAME=, or error
        if (isalpha(ch)) {
            string_value = ch;
            while (cin.get(ch) && isalnum(ch)) string_value.push_back(ch);
            cin.putback(ch);
            return curr_tok=NAME;
        }
        error("bad token");
        return curr_tok=PRINT;

Fortunately, these two improvements could both be implemented by modifying a single local section of code. Constructing programs so that improvements can be implemented through local modifications only is an important design aim.
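The character-by-character reading of a name can be sketched in isolation. The helpers read_name() and first_name() are hypothetical; read_name() assumes that the caller has already established that the next character is a letter, and, unlike the calculator code, it puts a character back only if one was actually read:

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: collect letters and digits until some other
// character (or the end of input) is found.
std::string read_name(std::istream& in)
{
    std::string name;
    char ch = 0;
    while (in.get(ch) && std::isalnum(static_cast<unsigned char>(ch)))
        name.push_back(ch);
    if (in) in.putback(ch);     // return the terminating character, if any
    return name;
}

// Convenience wrapper for trying the helper on a string.
std::string first_name(const std::string& text)
{
    std::istringstream in(text);
    return read_name(in);
}
```

With this approach, the input x=7 yields the name x, and the = remains available as the next token.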
6.1.4 Error Handling [expr.error]
Because the program is so simple, error handling is not a major concern. The error function simply
counts the errors, writes out an error message, and returns:

    int no_of_errors;

    double error(const string& s)
    {
        no_of_errors++;
        cerr << "error: " << s << '\n';
        return 1;
    }

The stream cerr is an unbuffered output stream usually used to report errors (§21.2.1).
The reason for returning a value is that errors typically occur in the middle of the evaluation of
an expression, so we should either abort that evaluation entirely or return a value that is unlikely to
cause subsequent errors. The latter is adequate for this simple calculator. Had get_token() kept
track of the line numbers, error() could have informed the user approximately where the error
occurred. This would be useful when the calculator is used noninteractively (§6.6[19]).
Often, a program must be terminated after an error has occurred because no sensible way of
continuing has been devised. This can be done by calling exit(), which first cleans up things like
output streams and then terminates the program with its argument as the return value (§9.4.1.1).
More stylized error-handling mechanisms can be implemented using exceptions (see §8.3,
Chapter 14), but what we have here is quite suitable for a 150-line calculator.
6.1.5 The Driver [expr.driver]
With all the pieces of the program in place, we need only a driver to start things. In this simple
example, main() can do that:

    int main()
    {
        table["pi"] = 3.1415926535897932385;    // insert predefined names
        table["e"] = 2.7182818284590452354;

        while (cin) {
            get_token();
            if (curr_tok == END) break;
            if (curr_tok == PRINT) continue;
            cout << expr(false) << '\n';
        }

        return no_of_errors;
    }

Conventionally, main() should return zero if the program terminates normally and nonzero otherwise (§3.2). Returning the number of errors accomplishes this nicely. As it happens, the only
initialization needed is to insert the predefined names into the symbol table.
The primary task of the main loop is to read expressions and write out the answer. This is
achieved by the line:

    cout << expr(false) << '\n';

The argument false tells expr() that it does not need to call get_token() to get a current token on
which to work.
Testing cin each time around the loop ensures that the program terminates if something goes
wrong with the input stream, and testing for END ensures that the loop is correctly exited when
get_token() encounters end-of-file. A break-statement exits its nearest enclosing switch-statement
or loop (that is, a for-statement, while-statement, or do-statement). Testing for PRINT (that is, for
'\n' and ';') relieves expr() of the responsibility for handling empty expressions. A continue-statement is equivalent to going to the very end of a loop, so in this case

    while (cin) {
        // ...
        if (curr_tok == PRINT) continue;
        cout << expr(false) << '\n';
    }

is equivalent to

    while (cin) {
        // ...
        if (curr_tok != PRINT)
            cout << expr(false) << '\n';
    }

6.1.6 Headers [expr.headers]
The calculator uses standard library facilities. Therefore, appropriate headers must be #included to
complete the program:

    #include<iostream>  // I/O
    #include<string>    // strings
    #include<map>       // map
    #include<cctype>    // isalpha(), etc.

All of these headers provide facilities in the std namespace, so to use the names they provide we
must either use explicit qualification with std:: or bring the names into the global namespace by

    using namespace std;

To avoid confusing the discussion of expressions with modularity issues, I did the latter. Chapter 8
and Chapter 9 discuss ways of organizing this calculator into modules using namespaces and how
to organize it into source files. On many systems, standard headers have equivalents with a .h suffix that declare the classes, functions, etc., and place them in the global namespace (§9.2.1, §9.2.4,
§B.3.1).
6.1.7 Command-Line Arguments [expr.command]
After the program was written and tested, I found it a bother to first start the program, then type the
expressions, and finally quit. My most common use was to evaluate a single expression. If that
expression could be presented as a command-line argument, a few keystrokes could be avoided.
A program starts by calling main() (§3.2, §9.4). When this is done, main() is given two
arguments specifying the number of arguments, usually called argc, and an array of arguments,
usually called argv. The arguments are character strings, so the type of argv is char*[argc+1].
The name of the program (as it occurs on the command line) is passed as argv[0], so argc is
always at least 1. The list of arguments is zero-terminated; that is, argv[argc]==0. For example,
for the command

    dc 150/1.1934

the arguments have these values:

    argc:   2
    argv:   argv[0] -> "dc"
            argv[1] -> "150/1.1934"
            argv[2] == 0

Because the conventions for calling main() are shared with C, C-style arrays and strings are used.
It is not difficult to get hold of a command-line argument. The problem is how to use it with
minimal reprogramming. The idea is to read from the command string in the same way that we
read from the input stream. A stream that reads from a string is unsurprisingly called an
istringstream. Unfortunately, there is no elegant way of making cin refer to an istringstream.
Therefore, we must find a way of getting the calculator input functions to refer to an istringstream.
Furthermore, we must find a way of getting the calculator input functions to refer to an
istringstream or to cin depending on what kind of command-line argument we supply.
A simple solution is to introduce a global pointer input that points to the input stream to be used
and have every input routine use that:

    istream* input;     // pointer to input stream

    int main(int argc, char* argv[])
    {
        switch (argc) {
        case 1:                                 // read from standard input
            input = &cin;
            break;
        case 2:                                 // read argument string
            input = new istringstream(argv[1]);
            break;
        default:
            error("too many arguments");
            return 1;
        }

        table["pi"] = 3.1415926535897932385;    // insert predefined names
        table["e"] = 2.7182818284590452354;

        while (*input) {
            get_token();
            if (curr_tok == END) break;
            if (curr_tok == PRINT) continue;
            cout << expr(false) << '\n';
        }

        if (input != &cin) delete input;
        return no_of_errors;
    }

An istringstream is a kind of istream that reads from its character string argument (§21.5.3).
Upon reaching the end of its string, an istringstream fails exactly like other streams do when they
hit the end of input (§3.6, §21.3.3). To use an istringstream, you must include <sstream>.
It would be easy to modify main() to accept several command-line arguments, but this does
not appear to be necessary, especially as several expressions can be passed as a single argument:

    dc "rate=1.1934;150/rate;19.75/rate;217/rate"

I use quotes because ; is the command separator on my UNIX systems. Other systems have different conventions for supplying arguments to a program on startup.
It was inelegant to modify all of the input routines to use *input rather than cin to gain the flexibility to use alternative sources of input. The change could have been avoided had I shown foresight by introducing something like input from the start. A more general and useful view is to note
that the source of input really should be the parameter of a calculator module. That is, the fundamental problem with this calculator example is that what I refer to as ‘‘the calculator’’ is only a collection of functions and data. There is no module (§2.4) or object (§2.5.2) that explicitly represents
the calculator. Had I set out to design a calculator module or a calculator type, I would naturally
have considered what its parameters should be (§8.5[3], §10.6[16]).
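The idea of making the source of input a parameter can be illustrated on a much smaller scale than the calculator. In this sketch, sum_numbers() is a hypothetical stand-in for a calculator routine; because it takes its stream as an argument, the same code reads from cin, from a file, or from a string without modification:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a calculator routine: add up the numbers
// found in whatever input stream the caller supplies.
double sum_numbers(std::istream& input)
{
    double total = 0;
    double value = 0;
    while (input >> value) total += value;
    return total;
}

// The same routine applied to a string, as for a command-line argument.
double sum_string(const std::string& text)
{
    std::istringstream in(text);
    return sum_numbers(in);
}
```

A caller that wants standard input simply writes sum_numbers(std::cin); no global pointer is needed.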
6.1.8 A Note on Style [expr.style]
To programmers unacquainted with associative arrays, the use of the standard library map as the
symbol table seems almost like cheating. It is not. The standard library and other libraries are
meant to be used. Often, a library has received more care in its design and implementation than a
programmer could afford for a handcrafted piece of code to be used in just one program.
Looking at the code for the calculator, especially at the first version, we can see that there isn’t
much traditional C-style, low-level code presented. Many of the traditional tricky details have been
replaced by uses of standard library classes such as ostream, string, and map (§3.4, §3.5, §3.7.4,
Chapter 17).
Note the relative scarcity of arithmetic, loops, and even assignments. This is the way things
ought to be in code that doesn’t manipulate hardware directly or implement low-level abstractions.
6.2 Operator Summary [expr.operators]
This section presents a summary of expressions and some examples. Each operator is followed by
one or more names commonly used for it and an example of its use. In these tables, a class_name
is the name of a class, a member is a member name, an object is an expression yielding a class
object, a pointer is an expression yielding a pointer, an expr is an expression, and an lvalue is an
expression denoting a nonconstant object. A type can be a fully general type name (with *, (),
etc.) only when it appears in parentheses; elsewhere, there are restrictions (§A.5).
The syntax of expressions is independent of operand types. The meanings presented here apply
when the operands are of built-in types (§4.1.1). In addition, you can define meanings for operators
applied to operands of user-defined types (§2.5.2, Chapter 11).
_______________________________________________________________________
Operator Summary
_______________________________________________________________________
scope resolution                    class_name :: member
scope resolution                    namespace_name :: member
global                              :: name
global                              :: qualified-name
_______________________________________________________________________
member selection                    object . member
member selection                    pointer -> member
subscripting                        pointer [ expr ]
function call                       expr ( expr_list )
value construction                  type ( expr_list )
post increment                      lvalue ++
post decrement                      lvalue --
type identification                 typeid ( type )
run-time type identification        typeid ( expr )
run-time checked conversion         dynamic_cast < type > ( expr )
compile-time checked conversion     static_cast < type > ( expr )
unchecked conversion                reinterpret_cast < type > ( expr )
const conversion                    const_cast < type > ( expr )
_______________________________________________________________________
size of object                      sizeof expr
size of type                        sizeof ( type )
pre increment                       ++ lvalue
pre decrement                       -- lvalue
complement                          ~ expr
not                                 ! expr
unary minus                         - expr
unary plus                          + expr
address of                          & lvalue
dereference                         * expr
create (allocate)                   new type
create (allocate and initialize)    new type ( expr-list )
create (place)                      new ( expr-list ) type
create (place and initialize)       new ( expr-list ) type ( expr-list )
destroy (de-allocate)               delete pointer
destroy array                       delete[] pointer
cast (type conversion)              ( type ) expr
_______________________________________________________________________
member selection                    object .* pointer-to-member
member selection                    pointer ->* pointer-to-member
_______________________________________________________________________
multiply                            expr * expr
divide                              expr / expr
modulo (remainder)                  expr % expr
_______________________________________________________________________
_______________________________________________________________________
Operator Summary (continued)
_______________________________________________________________________
add (plus)                          expr + expr
subtract (minus)                    expr - expr
_______________________________________________________________________
shift left                          expr << expr
shift right                         expr >> expr
_______________________________________________________________________
less than                           expr < expr
less than or equal                  expr <= expr
greater than                        expr > expr
greater than or equal               expr >= expr
_______________________________________________________________________
equal                               expr == expr
not equal                           expr != expr
_______________________________________________________________________
bitwise AND                         expr & expr
_______________________________________________________________________
bitwise exclusive OR                expr ^ expr
_______________________________________________________________________
bitwise inclusive OR                expr | expr
_______________________________________________________________________
logical AND                         expr && expr
_______________________________________________________________________
logical inclusive OR                expr || expr
_______________________________________________________________________
simple assignment                   lvalue = expr
multiply and assign                 lvalue *= expr
divide and assign                   lvalue /= expr
modulo and assign                   lvalue %= expr
add and assign                      lvalue += expr
subtract and assign                 lvalue -= expr
shift left and assign               lvalue <<= expr
shift right and assign              lvalue >>= expr
AND and assign                      lvalue &= expr
inclusive OR and assign             lvalue |= expr
exclusive OR and assign             lvalue ^= expr
_______________________________________________________________________
conditional expression              expr ? expr : expr
_______________________________________________________________________
throw exception                     throw expr
_______________________________________________________________________
comma (sequencing)                  expr , expr
_______________________________________________________________________
Each box holds operators with the same precedence. Operators in higher boxes have higher precedence than operators in lower boxes. For example: a+b*c means a+(b*c) rather than (a+b)*c
because * has higher precedence than +.
Unary operators and assignment operators are right-associative; all others are left-associative.
For example, a=b=c means a=(b=c), a+b+c means (a+b)+c, and *p++ means *(p++), not
(*p)++.
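These groupings can be checked with a few small functions. This is a minimal sketch; the function names are hypothetical and each function simply evaluates one of the expressions discussed above:

```cpp
// a+b*c groups as a+(b*c) because * binds tighter than +.
int mul_before_add(int a, int b, int c)
{
    return a + b * c;
}

// a=b=c groups as a=(b=c), so both a and b receive the value of c.
int chained_assign(int c)
{
    int a = 0;
    int b = 0;
    a = b = c;
    return a + b;
}

// *p++ groups as *(p++): the pointer is advanced, not the object it
// points to, and the value yielded is the one p pointed to beforehand.
int deref_then_advance(const int*& p)
{
    return *p++;
}
```

For example, mul_before_add(1,2,3) is 7 rather than 9, and after deref_then_advance(p) the pointer p refers to the next element while the element it left behind is unchanged.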
A few grammar rules cannot be expressed in terms of precedence (also known as binding
strength) and associativity. For example, a=b<c?d=e:f=g means a=((b<c)?(d=e):(f=g)),
but you need to look at the grammar (§A.5) to determine that.
6.2.1 Results [expr.res]
The result types of arithmetic operators are determined by a set of rules known as ‘‘the usual arithmetic conversions’’ (§C.6.3). The overall aim is to produce a result of the ‘‘largest’’ operand type. For example, if a binary operator has a floating-point operand, the computation is done using floating-point arithmetic and the result is a floating-point value. If it has a long operand, the computation is done using long integer arithmetic, and the result is a long. Operands that are smaller than an int (such as bool and char) are converted to int before the operator is applied.
The relational operators, ==, <=, etc., produce Boolean results. The meaning and result type of user-defined operators are determined by their declarations (§11.2).
Where logically feasible, the result of an operator that takes an lvalue operand is an lvalue denoting that lvalue operand. For example:
    void f(int x, int y)
    {
        int j = x = y;          // the value of x=y is the value of x after the assignment
        int* p = &++x;          // p points to x
        int* q = &(x++);        // error: x++ is not an lvalue (it is not the value stored in x)
        int* pp = &(x>y?x:y);   // address of the int with the larger value
    }
If both the second and third operands of ?: are lvalues and have the same type, the result is of that type and is an lvalue. Preserving lvalues in this way allows greater flexibility in using operators. This is particularly useful when writing code that needs to work uniformly and efficiently with both built-in and user-defined types (e.g., when writing templates or programs that generate C++ code).
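As a small illustration of lvalue results, the following sketch assigns through the result of ?: and takes the address of the result of a prefix increment. The function names are hypothetical.

```cpp
#include <cassert>

// Both operands of ?: are int lvalues, so the result is an lvalue
// and can appear on the left-hand side of an assignment.
void zero_larger(int& x, int& y)
{
    (x > y ? x : y) = 0;
}

// ++x is an lvalue denoting x, so its address is the address of x.
int* address_after_increment(int& x)
{
    int* p = &++x;
    assert(p == &x);
    return p;
}
```

Writing (x>y?x:y)=0 avoids an if-statement with two nearly identical assignments; whether that is clearer is a matter of taste.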
The result of sizeof is of an unsigned integral type called size_t defined in <cstddef>. The result of pointer subtraction is of a signed integral type called ptrdiff_t defined in <cstddef>.
Implementations do not have to check for arithmetic overflow and hardly any do. For example:
    void f()
    {
        int i = 1;
        while (0 < i) i++;
        cout << "i has become negative!" << i << '\n';
    }
This will (eventually) try to increase i past the largest integer. What happens then is undefined, but typically the value ‘‘wraps around’’ to a negative number (on my machine -2147483648). Similarly, the effect of dividing by zero is undefined, but doing so usually causes abrupt termination of the program. In particular, underflow, overflow, and division by zero do not throw standard exceptions (§14.10).
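Because the language does not check, a program that cares must test before it computes. The following is a minimal sketch using std::numeric_limits from <limits>; the function names are assumptions made for this example.

```cpp
#include <cassert>
#include <limits>

// Increment with the precondition stated explicitly: i must not
// already be the largest int.
int increment(int i)
{
    assert(i < std::numeric_limits<int>::max());
    return i + 1;
}

// Division that reports failure instead of executing an undefined
// operation. The quotient is stored in result only on success.
bool checked_divide(int num, int den, int& result)
{
    if (den == 0) return false;     // division by zero
    if (num == std::numeric_limits<int>::min() && den == -1)
        return false;               // quotient would not fit in an int
    result = num / den;
    return true;
}
```

The tests run before the arithmetic is attempted; testing afterwards is too late, because by then the undefined operation has already been executed.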
6.2.2 Evaluation Order [expr.evaluation]
The order of evaluation of subexpressions within an expression is undefined. In particular, you
cannot assume that the expression is evaluated left to right. For example:
    int x = f(2)+g(3);  // undefined whether f() or g() is called first
Better code can be generated in the absence of restrictions on expression evaluation order. However, the absence of restrictions on evaluation order can lead to undefined results. For example,
    int i = 1;
    v[i] = i++;     // undefined result
may be evaluated as either v[1]=1 or v[2]=1 or may cause some even stranger behavior. Compilers can warn about such ambiguities. Unfortunately, most do not.
The operators , (comma), && (logical and), and || (logical or) guarantee that their left-hand operand is evaluated before their right-hand operand. For example, b=(a=2,a+1) assigns 3 to b. Examples of the use of || and && can be found in §6.2.3. For built-in types, the second operand of && is evaluated only if its first operand is true, and the second operand of || is evaluated only if its first operand is false; this is sometimes called short-circuit evaluation. Note that the sequencing operator , (comma) is logically different from the comma used to separate arguments in a function call. Consider:
    f1(v[i],i++);       // two arguments
    f2( (v[i],i++) );   // one argument
The call of f1 has two arguments, v[i] and i++, and the order of evaluation of the argument expressions is undefined. Order dependence of argument expressions is very poor style and has undefined behavior. The call of f2 has one argument, the comma expression (v[i],i++), which is equivalent to i++.
Parentheses can be used to force grouping. For example, a*b/c means (a*b)/c so parentheses must be used to get a*(b/c); a*(b/c) may be evaluated as (a*b)/c only if the user cannot tell the difference. In particular, for many floating-point computations a*(b/c) and (a*b)/c are significantly different, so a compiler will evaluate such expressions exactly as written.
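The ordering guarantees of , && and || and the effect of grouping can be observed directly. In this sketch, the counter calls and all function names are invented for the example; integer division is used so that the two groupings give visibly different results.

```cpp
#include <cassert>

// Comma operator: the left operand is evaluated first and the value
// of the whole expression is the value of the right operand.
int comma_value()
{
    int a = 0;
    int b = (a = 2, a + 1);
    return b;
}

int calls = 0;      // counts calls of counted()

bool counted(bool value)
{
    ++calls;
    return value;
}

// Short-circuit evaluation: in each expression only the left-hand
// operand is evaluated, so counted() is called twice, not four times.
int short_circuit_calls()
{
    calls = 0;
    bool r1 = counted(false) && counted(true);
    bool r2 = counted(true) || counted(false);
    (void)r1;
    (void)r2;
    return calls;
}

// a*b/c means (a*b)/c; parentheses are needed to get a*(b/c).
int grouped_left(int a, int b, int c)
{
    assert(c != 0);
    return a * b / c;
}

int grouped_right(int a, int b, int c)
{
    assert(c != 0);
    return a * (b / c);
}
```

With a==2, b==3, and c==2, the first grouping gives 6/2 and the second gives 2*1, so here the user can most certainly tell the difference.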
6.2.3 Operator Precedence [expr.precedence]
Precedence levels and associativity rules reflect the most common usage. For example,
    if (i<=0 || max<i) // ...
means ‘‘if i is less than or equal to 0 or if max is less than i.’’ That is, it is equivalent to
    if ( (i<=0) || (max<i) ) // ...
and not the legal but nonsensical
    if (i <= (0||max) < i) // ...
However, parentheses should be used whenever a programmer is in doubt about those rules. Use of parentheses becomes more common as the subexpressions become more complicated, but complicated subexpressions are a source of errors. Therefore, if you start feeling the need for parentheses, you might consider breaking up the expression by using an extra variable.
There are cases when the operator precedence does not result in the ‘‘obvious’’ interpretation. For example:
    if (i&mask == 0)    // oops! == expression as operand for &
This does not apply a mask to i and then test if the result is zero. Because == has higher precedence than &, the expression is interpreted as i&(mask==0). Fortunately, it is easy enough for a compiler to warn about most such mistakes. In this case, parentheses are important:
    if ((i&mask) == 0) // ...
It is worth noting that the following does not work the way a mathematician might expect:
    if (0 <= x <= 99) // ...
This is legal, but it is interpreted as (0<=x)<=99, where the result of the first comparison is either true or false. This Boolean value is then implicitly converted to 1 or 0, which is then compared to 99, yielding true. To test whether x is in the range 0..99, we might use:
    if (0<=x && x<=99) // ...
A common mistake for novices is to use = (assignment) instead of == (equals) in a condition:
    if (a = 7) // oops! constant assignment in condition
This is natural because = means ‘‘equals’’ in many languages. Again, it is easy for a compiler to warn about most such mistakes – and many do.
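To make the two pitfalls concrete, here is a sketch that spells out what the compiler actually does with i&mask==0 and with 0<=x<=99, next to the versions that were presumably intended. The function names are hypothetical.

```cpp
#include <cassert>

// What i&mask==0 really means: i & (mask==0).
bool mask_clear_wrong(int i, int mask) { return i & (mask == 0); }

// What was intended: apply the mask, then compare with zero.
bool mask_clear(int i, int mask) { return (i & mask) == 0; }

// What 0<=x<=99 really means, step by step.
bool in_range_wrong(int x)
{
    bool first = (0 <= x);      // true or false
    int as_int = first;         // implicitly converted to 1 or 0
    assert(as_int == 0 || as_int == 1);
    return as_int <= 99;        // always true
}

// What was intended.
bool in_range(int x) { return 0 <= x && x <= 99; }
```

For i==4 and mask==3 no masked bit is set, so mask_clear() returns true, yet mask_clear_wrong() returns false; in_range_wrong() returns true for every x, including 500 and -1.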
6.2.4 Bitwise Logical Operators [expr.logical]
The bitwise logical operators &, |, ^, ~, >>, and << are applied to objects of integer types – that is, bool, char, short, int, long, and their unsigned counterparts. The results are also integers.
A typical use of bitwise logical operators is to implement the notion of a small set (a bit vector). In this case, each bit of an unsigned integer represents one member of the set, and the number of bits limits the number of members. The binary operator & is interpreted as intersection, | as union, ^ as symmetric difference, and ~ as complement. An enumeration can be used to name the members of such a set. Here is a small example borrowed from an implementation of ostream:
    enum ios_base::iostate {
        goodbit=0, eofbit=1, failbit=2, badbit=4
    };
The implementation of a stream can set and test its state like this:
    state = goodbit;
    // ...
    if (state&(badbit|failbit)) // stream no good
The extra parentheses are necessary because & has higher precedence than |.
A function that reaches the end of input might report it like this:
    state |= eofbit;
The |= operator is used to add to the state. A simple assignment, state=eofbit, would have cleared all other bits.
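The set operations can be sketched as ordinary functions on an unsigned value. The enumeration State_bit below merely imitates the stream-state example; it and the function names are assumptions for illustration, not the library's own declarations.

```cpp
#include <cassert>

enum State_bit { goodbit = 0, eofbit = 1, failbit = 2, badbit = 4 };

// Union: add a member to the set without disturbing the others.
unsigned add_flag(unsigned state, State_bit b)
{
    assert((b & (b - 1)) == 0);     // at most one bit set
    return state | b;
}

// Intersection with the complement: remove one member.
unsigned remove_flag(unsigned state, State_bit b)
{
    return state & ~static_cast<unsigned>(b);
}

// Test against several members at once.
bool no_good(unsigned state)
{
    return (state & (badbit | failbit)) != 0;
}

// Symmetric difference: the members found in exactly one of the sets.
unsigned difference(unsigned a, unsigned b) { return a ^ b; }
```

Removing a member takes & with a complemented mask; using ^ for that purpose would add the member if it were not already present.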
These stream state flags are observable from outside the stream implementation. For example, we could see how the states of two streams differ like this:
    int diff = cin.rdstate()^cout.rdstate();    // rdstate() returns the state
Computing differences of stream states is not very common. For other similar types, computing differences is essential. For example, consider comparing a bit vector that represents the set of interrupts being handled with another that represents the set of interrupts waiting to be handled.
Please note that this bit fiddling is taken from the implementation of iostreams rather than from the user interface. Convenient bit manipulation can be very important, but for reliability, maintainability, portability, etc., it should be kept at low levels of a system. For more general notions of a set, see the standard library set (§17.4.3), bitset (§17.5.3), and vector<bool> (§16.3.11).
Using fields (§C.8.1) is really a convenient shorthand for shifting and masking to extract bit fields from a word. This can, of course, also be done using the bitwise logical operators. For example, one could extract the middle 16 bits of a 32-bit long like this:
    unsigned short middle(long a) { return (a>>8)&0xffff; }
Do not confuse the bitwise logical operators with the logical operators: &&, ||, and ! . The latter return either true or false, and they are primarily useful for writing the test in an if, while, or for statement (§6.3.2, §6.3.3). For example, !0 (not zero) is the value true, whereas ~0 (complement of zero) is the bit pattern all-ones, which in two’s complement representation is the value -1.
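A sketch of the shift-and-mask extraction and of the difference between ! and ~ follows. This variant of middle() takes an unsigned long rather than a long so that no negative value is ever shifted; that change, and the helper names, are choices made for this example only.

```cpp
#include <cassert>

// Bits 8..23 of a value treated as a 32-bit quantity.
unsigned short middle(unsigned long a)
{
    assert(a <= 0xffffffffUL);
    return static_cast<unsigned short>((a >> 8) & 0xffff);
}

// Logical not yields a bool.
bool logical_not_zero() { return !0; }

// Bitwise complement yields an int with every bit set.
int complement_zero() { return ~0; }
```

For 0x12345678, the shift discards the low byte 0x78 and the mask then discards the high byte 0x12, leaving 0x3456.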
6.2.5 Increment and Decrement [expr.incr]
The ++ operator is used to express incrementing directly, rather than expressing it indirectly using a combination of an addition and an assignment. By definition, ++lvalue means lvalue+=1, which again means lvalue=lvalue+1 provided lvalue has no side effects. The expression denoting the object to be incremented is evaluated once (only). Decrementing is similarly expressed by the -- operator. The operators ++ and -- can be used as both prefix and postfix operators. The value of ++x is the new (that is, incremented) value of x. For example, y=++x is equivalent to y=(x+=1). The value of x++, however, is the old value of x. For example, y=x++ is equivalent to y=(t=x,x+=1,t), where t is a variable of the same type as x.
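The equivalences just stated can be written out as functions and compared. The names pre, post, and post_spelled_out are invented for this sketch.

```cpp
#include <cassert>

// y=++x: x is incremented and y receives the new value.
int pre(int& x)
{
    int y = ++x;
    return y;
}

// y=x++: x is incremented and y receives the old value.
int post(int& x)
{
    int y = x++;
    return y;
}

// The same effect expressed as y=(t=x,x+=1,t).
int post_spelled_out(int& x)
{
    int t;
    int y = (t = x, x += 1, t);
    assert(x == t + 1);
    return y;
}
```

Starting from x==5, each function leaves x==6; pre() returns 6, whereas post() and post_spelled_out() both return 5.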
Like addition and subtraction of pointers, ++ and -- on pointers operate in terms of elements of the array into which the pointer points; p++ makes p point to the next element (§5.3.1).
The increment operators are particularly useful for incrementing and decrementing variables in loops. For example, one can copy a zero-terminated string like this:
    void cpy(char* p, const char* q)
    {
        while (*p++ = *q++) ;
    }
Like C, C++ is both loved and hated for enabling such terse, expression-oriented coding. Because
    while (*p++ = *q++) ;
is more than a little obscure to non-C programmers and because the style of coding is not uncommon in C and C++, it is worth examining more closely.
Consider first a more traditional way of copying an array of characters:
    int length = strlen(q);
    for (int i = 0; i<=length; i++) p[i] = q[i];
This is wasteful. The length of a zero-terminated string is found by reading the string looking for the terminating zero. Thus, we read the string twice: once to find its length and once to copy it. So we try this instead:
    int i;
    for (i = 0; q[i]!=0 ; i++) p[i] = q[i];
    p[i] = 0;   // terminating zero
The variable i used for indexing can be eliminated because p and q are pointers:
    while (*q != 0) {
        *p = *q;
        p++;    // point to next character
        q++;    // point to next character
    }
    *p = 0;     // terminating zero
Because the post-increment operation allows us first to use the value and then to increment it, we can rewrite the loop like this:
    while (*q != 0) {
        *p++ = *q++;
    }
    *p = 0;     // terminating zero
The value of *p++ = *q++ is *q. We can therefore rewrite the example like this:
    while ((*p++ = *q++) != 0) { }
In this case, we don’t notice that *q is zero until we already have copied it into *p and incremented p. Consequently, we can eliminate the final assignment of the terminating zero. Finally, we can reduce the example further by observing that we don’t need the empty block and that the ‘‘!= 0’’ is redundant because the result of a pointer or integral condition is always compared to zero anyway. Thus, we get the version we set out to discover:
    while (*p++ = *q++) ;
Is this version less readable than the previous versions? Not to an experienced C or C++ programmer. Is this version more efficient in time or space than the previous versions? Except for the first version that called strlen(), not really. Which version is the most efficient will vary among machine architectures and among compilers.
The most efficient way of copying a zero-terminated character string for your particular machine ought to be the standard string copy function:
    char* strcpy(char*, const char*);   // from <string.h>
For more general copying, the standard copy algorithm (§2.7.2, §18.6.1) can be used. Whenever
possible, use standard library facilities in preference to fiddling with pointers and bytes. Standard
library functions may be inlined (§7.1.1) or even implemented using specialized machine
instructions. Therefore, you should measure carefully before believing that some piece of handcrafted code outperforms library functions.
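Before measuring speed, it is worth confirming that the hand-written versions agree with the library. The sketch below compares two of the loops from this section with std::strcpy; the buffer size of 64 and the function names are arbitrary choices for the example.

```cpp
#include <cassert>
#include <cstring>

// Indexed copy: one pass over q, terminator written explicitly.
void cpy_indexed(char* p, const char* q)
{
    int i;
    for (i = 0; q[i] != 0; i++) p[i] = q[i];
    p[i] = 0;
}

// Terse pointer copy: the assignment copies the terminator, and the
// value of that assignment (zero) then ends the loop.
void cpy_terse(char* p, const char* q)
{
    while ((*p++ = *q++) != 0) { }
}

// True if both hand-written versions produce the same result as
// std::strcpy for s. The string s must be shorter than 64 characters.
bool copies_agree(const char* s)
{
    char a[64];
    char b[64];
    char c[64];
    assert(std::strlen(s) < sizeof a);
    cpy_indexed(a, s);
    cpy_terse(b, s);
    std::strcpy(c, s);
    return std::strcmp(a, c) == 0 && std::strcmp(b, c) == 0;
}
```

The empty string is the case most worth trying, because it exercises the copying of the terminator without any other character being copied.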
6.2.6 Free Store [expr.free]
A named object has its lifetime determined by its scope (§4.9.4). However, it is often useful to create an object that exists independently of the scope in which it was created. In particular, it is common to create objects that can be used after returning from the function in which they were created.
The operator new creates such objects, and the operator delete can be used to destroy them. Objects allocated by new are said to be ‘‘on the free store’’ (also, to be ‘‘heap objects,’’ or ‘‘allocated in dynamic memory’’).
Consider how we might write a compiler in the style used for the desk calculator (§6.1). The syntax analysis functions might build a tree of the expressions for use by the code generator:
    struct Enode {
        Token_value oper;
        Enode* left;
        Enode* right;
        // ...
    };

    Enode* expr(bool get)
    {
        Enode* left = term(get);

        for (;;)
            switch(curr_tok) {
            case PLUS:
            case MINUS:
            {   Enode* n = new Enode;   // create an Enode on free store
                n->oper = curr_tok;
                n->left = left;
                n->right = term(true);
                left = n;
                break;
            }
            default:
                return left;            // return node
            }
    }
A code generator would then use the resulting nodes and delete them:
    void generate(Enode* n)
    {
        switch (n->oper) {
        case PLUS:
            // ...
            delete n;   // delete an Enode from the free store
        }
    }
An object created by new exists until it is explicitly destroyed by delete. Then, the space it occupied can be reused by new. A C++ implementation does not guarantee the presence of a ‘‘garbage collector’’ that looks out for unreferenced objects and makes them available to new for reuse. Consequently, I will assume that objects created by new are manually freed using delete. If a garbage collector is present, the deletes can be omitted in most cases (§C.9.1).
The delete operator may be applied only to a pointer returned by new or to zero. Applying delete to zero has no effect.
More specialized versions of operator new can also be defined (§15.6).
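The pairing of new and delete can be made visible with a counter. The following self-contained sketch uses a much simplified Enode with a value member, a reduced Token_value, and helper names of its own; none of these are the calculator's actual declarations.

```cpp
#include <cassert>

enum Token_value { NUMBER, PLUS, MINUS };

struct Enode {
    Token_value oper;
    int value;          // meaningful only when oper==NUMBER
    Enode* left;
    Enode* right;
};

int live_nodes = 0;     // nodes currently on the free store

Enode* make_node(Token_value op, int value, Enode* l, Enode* r)
{
    Enode* n = new Enode;   // outlives this function call
    n->oper = op;
    n->value = value;
    n->left = l;
    n->right = r;
    ++live_nodes;
    return n;
}

// Evaluates the tree rooted at n and deletes every node in it.
int consume(Enode* n)
{
    assert(n != 0);
    int result = 0;
    switch (n->oper) {
    case NUMBER:
        result = n->value;
        break;
    case PLUS:
    {   int l = consume(n->left);
        int r = consume(n->right);
        result = l + r;
        break;
    }
    case MINUS:
    {   int l = consume(n->left);
        int r = consume(n->right);
        result = l - r;
        break;
    }
    }
    delete n;
    --live_nodes;
    return result;
}

// Builds the tree for (1+2)-4 and consumes it.
int demo()
{
    Enode* sum = make_node(PLUS, 0, make_node(NUMBER, 1, 0, 0),
                                    make_node(NUMBER, 2, 0, 0));
    Enode* tree = make_node(MINUS, 0, sum, make_node(NUMBER, 4, 0, 0));
    return consume(tree);
}
```

After demo() returns, live_nodes is back to zero: every node created by new has been given to delete exactly once.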
6.2.6.1 Arrays [expr.array]
Arrays of objects can also be created using new. For example:
    char* save_string(const char* p)
    {
        char* s = new char[strlen(p)+1];
        strcpy(s,p);    // copy from p to s
        return s;
    }

    int main(int argc, char* argv[])
    {
        if (argc < 2) exit(1);
        char* p = save_string(argv[1]);
        // ...
        delete[] p;
    }
The ‘‘plain’’ operator delete is used to delete individual objects; delete[] is used to delete arrays.
To deallocate space allocated by new, delete and delete[] must be able to determine the size of the object allocated. This implies that an object allocated using the standard implementation of new will occupy slightly more space than a static object. Typically, one word is used to hold the object’s size.
Note that a vector (§3.7.1, §16.3) is a proper object and can therefore be allocated and deallocated using plain new and delete. For example:
    void f(int n)
    {
        vector<int>* p = new vector<int>(n);    // individual object
        int* q = new int[n];                    // array
        // ...
        delete p;
        delete[] q;
    }
6.2.6.2 Memory Exhaustion [expr.exhaust]
The free store operators new, delete, new[], and delete[] are implemented using functions:
    void* operator new(size_t);         // space for individual object
    void operator delete(void*);
    void* operator new[](size_t);       // space for array
    void operator delete[](void*);
When operator new needs to allocate space for an object, it calls operator new() to allocate a suitable number of bytes. Similarly, when operator new needs to allocate space for an array, it calls operator new[]().
The standard implementations of operator new() and operator new[]() do not initialize the memory returned.
What happens when new can find no store to allocate? By default, the allocator throws a bad_alloc exception. For example:
    void f()
    {
        try {
            for(;;) new char[10000];
        }
        catch(bad_alloc) {
            cerr << "Memory exhausted!\n";
        }
    }
However much memory we have available, this will eventually invoke the bad_alloc handler.
We can specify what new should do upon memory exhaustion. When new fails, it first calls a function specified by a call to set_new_handler() declared in <new>, if any. For example:
    void out_of_store()
    {
        cerr << "operator new failed: out of store\n";
        throw bad_alloc();
    }

    int main()
    {
        set_new_handler(out_of_store);  // make out_of_store the new_handler
        for (;;) new char[10000];
        cout << "done\n";
    }
This will never get to write done. Instead, it will write
    operator new failed: out of store
See §14.4.5 for a plausible implementation of an operator new() that checks to see if there is a new handler to call and that throws bad_alloc if not. A new_handler might do something more clever than simply terminating the program. If you know how new and delete work – for example,
because you provided your own operator new() and operator delete() – the handler might attempt to find some memory for new to return. In other words, a user might provide a garbage collector, thus rendering the use of delete optional. Doing this is most definitely not a task for a beginner, though. For almost everybody who needs an automatic garbage collector, the right thing to do is to acquire one that has already been written and tested (§C.9.1).
By providing a new_handler, we take care of the check for memory exhaustion for every ordinary use of new in the program. Two alternative ways of controlling memory allocation exist. We can either provide nonstandard allocation and deallocation functions (§15.6) for the standard uses of new or rely on additional allocation information provided by the user (§10.4.11, §19.4.5).
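Exhausting real memory is an awkward way to experiment with handlers, so the following sketch imitates the protocol on a tiny fixed-size store. Everything named arena_... and both handlers are hypothetical; arena_alloc() is not operator new(), it merely follows the same ‘‘call the handler and retry, or throw bad_alloc’’ scheme. It assumes a compiler that provides std::get_new_handler() in <new>, which was added to the standard library after this text was written.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

const std::size_t arena_size = 64;
char arena[arena_size];         // stands in for the free store
std::size_t arena_used = 0;
int handler_calls = 0;

// Hands out bytes from arena. On exhaustion, calls the installed
// new_handler and tries again; throws bad_alloc if there is none.
// Alignment is ignored, so this is suitable for char data only.
void* arena_alloc(std::size_t n)
{
    assert(n <= arena_size);    // larger requests could never succeed
    for (;;) {
        if (n <= arena_size - arena_used) {
            void* p = arena + arena_used;
            arena_used += n;
            return p;
        }
        std::new_handler h = std::get_new_handler();
        if (h == 0) throw std::bad_alloc();
        h();                    // either frees something or throws
    }
}

// A handler that gives up.
void out_of_store()
{
    ++handler_calls;
    throw std::bad_alloc();
}

// A handler that ‘‘finds’’ memory by discarding all earlier allocations.
void reclaim_everything()
{
    ++handler_calls;
    arena_used = 0;
}
```

With out_of_store installed, a second request for 40 bytes throws bad_alloc after one handler call. With reclaim_everything installed, the same request succeeds on the retry, at the cost of invalidating whatever was allocated before; a real handler would have to be far more careful about what it frees.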
6.2.7 Explicit Type Conversion [expr.cast]
Sometimes, we have to deal with ‘‘raw memory;’’ that is, memory that holds or will hold objects of a type not known to the compiler. For example, a memory allocator may return a void* pointing to newly allocated memory or we might want to state that a given integer value is to be treated as the address of an I/O device:
    void* malloc(size_t);

    void f()
    {
        int* p = static_cast<int*>(malloc(100));                // new allocation used as ints
        IO_device* d1 = reinterpret_cast<IO_device*>(0Xff00);   // device at 0Xff00
        // ...
    }
A compiler does not know the type of the object pointed to by the void*. Nor can it know whether the integer 0Xff00 is a valid address. Consequently, the correctness of the conversions is completely in the hands of the programmer. Explicit type conversion, often called casting, is occasionally essential. However, traditionally it is seriously overused and a major source of errors.
The static_cast operator converts between related types such as one pointer type to another, an enumeration to an integral type, or a floating-point type to an integral type. The reinterpret_cast handles conversions between unrelated types such as an integer to a pointer. This distinction allows the compiler to apply some minimal type checking for static_cast and makes it easier for a programmer to find the more dangerous conversions represented as reinterpret_casts. Some static_casts are portable, but few reinterpret_casts are. Hardly any guarantees are made for reinterpret_cast, but generally it produces a value of a new type that has the same bit pattern as its argument. If the target has at least as many bits as the original value, we can reinterpret_cast the result back to its original type and use it. The result of a reinterpret_cast is guaranteed to be usable only if its result type is the exact type used to define the value involved. Note that reinterpret_cast is the kind of conversion that must be used for pointers to functions (§7.7).
If you feel tempted to use an explicit type conversion, take the time to consider if it is really necessary. In C++, explicit type conversion is unnecessary in most cases when C needs it (§1.6) and also in many cases in which earlier versions of C++ needed it (§1.6.2, §B.2.3). In many programs, explicit type conversion can be completely avoided; in others, its use can be localized to a few routines. In this book, explicit type conversion is used in realistic situations in §6.2.7, §7.7, §13.5, §15.4, and §25.4.1, only.
A form of run-time checked conversion, dynamic_cast (§15.4.1), and a cast for removing const qualifiers, const_cast (§15.4.2.1), are also provided.
From C, C++ inherited the notation (T)e, which performs any conversion that can be expressed as a combination of static_casts, reinterpret_casts, and const_casts to make a value of type T from the expression e (§B.2.3). This C-style cast is far more dangerous than the named conversion operators because the notation is harder to spot in a large program and the kind of conversion intended by the programmer is not explicit. That is, (T)e might be doing a portable conversion between related types, a nonportable conversion between unrelated types, or removing the const modifier from a pointer type. Without knowing the exact types of T and e, you cannot tell.
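The named casts can be tried in isolation. In this sketch, the function names are invented, and std::uintptr_t from <cstdint> is assumed to be available as an integer type large enough to hold a pointer; where it is, the round trip shown is the one case in which the result of a reinterpret_cast can safely be used.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// static_cast from void* to the pointer type the memory is used as.
// Returns -1 if the allocation fails.
int first_of_fresh_ints()
{
    int* p = static_cast<int*>(std::malloc(4 * sizeof(int)));
    if (p == 0) return -1;
    p[0] = 42;
    int result = p[0];
    std::free(p);
    return result;
}

// static_cast from a floating-point type to an integral type
// discards the fractional part.
int to_int_truncating(double d) { return static_cast<int>(d); }

// reinterpret_cast to a sufficiently large integer and back again
// yields the original pointer.
bool round_trip(int* p)
{
    assert(sizeof(std::uintptr_t) >= sizeof(int*));
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(bits) == p;
}
```

Each cast names the kind of conversion being performed, so a search of the source for reinterpret_cast finds round_trip() and nothing else.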
6.2.8 Constructors [expr.ctor]
The construction of a value of type T from a value e can be expressed by the functional notation T(e). For example:
    void f(double d)
    {
        int i = int(d);             // truncate d
        complex z = complex(d);     // make a complex from d
        // ...
    }
The T(e) construct is sometimes referred to as a function-style cast. For a built-in type T, T(e) is equivalent to static_cast<T>(e). Unfortunately, this implies that the use of T(e) is not always safe. For arithmetic types, values can be truncated and even explicit conversion of a longer integer type to a shorter (such as long to char) can result in undefined behavior. I try to use the notation exclusively where the construction of a value is well-defined; that is, for narrowing arithmetic conversions (§C.6), for conversion from integers to enumerations (§4.8), and the construction of objects of user-defined types (§2.5.2, §10.2.3).
Pointer conversions cannot be expressed directly using the T(e) notation. For example, char*(2) is a syntax error. Unfortunately, the protection that the constructor notation provides against such dangerous conversions can be circumvented by using typedef names (§4.9.7) for pointer types.
The constructor notation T() is used to express the default value of type T. For example:
    void f(double d)
    {
        int j = int();              // default int value
        complex z = complex();      // default complex value
        // ...
    }
The value of an explicit use of the constructor for a built-in type is 0 converted to that type (§4.9.5). Thus, int() is another way of writing 0. For a user-defined type T, T() is defined by the default constructor (§10.4.2), if any.
The use of the constructor notation for built-in types is particularly important when writing templates. Then, the programmer does not know whether a template parameter will refer to a built-in type or a user-defined type (§16.3.4, §17.4.1.2).
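A sketch of why T() matters in templates: the two function templates below work unchanged for int, double, pointers, and std::string. The template names are invented for the example.

```cpp
#include <cassert>
#include <string>

// The default value of T: 0 converted to T for a built-in type,
// the result of the default constructor for a user-defined type.
template<class T> T default_value() { return T(); }

// Adds up n elements starting at first. The accumulator starts as
// T(), so the same code serves numbers and strings.
template<class T> T sum(const T* first, int n)
{
    assert(n >= 0);
    T total = T();
    for (int i = 0; i < n; ++i) total += first[i];
    return total;
}
```

Had total been declared as plain ‘‘T total;’’ it would have been left uninitialized for built-in types, and the result for an array of int would have been garbage.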
6.3 Statement Summary [expr.stmts]
Here are a summary and some examples of C++ statements:
_______________________________________________________________
Statement Syntax
_______________________________________________________________
statement:
    declaration
    { statement-list_opt }
    try { statement-list_opt } handler-list
    expression_opt ;
    if ( condition ) statement
    if ( condition ) statement else statement
    switch ( condition ) statement
    while ( condition ) statement
    do statement while ( expression ) ;
    for ( for-init-statement condition_opt ; expression_opt ) statement
    case constant-expression : statement
    default : statement
    break ;
    continue ;
    return expression_opt ;
    goto identifier ;
    identifier : statement
statement-list:
    statement statement-list_opt
condition:
    expression
    type-specifier declarator = expression
handler-list:
    catch ( exception-declaration ) { statement-list_opt }
    handler-list handler-list_opt
_______________________________________________________________
Note that a declaration is a statement and that there is no assignment statement or procedure call
statement; assignments and function calls are expressions. The statements for handling exceptions,
try-blocks, are described in §8.3.1.
6.3.1 Declarations as Statements [expr.dcl]
A declaration is a statement. Unless a variable is declared static, its initializer is executed whenever the thread of control passes through the declaration (see also §10.4.8). The reason for allowing declarations wherever a statement can be used (and a few other places; §6.3.2.1, §6.3.3.1) is to enable the programmer to minimize the errors caused by uninitialized variables and to allow better locality in code. There is rarely a reason to introduce a variable before there is a value for it to hold. For example:
    void f(vector<string>& v, int i, const char* p)
    {
        if (p==0) return;
        if (i<0 || v.size()<=i) error("bad index");
        string s = v[i];
        if (s == p) {
            // ...
        }
        // ...
    }
The ability to place declarations after executable code is essential for many constants and for single-assignment styles of programming where a value of an object is not changed after initialization. For user-defined types, postponing the definition of a variable until a suitable initializer is available can also lead to better performance. For example,
    string s; /* ... */ s = "The best is the enemy of the good.";
can easily be much slower than
    string s = "Voltaire";
The most common reason to declare a variable without an initializer is that it requires a statement to initialize it. Examples are input variables and arrays.
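Here is a compilable variant of the idea, under the assumption that a bad index should simply yield false rather than call an error() function. It also shows the static case: the initializer of a static local variable is executed only the first time control passes through its declaration. Both function names are invented.

```cpp
#include <cassert>
#include <string>
#include <vector>

// True if v[i] exists and is equal to the C-style string p.
// s is not introduced until there is a checked value for it to denote.
bool matches(const std::vector<std::string>& v, int i, const char* p)
{
    if (p == 0) return false;
    if (i < 0) return false;
    if (v.size() <= static_cast<std::vector<std::string>::size_type>(i))
        return false;
    assert(0 <= i);                 // both bounds were checked above
    const std::string& s = v[i];
    return s == p;
}

// id is initialized once, not on every call.
int next_id()
{
    static int id = 0;
    return ++id;
}
```

Because s is declared only after the index has been validated, there is no point in the function at which s exists but refers to nothing sensible.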
6.3.2 Selection Statements [expr.select]
A value can be tested by either an if statement or a switch statement:
    if ( condition ) statement
    if ( condition ) statement else statement
    switch ( condition ) statement
The comparison operators
    ==    !=    <    <=    >    >=
return the bool true if the comparison is true and false otherwise.
In an if statement, the first (or only) statement is executed if the expression is nonzero and the second statement (if it is specified) is executed otherwise. This implies that any arithmetic or pointer expression can be used as a condition. For example, if x is an integer, then
    if (x) // ...
means
iiff (xx != 00) // ...
For a pointer pp,
iiff (pp) // ...
is a direct statement of the test ‘‘does p point to a valid object,’’ whereas
iiff (pp != 00) // ...
states the same question indirectly by comparing to a value known not to point to an object. Note
that the representation of the pointer 0 is not all-zeros on all machines (§5.1.1). Every compiler I
have checked generated the same code for both forms of the test.
The logical operators
    &&    ||    !
are most commonly used in conditions. The operators && and || will not evaluate their second
argument unless doing so is necessary. For example,
    if (p && 1<p->count) // ...
first tests that p is nonzero. It tests 1<p->count only if p is nonzero.
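The guarded test can be packaged as a function. This is a sketch under stated assumptions: the type Node and the function has_several() are hypothetical names introduced only for this illustration.

```cpp
struct Node {           // hypothetical type used only for this sketch
    int count;
};

// Returns true if p points to a Node whose count exceeds 1.
// The right-hand operand of && is evaluated only if p is nonzero,
// so a zero pointer is never dereferenced.
bool has_several(const Node* p)
{
    return p && 1<p->count;
}
```

Calling has_several(0) is therefore safe and yields false.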
Some if-statements can conveniently be replaced by conditional-expressions. For example,
    if (a <= b)
        max = b;
    else
        max = a;
is better expressed like this:
    max = (a<=b) ? b : a;
The parentheses around the condition are not necessary, but I find the code easier to read when they
are used.
A switch-statement can alternatively be written as a set of if-statements. For example,
    switch (val) {
    case 1:
        f();
        break;
    case 2:
        g();
        break;
    default:
        h();
        break;
    }
could alternatively be expressed as
    if (val == 1)
        f();
    else if (val == 2)
        g();
    else
        h();
The meaning is the same, but the first (switch) version is preferred because the nature of the operation (testing a value against a set of constants) is explicit. This makes the switch statement easier
to read for nontrivial examples. It can also lead to the generation of better code.
Beware that a case of a switch must be terminated somehow unless you want to carry on executing the next case. Consider:
    switch (val) {    // beware
    case 1:
        cout << "case 1\n";
    case 2:
        cout << "case 2\n";
    default:
        cout << "default: case not found\n";
    }
Invoked with val==1, this prints
    case 1
    case 2
    default: case not found
to the great surprise of the uninitiated. It is a good idea to comment the (rare) cases in which a
fall-through is intentional so that an uncommented fall-through can be assumed to be an error. A
break is the most common way of terminating a case, but a return is often useful (§6.1.1).
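A commented, intentional fall-through might look like the sketch below. The function steps() and the strings it builds are hypothetical and serve only to make the control flow visible.

```cpp
#include <string>

// Hypothetical example: records the steps taken for a given level.
// Level 2 does its own step and then deliberately continues with level 1's step.
std::string steps(int level)
{
    std::string s;
    switch (level) {
    case 2:
        s += "flush;";
        // fall through - level 2 must also do everything level 1 does
    case 1:
        s += "close;";
        break;
    default:
        s += "nothing;";
        break;
    }
    return s;
}
```

Here steps(2) yields "flush;close;" whereas steps(1) yields just "close;". The comment tells a reader that the missing break is not an oversight.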
6.3.2.1 Declarations in Conditions [expr.cond]
To avoid accidental misuse of a variable, it is usually a good idea to introduce the variable into the
smallest scope possible. In particular, it is usually best to delay the definition of a local variable
until one can give it an initial value. That way, one cannot get into trouble by using the variable
before its initial value is assigned.
One of the most elegant applications of these two principles is to declare a variable in a condition. Consider:
    if (double d = prim(true)) {
        left /= d;
        break;
    }
Here, d is declared and initialized and the value of d after initialization is tested as the value of the
condition. The scope of d extends from its point of declaration to the end of the statement that the
condition controls. For example, had there been an else-branch to the if-statement, d would be in
scope on both branches.
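The scope rule can be seen in a self-contained sketch. The function offset_of() is a hypothetical name chosen for this illustration; std::strchr() is the standard library function declared in <cstring>.

```cpp
#include <cstring>

// Returns the offset of the first occurrence of c in s, or -1 if c does not occur.
// The pointer hit exists only within the if-statement, including its else-branch.
int offset_of(const char* s, char c)
{
    if (const char* hit = std::strchr(s, c)) {
        return static_cast<int>(hit - s);   // hit is nonzero here
    }
    else {
        // hit is in scope here too, but it is known to be zero
        return -1;
    }
}
```

After the if-statement, hit is no longer in scope, so it cannot be used by accident.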
The obvious and traditional alternative is to declare d before the condition. However, this
opens the scope (literally) for the use of d before its initialization or after its intended useful life:
    double d;
    // ...
    d2 = d;     // oops!
    // ...
    if (d = prim(true)) {
        left /= d;
        break;
    }
    // ...
    d = 2.0;    // two unrelated uses of d
In addition to the logical benefits of declaring variables in conditions, doing so also yields the most
compact source code.
A declaration in a condition must declare and initialize a single variable or const.
6.3.3 Iteration Statements [expr.loop]
A loop can be expressed as a for, while, or do statement:
    while ( condition ) statement
    do statement while ( expression ) ;
    for ( for-init-statement condition_opt ; expression_opt ) statement
Each of these statements executes a statement (called the controlled statement or the body of the
loop) repeatedly until the condition becomes false or the programmer breaks out of the loop some
other way.
The for-statement is intended for expressing fairly regular loops. The loop variable, the termination condition, and the expression that updates the loop variable can be presented ‘‘up front’’ on
a single line. This can greatly increase readability and thereby decrease the frequency of errors. If
no initialization is needed, the initializing statement can be empty. If the condition is omitted, the
for-statement will loop forever unless the user explicitly exits it by a break, return, goto, throw, or
some less obvious way such as a call of exit() (§9.4.1.1). If the expression is omitted, we must
update some form of loop variable in the body of the loop. If the loop isn’t of the simple ‘‘introduce a loop variable, test the condition, update the loop variable’’ variety, it is often better
expressed as a while-statement. A for-statement is also useful for expressing a loop without an
explicit termination condition:
    for(;;) { // ‘‘forever’’
        // ...
    }
A while-statement simply executes its controlled statement until its condition becomes false. I tend
to prefer while-statements over for-statements when there isn’t an obvious loop variable or where
the update of a loop variable naturally comes in the middle of the loop body. An input loop is an
example of a loop where there is no obvious loop variable:
    while(cin>>ch) // ...
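A complete input loop of this kind might be sketched as follows. The function count_chars() is a hypothetical name, and taking the stream as an argument (rather than using cin directly) is an assumption made so that the function can be exercised with any istream.

```cpp
#include <istream>
#include <sstream>

// Counts the non-whitespace characters available from in.
// There is no natural loop variable; the read itself is the condition.
int count_chars(std::istream& in)
{
    int n = 0;
    char ch;
    while (in >> ch) ++n;   // >> skips whitespace and fails at end of input
    return n;
}
```

For a std::istringstream holding "ab c\n d", count_chars() returns 4.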
In my experience, the do-statement is a source of errors and confusion. The reason is that its body
is always executed once before the condition is evaluated. However, for the body to work correctly, something very much like the condition must hold even the first time through. More often
than I would have guessed, I have found that condition not to hold as expected either when the program was first written and tested or later after the code preceding it has been modified. I also prefer
the condition ‘‘up front where I can see it.’’ Consequently, I tend to avoid do-statements.
6.3.3.1 Declarations in For-Statements [expr.for]
A variable can be declared in the initializer part of a for-statement. If that initializer is a declaration, the variable (or variables) it introduces is in scope until the end of the for-statement. For
example:
    void f(int v[], int max)
    {
        for (int i = 0; i<max; i++) v[i] = i*i;
    }
If the final value of an index needs to be known after exit from a for-loop, the index variable must
be declared outside the for-loop (e.g., §6.3.4).
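As a sketch of that case, consider a linear search in which the index is the result. The function index_of() is a hypothetical name used only for this illustration.

```cpp
// Returns the index of the first element of v[0..max) equal to x,
// or max if there is no such element.
int index_of(const int v[], int max, int x)
{
    int i;                      // declared outside the loop: it is used after the loop
    for (i = 0; i<max; i++)
        if (v[i] == x) break;
    return i;                   // i==max means "not found"
}
```

Had i been declared in the for-init-statement, the return statement could not have referred to it.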
6.3.4 Goto [expr.goto]
C++ possesses the infamous goto:
    goto identifier ;
    identifier : statement
The goto has few uses in general high-level programming, but it can be very useful when C++ code
is generated by a program rather than written directly by a person; for example, gotos can be used
in a parser generated from a grammar by a parser generator. The goto can also be important in the
rare cases in which optimal efficiency is essential, for example, in the inner loop of some real-time
application.
One of the few sensible uses of goto in ordinary code is to break out from a nested loop or
switch-statement (a break breaks out of only the innermost enclosing loop or switch-statement).
For example:
    void f()
    {
        int i;
        int j;
        for (i = 0; i<n; i++)
            for (j = 0; j<m; j++) if (nm[i][j] == a) goto found;
        // not found
        // ...
    found:
        // nm[i][j] == a
    }
There is also a continue statement that, in effect, goes to the end of a loop statement, as explained
in §6.1.5.
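For readers who want to try the nested-loop search, here is a compilable sketch. The dimensions n and m, the function name find_first(), and the convention of returning a flattened index (or -1) are all assumptions introduced for this illustration.

```cpp
const int n = 3;    // hypothetical dimensions for this sketch
const int m = 4;

// Returns i*m+j for the first element nm[i][j] equal to a, or -1 if there is none.
int find_first(const int nm[n][m], int a)
{
    int i;
    int j;
    for (i = 0; i<n; i++)
        for (j = 0; j<m; j++) if (nm[i][j] == a) goto found;
    return -1;          // not found
found:
    return i*m + j;     // nm[i][j] == a
}
```

Note that no initialized variable is declared between the goto and its label, so the jump does not bypass an initialization.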
6.4 Comments and Indentation [expr.comment]
Judicious use of comments and consistent use of indentation can make the task of reading and
understanding a program much more pleasant. Several different consistent styles of indentation are
in use. I see no fundamental reason to prefer one over another (although, like most programmers, I
have my preferences, and this book reflects them). The same applies to styles of comments.
Comments can be misused in ways that seriously affect the readability of a program. The compiler does not understand the contents of a comment, so it has no way of ensuring that a comment
[1] is meaningful,
[2] describes the program, and
[3] is up to date.
Most programs contain comments that are incomprehensible, ambiguous, and just plain wrong.
Bad comments can be worse than no comments.
If something can be stated in the language itself, it should be, and not just mentioned in a comment. This remark is aimed at comments such as these:
// variable "v" must be initialized
// variable "v" must be used only by function "f()"
// call function "init()" before calling any other function in this file
// call function "cleanup()" at the end of your program
// don’t use function "weird()"
// function "f()" takes two arguments
Such comments can often be rendered unnecessary by proper use of C++. For example, one might
utilize the linkage rules (§9.2) and the visibility, initialization, and cleanup rules for classes (see
§10.4.1) to make the preceding examples redundant.
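As a rough sketch of the idea, a small class can state several of those requirements directly. The class Log and its members are hypothetical; they are not taken from the text and stand in for whatever resource the comments were describing.

```cpp
#include <string>

// Hypothetical resource used only to illustrate the point.
class Log {
public:
    explicit Log(const std::string& name)   // replaces "call init() first"
        : name_(name), lines_(0) { }
    ~Log() { }                              // replaces "call cleanup() at the end"

    void write(const std::string&) { ++lines_; }
    int lines() const { return lines_; }    // const: cannot modify the Log
    const std::string& name() const { return name_; }
private:
    std::string name_;  // private: replaces "must be used only by ..."
    int lines_;         // always initialized by the constructor
};
```

A Log cannot be created without being initialized, and its data cannot be touched except through its member functions, so the corresponding comments have nothing left to say.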
Once something has been stated clearly in the language, it should not be mentioned a second
time in a comment. For example:
    a = b+c;    // a becomes b+c
    count++;    // increment the counter
Such comments are worse than simply redundant. They increase the amount of text the reader has
to look at, they often obscure the structure of the program, and they may be wrong. Note, however,
that such comments are used extensively for teaching purposes in programming language textbooks
such as this. This is one of the many ways a program in a textbook differs from a real program.
My preference is for:
[1] A comment for each source file stating what the declarations in it have in common, references to manuals, general hints for maintenance, etc.
[2] A comment for each class, template, and namespace
[3] A comment for each nontrivial function stating its purpose, the algorithm used (unless it is
obvious), and maybe something about the assumptions it makes about its environment
[4] A comment for each global and namespace variable and constant
[5] A few comments where the code is nonobvious and/or nonportable
[6] Very little else
For example:
    // tbl.c: Implementation of the symbol table.

    /*
        Gaussian elimination with partial pivoting.
        See Ralston: "A first course ..." pg 411.
    */

    // swap() assumes the stack layout of an SGI R6000.

    /***********************************
        Copyright (c) 1997 AT&T, Inc.
        All rights reserved
    ************************************/
A well-chosen and well-written set of comments is an essential part of a good program. Writing
good comments can be as difficult as writing the program itself. It is an art well worth cultivating.
Note also that if // comments are used exclusively in a function, then any part of that function
can be commented out using /* */ style comments, and vice versa.
6.5 Advice [expr.advice]
[1] Prefer the standard library to other libraries and to ‘‘handcrafted code;’’ §6.1.8.
[2] Avoid complicated expressions; §6.2.3.
[3] If in doubt about operator precedence, parenthesize; §6.2.3.
[4] Avoid explicit type conversion (casts); §6.2.7.
[5] When explicit type conversion is necessary, prefer the more specific cast operators to the C-style cast; §6.2.7.
[6] Use the T(e) notation exclusively for well-defined construction; §6.2.8.
[7] Avoid expressions with undefined order of evaluation; §6.2.2.
[8] Avoid goto; §6.3.4.
[9] Avoid do-statements; §6.3.3.
[10] Don’t declare a variable until you have a value to initialize it with; §6.3.1, §6.3.2.1, §6.3.3.1.
[11] Keep comments crisp; §6.4.
[12] Maintain a consistent indentation style; §6.4.
[13] Prefer defining a member operator new() (§15.6) to replacing the global operator new();
§6.2.6.2.
[14] When reading input, always consider ill-formed input; §6.1.3.
6.6 Exercises [expr.exercises]
1. (∗1) Rewrite the following for statement as an equivalent while statement:
    for (i=0; i<max_length; i++) if (input_line[i] == '?') quest_count++;
Rewrite it to use a pointer as the controlled variable, that is, so that the test is of the form
*p=='?'.
2. (∗1) Fully parenthesize the following expressions:
    a = b + c * d << 2 & 8
    a & 077 != 3
    a == b || a == c && c < 5
    c = x != 0
    0 <= i < 7
    f(1,2)+3
    a = - 1 + + b -- - 5
    a = b == c ++
    a=b=c=0
    a[4][2] *= * b ? c : * d * 2
    a-b,c=d
3. (∗2) Read a sequence of possibly whitespace-separated (name,value) pairs, where the name is a
single whitespace-separated word and the value is an integer or a floating-point value. Compute
and print the sum and mean for each name and the sum and mean for all names. Hint: §6.1.8.
4. (∗1) Write a table of values for the bitwise logical operations (§6.2.4) for all possible combinations of 0 and 1 operands.
5. (∗1.5) Find 5 different C++ constructs for which the meaning is undefined (§C.2). (∗1.5) Find 5
different C++ constructs for which the meaning is implementation-defined (§C.2).
6. (∗1) Find 10 different examples of nonportable C++ code.
7. (∗2) Write 5 expressions for which the order of evaluation is undefined. Execute them to see
what one or – preferably – more implementations do with them.
8. (∗1.5) What happens if you divide by zero on your system? What happens in case of overflow
and underflow?
9. (∗1) Fully parenthesize the following expressions:
    *p++
    *--p
    ++a-(int*)p->m
    *p.m
    *a[i]
10. (∗2) Write these functions: strlen(), which returns the length of a C-style string; strcpy(),
which copies a string into another; and strcmp(), which compares two strings. Consider what
the argument types and return types ought to be. Then compare your functions with the standard library versions as declared in <cstring> (<string.h>) and as specified in §20.4.1.
11. (∗1) See how your compiler reacts to these errors:
    void f(int a, int b)
    {
        if (a = 3) // ...
        if (a&077 == 0) // ...
        a := b+1;
    }
Devise more simple errors and see how the compiler reacts.
12. (∗2) Modify the program from §6.6[3] to also compute the median.
13. (∗2) Write a function cat() that takes two C-style string arguments and returns a string that is
the concatenation of the arguments. Use new to find store for the result.
14. (∗2) Write a function rev() that takes a string argument and reverses the characters in it. That
is, after rev(p) the last character of p will be the first, etc.
15. (∗1.5) What does the following example do?
    void send(int* to, int* from, int count)
        // Duff’s device. Helpful comment deliberately deleted.
    {
        int n = (count+7)/8;
        switch (count%8) {
        case 0: do { *to++ = *from++;
        case 7:      *to++ = *from++;
        case 6:      *to++ = *from++;
        case 5:      *to++ = *from++;
        case 4:      *to++ = *from++;
        case 3:      *to++ = *from++;
        case 2:      *to++ = *from++;
        case 1:      *to++ = *from++;
                } while (--n>0);
        }
    }
Why would anyone write something like that?
16. (∗2) Write a function atoi(const char*) that takes a string containing digits and returns the
corresponding int. For example, atoi("123") is 123. Modify atoi() to handle C++ octal and
hexadecimal notation in addition to plain decimal numbers. Modify atoi() to handle the C++
character constant notation.
17. (∗2) Write a function itoa(int i, char b[]) that creates a string representation of i in b and
returns b.
18. (*2) Type in the calculator example and get it to work. Do not ‘‘save time’’ by using an already
entered text. You’ll learn most from finding and correcting ‘‘little silly errors.’’
19. (∗2) Modify the calculator to report line numbers for errors.
20. (∗3) Allow a user to define functions in the calculator. Hint: Define a function as a sequence of
operations just as a user would have typed them. Such a sequence can be stored either as a
character string or as a list of tokens. Then read and execute those operations when the function
is called. If you want a user-defined function to take arguments, you will have to invent a notation for that.
21. (∗1.5) Convert the desk calculator to use a symbol structure instead of using the static variables
number_value and string_value.
22. (∗2.5) Write a program that strips comments out of a C++ program. That is, read from cin,
remove both // comments and /* */ comments, and write the result to cout. Do not worry
about making the layout of the output look nice (that would be another, and much harder, exercise). Do not worry about incorrect programs. Beware of //, /*, and */ in comments, strings,
and character constants.
23. (∗2) Look at some programs to get an idea of the variety of indentation, naming, and commenting styles actually used.
7
Functions
To iterate is human,
to recurse divine.
– L. Peter Deutsch
Function declarations and definitions — argument passing — return values — function
overloading — ambiguity resolution — default arguments — stdargs — pointers to
functions — macros — advice — exercises.
7.1 Function Declarations [fct.dcl]
The typical way of getting something done in a C++ program is to call a function to do it. Defining
a function is the way you specify how an operation is to be done. A function cannot be called
unless it has been previously declared.
A function declaration gives the name of the function, the type of the value returned (if any) by
the function, and the number and types of the arguments that must be supplied in a call of the function. For example:
    Elem* next_elem();
    char* strcpy(char* to, const char* from);
    void exit(int);
The semantics of argument passing are identical to the semantics of initialization. Argument types
are checked and implicit argument type conversion takes place when necessary. For example:
    double sqrt(double);
    double sr2 = sqrt(2);           // call sqrt() with the argument double(2)
    double sq3 = sqrt("three");     // error: sqrt() requires an argument of type double
The value of such checking and type conversion should not be underestimated.
A function declaration may contain argument names. This can be a help to the reader of a program, but the compiler simply ignores such names. As mentioned in §4.7, void as a return type
means that the function does not return a value.
7.1.1 Function Definitions [fct.def]
Every function that is called in a program must be defined somewhere (once only). A function definition is a function declaration in which the body of the function is presented. For example:
    extern void swap(int*, int*);   // a declaration
    void swap(int* p, int* q)       // a definition
    {
        int t = *p;
        *p = *q;
        *q = t;
    }
The type of the definition and all declarations for a function must specify the same type. The argument names, however, are not part of the type and need not be identical.
It is not uncommon to have function definitions with unused arguments:
    void search(table* t, const char* key, const char*)
    {
        // no use of the third argument
    }
As shown, the fact that an argument is unused can be indicated by not naming it. Typically,
unnamed arguments arise from the simplification of code or from planning ahead for extensions. In
both cases, leaving the argument in place, although unused, ensures that callers are not affected by
the change.
A function can be defined to be inline. For example:
    inline int fac(int n)
    {
        return (n<2) ? 1 : n*fac(n-1);
    }
The inline specifier is a hint to the compiler that it should attempt to generate code for a call of
fac() inline rather than laying down the code for the function once and then calling through the
usual function call mechanism. A clever compiler can generate the constant 720 for a call fac(6).
The possibility of mutually recursive inline functions, inline functions that recurse or not depending
on input, etc., makes it impossible to guarantee that every call of an inline function is actually
inlined. The degree of cleverness of a compiler cannot be legislated, so one compiler might generate 720, another 6*fac(5), and yet another an un-inlined call fac(6).
To make inlining possible in the absence of unusually clever compilation and linking facilities,
the definition – and not just the declaration – of an inline function must be in scope (§9.2). An
inline specifier does not affect the semantics of a function. In particular, an inline function still has
a unique address, and so do the static variables (§7.1.2) of an inline function.
7.1.2 Static Variables [fct.static]
A local variable is initialized when the thread of execution reaches its definition. By default, this
happens in every call of the function and each invocation of the function has its own copy of the
variable. If a local variable is declared static, a single, statically allocated object will be used to
represent that variable in all calls of the function. It will be initialized only the first time the thread
of execution reaches its definition. For example:
    void f(int a)
    {
        while (a--) {
            static int n = 0;   // initialized once
            int x = 0;          // initialized n times
            cout << "n == " << n++ << ", x == " << x++ << '\n';
        }
    }
    int main()
    {
        f(3);
    }
This prints:
    n == 0, x == 0
    n == 1, x == 0
    n == 2, x == 0
A static variable provides a function with ‘‘a memory’’ without introducing a global variable that
might be accessed and corrupted by other functions (see also §10.2.4).
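A common use of such ‘‘a memory’’ is a function that hands out consecutive values. The sketch below uses a hypothetical function next_id(); the name and the numbering scheme are assumptions made for this illustration.

```cpp
// Hypothetical example: hands out consecutive identifiers.
// The counter persists between calls but is invisible outside next_id().
int next_id()
{
    static int id = 0;  // initialized the first time control reaches this line
    return ++id;
}
```

Each call returns a value one larger than the previous call did, and no other function can reset or corrupt id.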
7.2 Argument Passing [fct.arg]
When a function is called, store is set aside for its formal arguments and each formal argument is
initialized by its corresponding actual argument. The semantics of argument passing are identical
to the semantics of initialization. In particular, the type of an actual argument is checked against
the type of the corresponding formal argument, and all standard and user-defined type conversions
are performed. There are special rules for passing arrays (§7.2.1), a facility for passing unchecked
arguments (§7.6), and a facility for specifying default arguments (§7.5). Consider:
    void f(int val, int& ref)
    {
        val++;
        ref++;
    }
When f() is called, val++ increments a local copy of the first actual argument, whereas ref++
increments the second actual argument. For example,
    void g()
    {
        int i = 1;
        int j = 1;
        f(i,j);
    }
will increment j but not i. The first argument, i, is passed by value, the second argument, j, is
passed by reference. As mentioned in §5.5, functions that modify call-by-reference arguments can
make programs hard to read and should most often be avoided (but see §21.2.1). It can, however,
be noticeably more efficient to pass a large object by reference than to pass it by value. In that
case, the argument might be declared const to indicate that the reference is used for efficiency reasons only and not to enable the called function to change the value of the object:
    void f(const Large& arg)
    {
        // the value of "arg" cannot be changed without explicit use of type conversion
    }
The absence of const in the declaration of a reference argument is taken as a statement of intent to
modify the variable:
    void g(Large& arg); // assume that g() modifies arg
Similarly, declaring a pointer argument const tells readers that the value of an object pointed to by
that argument is not changed by the function. For example:
    int strlen(const char*);                    // number of characters in a C-style string
    char* strcpy(char* to, const char* from);   // copy a C-style string
    int strcmp(const char*, const char*);       // compare C-style strings
The importance of using const arguments increases with the size of a program.
Note that the semantics of argument passing are different from the semantics of assignment.
This is important for const arguments, reference arguments, and arguments of some user-defined
types (§10.4.4.1).
A literal, a constant, and an argument that requires conversion can be passed as a const& argument, but not as a non-const& argument. Allowing conversions for a const T& argument ensures that
such an argument can be given exactly the same set of values as a T argument by passing the value
in a temporary, if necessary. For example:
    float fsqrt(const float&); // Fortran-style sqrt taking a reference argument
    void g(double d)
    {
        float r = fsqrt(2.0f);  // pass ref to temp holding 2.0f
        r = fsqrt(r);           // pass ref to r
        r = fsqrt(d);           // pass ref to temp holding float(d)
    }
Disallowing conversions for non-const reference arguments (§5.5) avoids the possibility of silly
mistakes arising from the introduction of temporaries. For example:
    void update(float& i);
    void g(double d, float r)
    {
        update(2.0f);   // error: const argument
        update(r);      // pass ref to r
        update(d);      // error: type conversion required
    }
Had these calls been allowed, update() would quietly have updated temporaries that immediately
were deleted. Usually, that would come as an unpleasant surprise to the programmer.
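The two kinds of reference argument can be contrasted in one compilable sketch. The functions half(), halve(), and demo() are hypothetical names introduced for this illustration; the calls that the language rejects are left as comments so that the example compiles.

```cpp
// Reads its argument only; a reference to a temporary is acceptable.
float half(const float& x) { return x/2; }

// Modifies its argument; only a float variable is acceptable.
void halve(float& x) { x /= 2; }

float demo(double d)
{
    float r = half(d);  // ok: ref to a temporary holding float(d)
    halve(r);           // ok: r itself is modified
    // halve(d);        // error: type conversion required
    // halve(2.0f);     // error: const argument
    return r;
}
```

With these definitions, demo(8.0) halves the value twice and returns 2.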
7.2.1 Array Arguments [fct.array]
If an array is used as a function argument, a pointer to its initial element is passed. For example:
    int strlen(const char*);
    void f()
    {
        char v[] = "an array";
        int i = strlen(v);
        int j = strlen("Nicholas");
    }
That is, an argument of type T[] will be converted to a T* when passed as an argument. This
implies that an assignment to an element of an array argument changes the value of an element of
the argument array. In other words, arrays differ from other types in that an array is not (and cannot be) passed by value.
The size of an array is not available to the called function. This can be a nuisance, but there are
several ways of circumventing this problem. C-style strings are zero-terminated, so their size can
be computed easily. For other arrays, a second argument specifying the size can be passed. For
example:
    void compute1(int* vec_ptr, int vec_size);  // one way
    struct Vec {
        int* ptr;
        int size;
    };
    void compute2(const Vec& v);                // another way
Alternatively, a type such as vector (§3.7.1, §16.3) can be used instead of an array.
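The difference between the pointer-plus-size style and the vector style can be sketched as follows. The function names sum_array() and sum_vector() are hypothetical and chosen only for this illustration.

```cpp
#include <vector>

// The size must be passed separately; v is just a pointer to the first element.
int sum_array(const int* v, int size)
{
    int s = 0;
    for (int i = 0; i<size; i++) s += v[i];
    return s;
}

// A vector knows its own size, so no second argument is needed.
int sum_vector(const std::vector<int>& v)
{
    int s = 0;
    for (std::vector<int>::size_type i = 0; i<v.size(); i++) s += v[i];
    return s;
}
```

With the array version, nothing prevents a caller from passing a size that does not match the array; the vector version leaves no room for that mistake.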
Multidimensional arrays are trickier (see §C.7), but often arrays of pointers can be used instead,
and they need no special treatment. For example:
    char* day[] = {
        "mon", "tue", "wed", "thu", "fri", "sat", "sun"
    };
Again, vector and similar types are alternatives to the built-in, low-level arrays and pointers.
7.3 Value Return [fct.return]
A value must be returned from a function that is not declared void (however, main() is special; see
§3.2). Conversely, a value cannot be returned from a void function. For example:
    int f1() { }                // error: no value returned
    void f2() { }               // ok
    int f3() { return 1; }      // ok
    void f4() { return 1; }     // error: return value in void function
    int f5() { return; }        // error: return value missing
    void f6() { return; }       // ok
A return value is specified by a return statement. For example:
    int fac(int n) { return (n>1) ? n*fac(n-1) : 1; }
A function that calls itself is said to be recursive.
There can be more than one return statement in a function:
    int fac2(int n)
    {
        if (n > 1) return n*fac2(n-1);
        return 1;
    }
Like the semantics of argument passing, the semantics of function value return are identical to the
semantics of initialization. A return statement is considered to initialize an unnamed variable of the
returned type. The type of a return expression is checked against the returned type, and
all standard and user-defined type conversions are performed. For example:
    double f() { return 1; } // 1 is implicitly converted to double(1)
Each time a function is called, a new copy of its arguments and local (automatic) variables is created. The store is reused after the function returns, so a pointer to a local variable should never be
returned. The contents of the location pointed to will change unpredictably:
    int* fp() { int local = 1; /* ... */ return &local; }    // bad
This error is less common than the equivalent error using references:
    int& fr() { int local = 1; /* ... */ return local; }    // bad
Fortunately, a compiler can easily warn about returning references to local variables.
A void function cannot return a value. However, a call of a void function doesn’t yield a value,
so a void function can use a call of a void function as the expression in a return statement. For
example:
    void g(int* p);
    void h(int* p) { /* ... */ return g(p); }    // ok: return of "no value"
This form of return is important when writing template functions where the return type is a template parameter (see §18.4.4.2).
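A minimal sketch of why this matters, assuming a hypothetical helper apply() that is not part of the text: the helper forwards its argument to a function and returns whatever that function yields. Because a void call may appear in a return statement, one definition serves both value-returning and void functions.

```cpp
// Hypothetical helper (not from the text): call f(a) and hand its result,
// if any, back to the caller.
template<class R, class A>
R apply(R (*f)(A), A a)
{
    return f(a);    // legal even when R is void: f(a) then yields no value
}

int twice(int n) { return 2*n; }    // here R is deduced as int
void bump(int* p) { ++*p; }         // here R is deduced as void
```

Without the rule, apply() would need a separate version for functions returning void.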
7.4 Overloaded Function Names [fct.over]
Most often, it is a good idea to give different functions different names, but when some functions
conceptually perform the same task on objects of different types, it can be more convenient to give
them the same name. Using the same name for operations on different types is called overloading.
The technique is already used for the basic operations in C++. That is, there is only one name for
addition, +, yet it can be used to add values of integer, floating-point, and pointer types. This idea
is easily extended to functions defined by the programmer. For example:
    void print(int);            // print an int
    void print(const char*);    // print a C-style character string
As far as the compiler is concerned, the only thing functions of the same name have in common is
that name. Presumably, the functions are in some sense similar, but the language does not constrain or aid the programmer. Thus overloaded function names are primarily a notational convenience. This convenience is significant for functions with conventional names such as sqrt, print,
and open. When a name is semantically significant, this convenience becomes essential. This happens, for example, with operators such as +, *, and <<, in the case of constructors (§11.7), and in
generic programming (§2.7.2, Chapter 18). When a function f is called, the compiler must figure
out which of the functions with the name f is to be invoked. This is done by comparing the types of
the actual arguments with the types of the formal arguments of all functions called f. The idea is to
invoke the function that is the best match on the arguments and give a compile-time error if no
function is the best match. For example:
    void print(double);
    void print(long);

    void f()
    {
        print(1L);     // print(long)
        print(1.0);    // print(double)
        print(1);      // error, ambiguous: print(long(1)) or print(double(1))?
    }
Finding the right version to call from a set of overloaded functions is done by looking for a best
match between the type of the argument expression and the parameters (formal arguments) of the
functions. To approximate our notions of what is reasonable, a series of criteria are tried in order:
[1] Exact match; that is, match using no or only trivial conversions (for example, array name to
pointer, function name to pointer to function, and T to const T)
[2] Match using promotions; that is, integral promotions (bool to int, char to int, short to int,
and their unsigned counterparts; §C.6.1) and float to double
[3] Match using standard conversions (for example, int to double, double to int, double to
long double, Derived* to Base* (§12.2), T* to void* (§5.6), int to unsigned int; §C.6)
[4] Match using user-defined conversions (§11.4)
[5] Match using the ellipsis ... in a function declaration (§7.6)
If two matches are found at the highest level where a match is found, the call is rejected as ambiguous. The resolution rules are this elaborate primarily to take into account the elaborate C and C++
rules for built-in numeric types (§C.6). For example:
    void print(int);
    void print(const char*);
    void print(double);
    void print(long);
    void print(char);

    void h(char c, int i, short s, float f)
    {
        print(c);      // exact match: invoke print(char)
        print(i);      // exact match: invoke print(int)
        print(s);      // integral promotion: invoke print(int)
        print(f);      // float to double promotion: print(double)

        print('a');    // exact match: invoke print(char)
        print(49);     // exact match: invoke print(int)
        print(0);      // exact match: invoke print(int)
        print("a");    // exact match: invoke print(const char*)
    }
The call print(0) invokes print(int) because 0 is an int. The call print('a') invokes
print(char) because 'a' is a char (§4.3.1). The reason to distinguish between conversions and
promotions is that we want to prefer safe promotions, such as char to int, over unsafe conversions,
such as int to char.
The overloading resolution is independent of the order of declaration of the functions considered.
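The matching levels can be observed directly. The following is a small sketch, assuming a hypothetical overload set named which() whose members report which version was selected; the name and the string results are illustrative only.

```cpp
#include <string>

// Hypothetical overload set: each function reports which version was chosen.
std::string which(char)        { return "char"; }
std::string which(int)         { return "int"; }
std::string which(double)      { return "double"; }
std::string which(const char*) { return "const char*"; }
```

With these declarations, which('a') and which("a") are exact matches, whereas a short or bool argument reaches which(int) and a float argument reaches which(double) by promotion. A call such as which(1L) would be rejected as ambiguous, because long converts equally well to char, int, and double.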
Overloading relies on a relatively complicated set of rules, and occasionally a programmer will
be surprised which function is called. So, why bother? Consider the alternative to overloading.
Often, we need similar operations performed on objects of several types. Without overloading, we
must define several functions with different names:
    void print_int(int);
    void print_char(char);
    void print_string(const char*);    // C-style string

    void g(int i, char c, const char* p, double d)
    {
        print_int(i);       // ok
        print_char(c);      // ok
        print_string(p);    // ok

        print_int(c);       // ok? calls print_int(int(c))
        print_char(i);      // ok? calls print_char(char(i))
        print_string(i);    // error
        print_int(d);       // ok? calls print_int(int(d))
    }
Compared to the overloaded print(), we have to remember several names and remember to use
those correctly. This can be tedious, defeats attempts to do generic programming (§2.7.2), and generally encourages the programmer to focus on relatively low-level type issues. Because there is no
overloading, all standard conversions apply to arguments to these functions. It can also lead to
errors. In the previous example, this implies that only one of the four calls with a ‘‘wrong’’ argument is caught by the compiler. Thus, overloading can increase the chances that an unsuitable
argument will be rejected by the compiler.
7.4.1 Overloading and Return Type [fct.return]
Return types are not considered in overload resolution. The reason is to keep resolution for an individual operator (§11.2.1, §11.2.4) or function call context-independent. Consider:
    float sqrt(float);
    double sqrt(double);

    void f(double da, float fla)
    {
        float fl = sqrt(da);    // call sqrt(double)
        double d = sqrt(da);    // call sqrt(double)
        fl = sqrt(fla);         // call sqrt(float)
        d = sqrt(fla);          // call sqrt(float)
    }
If the return type were taken into account, it would no longer be possible to look at a call of sqrt()
in isolation and determine which function was called.
7.4.2 Overloading and Scopes [fct.scope]
Functions declared in different non-namespace scopes do not overload. For example:
    void f(int);

    void g()
    {
        void f(double);
        f(1);    // call f(double)
    }
Clearly, f(int) would have been the best match for f(1), but only f(double) is in scope. In such
cases, local declarations can be added or subtracted to get the desired behavior. As always, intentional hiding can be a useful technique, but unintentional hiding is a source of surprises. When
overloading across class scopes (§15.2.2) or namespace scopes (§8.2.9.2) is wanted, using-declarations or using-directives can be used (§8.2.2). See also §8.2.6 and §8.2.9.2.
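The effect of hiding can be made visible with a sketch along these lines. The function pick() and the two callers are hypothetical names chosen for illustration; each version of pick() reports which one ran.

```cpp
#include <string>

std::string pick(int)    { return "pick(int)"; }
std::string pick(double) { return "pick(double)"; }

std::string with_hiding()
{
    std::string pick(double);    // local declaration: hides both outer declarations of pick
    return pick(1);              // only pick(double) is visible, so 1 is converted to double
}

std::string without_hiding()
{
    return pick(1);              // both versions are visible: pick(int) is the exact match
}
```

The two callers contain the same call, pick(1), yet reach different functions; that difference is exactly the kind of surprise unintentional hiding can cause.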
7.4.3 Manual Ambiguity Resolution [fct.man.ambig]
Declaring too few (or too many) overloaded versions of a function can lead to ambiguities. For
example:
    void f1(char);
    void f1(long);

    void f2(char*);
    void f2(int*);

    void k(int i)
    {
        f1(i);    // ambiguous: f1(char) or f1(long)
        f2(0);    // ambiguous: f2(char*) or f2(int*)
    }
Where possible, the thing to do in such cases is to consider the set of overloaded versions of a function as a whole and see if it makes sense according to the semantics of the function. Often the
problem can be solved by adding a version that resolves ambiguities. For example, adding
    inline void f1(int n) { f1(long(n)); }
would resolve all ambiguities similar to f1(i) in favor of the larger type long int.
One can also add an explicit type conversion to resolve a specific call. For example:
    f2(static_cast<int*>(0));
However, this is most often simply an ugly stopgap. Soon another similar call will be made and
have to be dealt with.
Some C++ novices get irritated by the ambiguity errors reported by the compiler. More experienced programmers appreciate these error messages as useful indicators of design errors.
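Both remedies can be tried out in a sketch like the following. The declarations mirror f1() and f2() from this section, except that, as an assumption made for testing, each function returns a string naming the version that ran.

```cpp
#include <string>

std::string f1(char) { return "f1(char)"; }
std::string f1(long) { return "f1(long)"; }
inline std::string f1(int n) { return f1(long(n)); }    // the added version: int goes to long

std::string f2(char*) { return "f2(char*)"; }
std::string f2(int*)  { return "f2(int*)"; }
```

With the added f1(int), a call f1(7) is an exact match for the new function, which forwards to f1(long). For f2, no version was added, so each call with a null pointer still needs an explicit conversion such as f2(static_cast<int*>(0)).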
7.4.4 Resolution for Multiple Arguments [fct.fct.res]
Given the overload resolution rules, one can ensure that the simplest algorithm (function) will be
used when the efficiency or precision of computations differs significantly for the types involved.
For example:
    int pow(int, int);
    double pow(double, double);

    complex pow(double, complex);
    complex pow(complex, int);
    complex pow(complex, double);
    complex pow(complex, complex);

    void k(complex z)
    {
        int i = pow(2,2);            // invoke pow(int,int)
        double d = pow(2.0,2.0);     // invoke pow(double,double)
        complex z2 = pow(2,z);       // invoke pow(double,complex)
        complex z3 = pow(z,2);       // invoke pow(complex,int)
        complex z4 = pow(z,z);       // invoke pow(complex,complex)
    }
In the process of choosing among overloaded functions with two or more arguments, a best match
is found for each argument using the rules from §7.4. A function that is the best match for one
argument and a better than or equal match for all other arguments is called. If no such function
exists, the call is rejected as ambiguous. For example:
    void g()
    {
        double d = pow(2.0,2);    // error: pow(int(2.0),2) or pow(2.0,double(2))?
    }
The call is ambiguous because 2.0 is the best match for the first argument of
pow(double,double) and 2 is the best match for the second argument of pow(int,int).
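One way out of such an ambiguity is to convert one argument explicitly, as in this sketch. The functions named power() are hypothetical stand-ins for the two arithmetic versions of pow(); they report which version ran rather than computing anything.

```cpp
#include <string>

// Hypothetical stand-ins for pow(int,int) and pow(double,double).
std::string power(int, int)       { return "power(int,int)"; }
std::string power(double, double) { return "power(double,double)"; }

// power(x,n) with a double and an int would be ambiguous;
// converting one argument explicitly selects a version.
std::string via_double(double x, int n) { return power(x, double(n)); }
std::string via_int(double x, int n)    { return power(int(x), n); }
```

Which conversion is appropriate depends on the computation wanted; the compiler cannot make that choice, which is why it reports the ambiguity.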
7.5 Default Arguments [fct.defarg]
A general function often needs more arguments than are necessary to handle simple cases. In particular, functions that construct objects (§10.2.3) often provide several options for flexibility. Consider a function for printing an integer. Giving the user an option of what base to print it in seems
reasonable, but in most programs integers will be printed as decimal integer values. For example:
    void print(int value, int base =10);    // default base is 10

    void f()
    {
        print(31);
        print(31,10);
        print(31,16);
        print(31,2);
    }
might produce this output:
    31 31 1f 11111
The effect of a default argument can alternatively be achieved by overloading:
    void print(int value, int base);
    inline void print(int value) { print(value,10); }
However, overloading makes it less obvious to the reader that the intent is to have a single print
function plus a shorthand.
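As a concrete sketch of a function with a default base, consider the following. The name to_text() and its restriction to non-negative values and bases from 2 to 16 are assumptions made to keep the example short; it returns the digits as a string rather than printing them.

```cpp
#include <string>

// Hypothetical helper: the digits of value in the given base.
// Assumes value >= 0 and 2 <= base <= 16.
std::string to_text(int value, int base = 10)
{
    const char digits[] = "0123456789abcdef";
    std::string s;
    do {
        s.insert(s.begin(), digits[value % base]);    // prepend the lowest digit
        value /= base;
    } while (value > 0);
    return s;
}
```

Here to_text(31) and to_text(31,10) mean the same thing; the second argument need be written only when a base other than 10 is wanted, as in to_text(31,16).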
A default argument is type checked at the time of the function declaration and evaluated at the
time of the call. Default arguments may be provided for trailing arguments only. For example:
    int f(int, int =0, char* =0);    // ok
    int g(int =0, int =0, char*);    // error
    int h(int =0, int, char* =0);    // error
Note that the space between the * and the = is significant (*= is an assignment operator; §6.2):
    int nasty(char*=0);    // syntax error
A default argument cannot be repeated or changed in a subsequent declaration in the same scope.
For example:
    void f(int x = 7);
    void f(int = 7);          // error: cannot repeat a default argument
    void f(int = 8);          // error: different default arguments

    void g()
    {
        void f(int x = 9);    // ok: this declaration hides the outer one
        // ...
    }
Declaring a name in a nested scope so that the name hides a declaration of the same name in an
outer scope is error prone.
7.6 Unspecified Number of Arguments [fct.stdarg]
For some functions, it is not possible to specify the number and type of all arguments expected in a
call. Such a function is declared by terminating the list of argument declarations with the ellipsis
(...), which means ‘‘and maybe some more arguments.’’ For example:
    int printf(const char* ...);
This specifies that a call of the C standard library function printf() (§21.8) must have at least one
argument, a char*, but may or may not have others. For example:
    printf("Hello, world!\n");
    printf("My name is %s %s\n", first_name, second_name);
    printf("%d + %d = %d\n",2,3,5);
Such a function must rely on information not available to the compiler when interpreting its argument list. In the case of printf(), the first argument is a format string containing special character
sequences that allow printf() to handle other arguments correctly; %s means ‘‘expect a char*
argument’’ and %d means ‘‘expect an int argument.’’ However, the compiler cannot in general
know that, so it cannot ensure that the expected arguments are really there or that an argument is of
the proper type. For example,
    #include <stdio.h>

    int main()
    {
        printf("My name is %s %s\n",2);
    }
will compile and (at best) cause some strange-looking output (try it!).
Clearly, if an argument has not been declared, the compiler does not have the information
needed to perform the standard type checking and type conversion for it. In that case, a char or a
short is passed as an int and a float is passed as a double. This is not necessarily what the programmer expects.
A well-designed program needs at most a few functions for which the argument types are not
completely specified. Overloaded functions and functions using default arguments can be used to
take care of type checking in most cases when one would otherwise consider leaving argument
types unspecified. Only when both the number of arguments and the type of arguments vary is the
ellipsis necessary. The most common use of the ellipsis is to specify an interface to C library functions that were defined before C++ provided alternatives:
    int fprintf(FILE*, const char* ...);    // from <cstdio>
    int execl(const char* ...);             // from UNIX header
A standard set of macros for accessing the unspecified arguments in such functions can be found in
<cstdarg>. Consider writing an error function that takes one integer argument indicating the
severity of the error followed by an arbitrary number of strings. The idea is to compose the error
message by passing each word as a separate string argument. The list of string arguments should
be terminated by a null pointer to char:
    extern void error(int ...);
    extern char* itoa(int, char[]);    // see §6.6[17]

    const char* Null_cp = 0;

    int main(int argc, char* argv[])
    {
        switch (argc) {
        case 1:
            error(0,argv[0],Null_cp);
            break;
        case 2:
            error(0,argv[0],argv[1],Null_cp);
            break;
        default:
            char buffer[8];
            error(1,argv[0], "with",itoa(argc-1,buffer),"arguments", Null_cp);
        }
        // ...
    }
The function itoa() returns the character string representing its integer argument.
Note that using the integer 0 as the terminator would not have been portable: on some implementations, the integer zero and the null pointer do not have the same representation. This illustrates the subtleties and extra work that face the programmer once type checking has been suppressed using the ellipsis.
The error function could be defined like this:
    void error(int severity ...)    // "severity" followed by a zero-terminated list of char*s
    {
        va_list ap;
        va_start(ap,severity);    // arg startup

        for (;;) {
            char* p = va_arg(ap,char*);
            if (p == 0) break;
            cerr << p << ' ';
        }

        va_end(ap);    // arg cleanup

        cerr << '\n';
        if (severity) exit(severity);
    }
First, a va_list is defined and initialized by a call of va_start(). The macro va_start takes the
name of the va_list and the name of the last formal argument as arguments. The macro va_arg()
is used to pick the unnamed arguments in order. In each call, the programmer must supply a type;
va_arg() assumes that an actual argument of that type has been passed, but it typically has no way
of ensuring that. Before returning from a function in which va_start() has been used, va_end()
must be called. The reason is that va_start() may modify the stack in such a way that a return
cannot successfully be done; va_end() undoes any such modifications.
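The same sequence of va_start(), va_arg(), and va_end() can be exercised in a variant that is easier to test. The function compose() below is a hypothetical relative of error(): as an assumption for illustration, it returns the message as a string instead of writing to cerr and never calls exit().

```cpp
#include <cstdarg>
#include <string>

// Hypothetical variant of error(): "severity" followed by a list of
// const char* arguments terminated by a null pointer to char.
std::string compose(int severity, ...)
{
    std::string msg = severity ? "error:" : "warning:";

    va_list ap;
    va_start(ap, severity);    // arg startup

    for (;;) {
        const char* p = va_arg(ap, const char*);    // we must name the type ourselves
        if (p == 0) break;                          // terminator reached
        msg += ' ';
        msg += p;
    }

    va_end(ap);    // arg cleanup
    return msg;
}
```

As with error(), the caller must terminate the list with a null pointer of type const char*; passing the integer 0 instead is not portable.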
7.7 Pointer to Function [fct.pf]
There are only two things one can do to a function: call it and take its address. The pointer
obtained by taking the address of a function can then be used to call the function. For example:
    void error(string s) { /* ... */ }

    void (*efct)(string);    // pointer to function

    void f()
    {
        efct = &error;     // efct points to error
        efct("error");     // call error through efct
    }
The compiler will discover that efct is a pointer and call the function pointed to. That is, dereferencing of a pointer to function using * is optional. Similarly, using & to get the address of a function is optional:
    void (*f1)(string) = &error;    // ok
    void (*f2)(string) = error;     // also ok; same meaning as &error

    void g()
    {
        f1("Vasa");            // ok
        (*f1)("Mary Rose");    // also ok
    }
Pointers to functions have argument types declared just like the functions themselves. In pointer
assignments, the complete function type must match exactly. For example:
    void (*pf)(string);    // pointer to void(string)
    void f1(string);       // void(string)
    int f2(string);        // int(string)
    void f3(int*);         // void(int*)

    void f()
    {
        pf = &f1;      // ok
        pf = &f2;      // error: bad return type
        pf = &f3;      // error: bad argument type

        pf("Hera");    // ok
        pf(1);         // error: bad argument type

        int i = pf("Zeus");    // error: void assigned to int
    }
The rules for argument passing are the same for calls directly to a function and for calls to a function through a pointer.
It is often convenient to define a name for a pointer-to-function type to avoid using the somewhat nonobvious declaration syntax all the time. Here is an example from a UNIX system header:
    typedef void (*SIG_TYP)(int);    // from <signal.h>
    typedef void (*SIG_ARG_TYP)(int);
    SIG_TYP signal(int, SIG_ARG_TYP);
An array of pointers to functions is often useful. For example, the menu system for my mouse-based editor is implemented using arrays of pointers to functions to represent operations. The system cannot be described in detail here, but this is the general idea:
    typedef void (*PF)();

    PF edit_ops[] = {    // edit operations
        &cut, &paste, &copy, &search
    };

    PF file_ops[] = {    // file management
        &open, &append, &close, &write
    };
We can then define and initialize the pointers that control actions selected from a menu associated
with the mouse buttons:
    PF* button2 = edit_ops;
    PF* button3 = file_ops;
In a complete implementation, more information is needed to define each menu item. For example,
a string specifying the text to be displayed must be stored somewhere. As the system is used, the
meaning of mouse buttons changes frequently with the context. Such changes are performed
(partly) by changing the value of the button pointers. When a user selects a menu item, such as
item 3 for button 2, the associated operation is executed:
    button2[2]();    // call button2's 3rd function
One way to gain appreciation of the expressive power of pointers to functions is to try to write such
code without them – and without using their better-behaved cousins, the virtual functions
(§12.2.6). A menu can be modified at run-time by inserting new functions into the operator table.
It is also easy to construct new menus at run-time.
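A reduced version of the menu scheme might look like this. The operations are hypothetical stubs, named do_cut() and so on, that merely record their name in a variable last_op, an assumption made so that the effect of a selection can be observed.

```cpp
#include <string>

std::string last_op;    // hypothetical: records which operation ran last

void do_cut()    { last_op = "cut"; }
void do_paste()  { last_op = "paste"; }
void do_copy()   { last_op = "copy"; }
void do_search() { last_op = "search"; }
void do_open()   { last_op = "open"; }
void do_close()  { last_op = "close"; }

typedef void (*PF)();

PF edit_ops[] = { &do_cut, &do_paste, &do_copy, &do_search };    // edit operations
PF file_ops[] = { &do_open, &do_close };                         // file management

PF* button2 = edit_ops;    // the menu currently bound to button 2
```

Selecting item 3 is then button2[2](), which runs do_copy(). Rebinding the button is a single pointer assignment, button2 = file_ops, after which button2[0]() runs do_open().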
Pointers to functions can be used to provide a simple form of polymorphic routines, that is, routines that can be applied to objects of many different types:
    typedef int (*CFT)(const void*, const void*);

    void ssort(void* base, size_t n, size_t sz, CFT cmp)
    /*
        Sort the "n" elements of vector "base" into increasing order
        using the comparison function pointed to by "cmp".
        The elements are of size "sz".

        Shell sort (Knuth, Vol3, pg84)
    */
    {
        for (int gap=n/2; 0<gap; gap/=2)
            for (int i=gap; i<n; i++)
                for (int j=i-gap; 0<=j; j-=gap) {
                    char* b = static_cast<char*>(base);    // necessary cast
                    char* pj = b+j*sz;                     // &base[j]
                    char* pjg = b+(j+gap)*sz;              // &base[j+gap]

                    if (cmp(pjg,pj)<0) {                   // base[j+gap] < base[j]: swap base[j] and base[j+gap]
                        for (int k=0; k<sz; k++) {
                            char temp = pj[k];
                            pj[k] = pjg[k];
                            pjg[k] = temp;
                        }
                    }
                }
    }
The ssort() routine does not know the type of the objects it sorts, only the number of elements (the
array size), the size of each element, and the function to call to perform a comparison. The type of
ssort() was chosen to be the same as the type of the standard C library sort routine, qsort(). Real
programs use qsort(), the C++ standard library algorithm sort (§18.7.1), or a specialized sort routine. This style of code is common in C, but it is not the most elegant way of expressing this algorithm in C++ (see §13.3, §13.5.2).
Such a sort function could be used to sort a table such as this:
    struct User {
        const char* name;
        const char* id;
        int dept;
    };

    User heads[] = {
        "Ritchie D.M",       "dmr",     11271,
        "Sethi R.",          "ravi",    11272,
        "Szymanski T.G.",    "tgs",     11273,
        "Schryer N.L.",      "nls",     11274,
        "Schryer N.L.",      "nls",     11275,
        "Kernighan B.W.",    "bwk",     11276
    };

    void print_id(User* v, int n)
    {
        for (int i=0; i<n; i++)
            cout << v[i].name << '\t' << v[i].id << '\t' << v[i].dept << '\n';
    }
To be able to sort, we must first define appropriate comparison functions. A comparison function
must return a negative value if its first argument is less than the second, zero if the arguments are
equal, and a positive number otherwise:
    int cmp1(const void* p, const void* q)    // Compare name strings
    {
        return strcmp(static_cast<const User*>(p)->name,static_cast<const User*>(q)->name);
    }

    int cmp2(const void* p, const void* q)    // Compare dept numbers
    {
        return static_cast<const User*>(p)->dept - static_cast<const User*>(q)->dept;
    }
This program sorts and prints:
    int main()
    {
        cout << "Heads in alphabetical order:\n";
        ssort(heads,6,sizeof(User),cmp1);
        print_id(heads,6);
        cout << "\n";

        cout << "Heads in order of department number:\n";
        ssort(heads,6,sizeof(User),cmp2);
        print_id(heads,6);
    }
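The same comparison-function style works unchanged with the standard library's qsort(). The sketch below assumes a hypothetical record type Person, a cut-down User, and a small table staff; the department comparison uses relational tests rather than subtraction, which avoids overflow for widely separated values.

```cpp
#include <cstdlib>
#include <cstring>

struct Person {    // hypothetical record, modeled on User
    const char* name;
    int dept;
};

int by_name(const void* p, const void* q)    // compare name strings
{
    return std::strcmp(static_cast<const Person*>(p)->name,
                       static_cast<const Person*>(q)->name);
}

int by_dept(const void* p, const void* q)    // compare dept numbers
{
    int a = static_cast<const Person*>(p)->dept;
    int b = static_cast<const Person*>(q)->dept;
    return (a>b) - (a<b);    // negative, zero, or positive
}

Person staff[] = {
    { "Sethi R.", 11272 }, { "Ritchie D.M", 11271 }, { "Kernighan B.W.", 11276 }
};

// Sort the staff table using the comparison function pointed to by cmp.
void sort_staff(int (*cmp)(const void*, const void*))
{
    std::qsort(staff, sizeof(staff)/sizeof(staff[0]), sizeof(Person), cmp);
}
```

Calling sort_staff(by_name) puts Kernighan first; calling sort_staff(by_dept) puts department 11271 first.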
You can take the address of an overloaded function by assigning to or initializing a pointer to function. In that case, the type of the target is used to select from the set of overloaded functions. For
example:
    void f(int);
    int f(char);

    void (*pf1)(int) = &f;     // void f(int)
    int (*pf2)(char) = &f;     // int f(char)
    void (*pf3)(char) = &f;    // error: no void f(char)
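That the pointer's type makes the selection can be checked with a sketch such as this one. The overloaded function kind() and the two pointers are hypothetical names; each version of kind() reports its own parameter type.

```cpp
#include <string>

std::string kind(int)  { return "int"; }
std::string kind(char) { return "char"; }

std::string (*as_int)(int)   = &kind;    // the target type selects kind(int)
std::string (*as_char)(char) = &kind;    // the target type selects kind(char)
```

Once a pointer has been initialized, no further overload resolution takes place: as_int('a') converts the char to int and calls kind(int), even though kind(char) would be the better match for a direct call.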
A function must be called through a pointer to function with exactly the right argument and return
types. There is no implicit conversion of argument or return types when pointers to functions are
assigned or initialized. This means that
    int cmp3(const mytype*,const mytype*);
is not a suitable argument for ssort(). The reason is that accepting cmp3 as an argument to
ssort() would violate the guarantee that cmp3 will be called with arguments of type mytype* (see
also §9.2.5).
7.8 Macros [fct.macro]
Macros are very important in C but have far fewer uses in C++. The first rule about macros is:
Don’t use them unless you have to. Almost every macro demonstrates a flaw in the programming
language, in the program, or in the programmer. Because they rearrange the program text before
the compiler proper sees it, macros are also a major problem for many programming tools. So
when you use macros, you should expect inferior service from tools such as debuggers, cross-reference tools, and profilers. If you must use macros, please read the reference manual for your
own implementation of the C++ preprocessor carefully and try not to be too clever. Also to warn
readers, follow the convention to name macros using lots of capital letters. The syntax of macros is
presented in §A.11.
A simple macro is defined like this:
    #define NAME rest of line
Where NAME is encountered as a token, it is replaced by rest of line. For example,
    named = NAME
will expand into
    named = rest of line
A macro can also be defined to take arguments. For example:
    #define MAC(x,y) argument1: x argument2: y
When MAC is used, two argument strings must be presented. They will replace x and y when
MAC() is expanded. For example,
    expanded = MAC(foo bar, yuk yuk)
will be expanded into
    expanded = argument1: foo bar argument2: yuk yuk
Macro names cannot be overloaded, and the macro preprocessor cannot handle recursive calls:
    #define PRINT(a,b) cout<<(a)<<(b)
    #define PRINT(a,b,c) cout<<(a)<<(b)<<(c)    /* trouble?: redefines, does not overload */

    #define FAC(n) (n>1)?n*FAC(n-1):1           /* trouble: recursive macro */
Macros manipulate character strings and know little about C++ syntax and nothing about C++ types
or scope rules. Only the expanded form of a macro is seen by the compiler, so an error in a macro
will be reported when the macro is expanded, not when it is defined. This leads to very obscure
error messages.
Here are some plausible macros:
    #define CASE break;case
    #define FOREVER for(;;)
Here are some completely unnecessary macros:
    #define PI 3.141593
    #define BEGIN {
    #define END }
Here are some dangerous macros:
    #define SQUARE(a) a*a
    #define INCR_xx (xx)++
To see why they are dangerous, try expanding this:
    int xx = 0;    // global counter

    void f()
    {
        int xx = 0;              // local variable
        int y = SQUARE(xx+2);    // y=xx+2*xx+2; that is y=xx+(2*xx)+2
        INCR_xx;                 // increments local xx
    }
If you must use a macro, use the scope resolution operator :: when referring to global names
(§4.9.4) and enclose occurrences of a macro argument name in parentheses whenever possible. For
example:
    #define MIN(a,b) (((a)<(b))?(a):(b))
If you must write macros complicated enough to require comments, it is wise to use /* */ comments because C preprocessors that do not know about // comments are sometimes used as part of
C++ tools. For example:
    #define M2(a) something(a)    /* thoughtful comment */
Using macros, you can design your own private language. Even if you prefer this ‘‘enhanced language’’ to plain C++, it will be incomprehensible to most C++ programmers. Furthermore, the C
preprocessor is a very simple macro processor. When you try to do something nontrivial, you are
likely to find it either impossible or unnecessarily hard to do. The const, inline, template, and
namespace mechanisms are intended as alternatives to many traditional uses of preprocessor constructs. For example:
    const int answer = 42;
    template<class T> inline T min(T a, T b) { return (a<b)?a:b; }
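One practical difference between the MIN macro and the template is how often an argument is evaluated. The sketch below assumes nothing beyond the definitions already shown; min_of is a hypothetical name chosen so that the example cannot clash with the standard library's min, and the two counting functions exist only to make the effect observable.

```cpp
#include <cassert>

#define MIN(a,b) (((a)<(b))?(a):(b))    /* the macro from the text */

// min_of is a hypothetical name, chosen to avoid clashing with std::min.
template<class T> inline T min_of(T a, T b) { return (a<b)?a:b; }

const int answer = 42;                  // a typed, scoped constant instead of a #define

int increments_with_macro()
{
    int i = 0;
    int m = MIN(i++, 10);       // expands to (((i++)<(10))?(i++):(10)): i++ runs twice
    (void)m;
    return i;
}

int increments_with_template()
{
    int i = 0;
    int m = min_of(i++, 10);    // the argument expression is evaluated exactly once
    (void)m;
    return i;
}

void check_min()
{
    assert(increments_with_macro() == 2);
    assert(increments_with_template() == 1);
    assert(min_of(answer, 7) == 7);
}
```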
When writing a macro, it is not unusual to need a new name for something. A string can be created
by concatenating two strings using the ## macro operator. For example,
    #define NAME2(a,b) a##b
    int NAME2(hack,cah)();
will produce
    int hackcah();
for the compiler to read.
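As a small check that the concatenated name really is an ordinary identifier afterwards, here is a sketch that defines hackcah() through NAME2 and then calls it by its plain name. The function body and the names counter_a and counter_b are assumptions made for the illustration.

```cpp
#include <cassert>

#define NAME2(a,b) a##b

int NAME2(hack,cah)() { return 42; }    // after preprocessing: int hackcah() { return 42; }

// Hypothetical use: build two related names from a common prefix.
int NAME2(counter,_a) = 1;              // int counter_a = 1;
int NAME2(counter,_b) = 2;              // int counter_b = 2;

void check_name2()
{
    assert(hackcah() == 42);            // the pasted name is used like any other
    assert(counter_a + counter_b == 3);
}
```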
The directive
    #undef X
ensures that no macro called X is defined – whether or not one was before the directive. This
affords some protection against undesired macros. However, it is not always easy to know what the
effects of X on a piece of code were supposed to be.
7.8.1 Conditional Compilation [fct.cond]
One use of macros is almost impossible to avoid. The directive #ifdef identifier conditionally
causes all input to be ignored until a #endif directive is seen. For example,
    int f(int a
    #ifdef arg_two
    ,int b
    #endif
    );
produces
    int f(int a
    );
for the compiler to see unless a macro called arg_two has been #defined. This example confuses
tools that assume sane behavior from the programmer.
Most uses of #ifdef are less bizarre, and when used with restraint, #ifdef does little harm. See
also §9.3.3.
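A more restrained use might look like the following sketch, which also shows the effect of #undef. The macro name CALC_TRACE_ENABLED and both functions are hypothetical; the name is deliberately long and in capitals so that it is unlikely to clash with an ordinary identifier.

```cpp
#include <cassert>

#define CALC_TRACE_ENABLED          /* hypothetical feature macro; no value is needed */

bool trace_compiled_in()
{
#ifdef CALC_TRACE_ENABLED
    return true;                    // this branch is seen by the compiler
#else
    return false;
#endif
}

#undef CALC_TRACE_ENABLED           /* from here on, no macro of that name is defined */

bool trace_after_undef()
{
#ifdef CALC_TRACE_ENABLED
    return true;
#else
    return false;                   // now this branch is seen instead
#endif
}

void check_ifdef()
{
    assert(trace_compiled_in());
    assert(!trace_after_undef());
}
```

Each #ifdef encloses whole statements, so the text remains syntactically sensible whichever branch is selected.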
Names of the macros used to control #ifdef should be chosen carefully so that they don’t clash
with ordinary identifiers. For example:
    struct Call_info {
        Node* arg_one;
        Node* arg_two;
        // ...
    };
This innocent-looking source text will cause some confusion should someone write:
    #define arg_two x
Unfortunately, common and unavoidable headers contain many dangerous and unnecessary macros.
7.9 Advice [dcl.advice]
[1] Be suspicious of non-const reference arguments; if you want the function to modify its arguments, use pointers and value return instead; §5.5.
[2] Use const reference arguments when you need to minimize copying of arguments; §5.5.
[3] Use const extensively and consistently; §7.2.
[4] Avoid macros; §7.8.
[5] Avoid unspecified numbers of arguments; §7.6.
[6] Don’t return pointers or references to local variables; §7.3.
[7] Use overloading when functions perform conceptually the same task on different types; §7.4.
[8] When overloading on integers, provide functions to eliminate common ambiguities; §7.4.3.
[9] When considering the use of a pointer to function, consider whether a virtual function
(§2.5.5) or a template (§2.7.2) would be a better alternative; §7.7.
[10] If you must use macros, use ugly names with lots of capital letters; §7.8.
7.10 Exercises [fct.exercises]
1. (∗1) Write declarations for the following: a function taking arguments of type pointer to character and reference to integer and returning no value; a pointer to such a function; a function taking such a pointer as an argument; and a function returning such a pointer. Write the definition
of a function that takes such a pointer as an argument and returns its argument as the return
value. Hint: Use typedef.
2. (∗1) What does the following mean? What would it be good for?
    typedef int (&rifii) (int, int);
3. (∗1.5) Write a program like ‘‘Hello, world!’’ that takes a name as a command-line argument
and writes ‘‘Hello, name !’’. Modify this program to take any number of names as arguments
and to say hello to each.
4. (∗1.5) Write a program that reads an arbitrary number of files whose names are given as
command-line arguments and writes them one after another on cout. Because this program
concatenates its arguments to produce its output, you might call it cat.
5. (∗2) Convert a small C program to C++. Modify the header files to declare all functions called
and to declare the type of every argument. Where possible, replace #defines with enum, const,
or inline. Remove extern declarations from .c files and if necessary convert all function definitions to C++ function definition syntax. Replace calls of malloc() and free() with new and
delete. Remove unnecessary casts.
6. (∗2) Implement ssort() (§7.7) using a more efficient sorting algorithm. Hint: qsort().
7. (∗2.5) Consider:
    struct Tnode {
        string word;
        int count;
        Tnode* left;
        Tnode* right;
    };
Write a function for entering new words into a tree of Tnodes. Write a function to write out a
tree of Tnodes. Write a function to write out a tree of Tnodes with the words in alphabetical
order. Modify Tnode so that it stores (only) a pointer to an arbitrarily long word stored as an
array of characters on free store using new. Modify the functions to use the new definition of
Tnode.
8. (∗2.5) Write a function to invert a two-dimensional array. Hint: §C.7.
9. (∗2) Write an encryption program that reads from cin and writes the encoded characters to cout.
You might use this simple encryption scheme: the encrypted form of a character c is c^key[i],
where key is a string passed as a command-line argument. The program uses the characters in
key in a cyclic manner until all the input has been read. Re-encrypting encoded text with the
same key produces the original text. If no key (or a null string) is passed, then no encryption is
done.
10. (∗3.5) Write a program to help decipher messages encrypted with the method described in
§7.10[9] without knowing the key. Hint: See David Kahn: The Codebreakers, Macmillan,
1967, New York, pp. 207-213.
11. (∗3) Write an error function that takes a printf-style format string containing %s, %c, and %d
directives and an arbitrary number of arguments. Don’t use printf(). Look at §21.8 if you
don’t know the meaning of %s, %c, and %d. Use <cstdarg>.
12. (∗1) How would you choose names for pointer to function types defined using typedef?
13. (∗2) Look at some programs to get an idea of the diversity of styles of names actually used.
How are uppercase letters used? How is the underscore used? When are short names such as i
and x used?
14. (∗1) What is wrong with these macro definitions?
    #define PI = 3.141593;
    #define MAX(a,b) a>b?a:b
    #define fac(a) (a)*fac((a)-1)
15. (∗3) Write a macro processor that defines and expands simple macros (like the C preprocessor
does). Read from cin and write to cout. At first, don’t try to handle macros with arguments.
Hint: The desk calculator (§6.1) contains a symbol table and a lexical analyzer that you could
modify.
16. (∗2) Implement print() from §7.5.
17. (∗2) Add functions such as sqrt(), log(), and sin() to the desk calculator from §6.1. Hint:
Predefine the names and call the functions through an array of pointers to functions. Don’t forget to check the arguments in a function call.
18. (∗1) Write a factorial function that does not use recursion. See also §11.14[6].
19. (∗2) Write functions to add one day, one month, and one year to a Date as defined in §5.9[13].
Write a function that gives the day of the week for a given Date. Write a function that gives the
Date of the first Monday following a given Date.
8
Namespaces and Exceptions
The year is 787!
A.D.?
– Monty Python
No rule is so general,
which admits not some exception.
– Robert Burton
Modularity, interfaces, and exceptions – namespaces – using – using namespace –
avoiding name clashes – name lookup – namespace composition – namespace aliases
– namespaces and C code – exceptions – throw and catch – exceptions and program structure – advice – exercises.
8.1 Modularization and Interfaces [name.module]
Any realistic program consists of a number of separate parts. For example, even the simple ‘‘Hello,
world!’’ program involves at least two parts: the user code requests Hello, world! to be printed,
and the I/O system does the printing.
Consider the desk calculator example from §6.1. It can be viewed as being composed of five
parts:
[1] The parser, doing syntax analysis
[2] The lexer, composing tokens out of characters
[3] The symbol table, holding (string,value) pairs
[4] The driver, main()
[5] The error handler
This can be represented graphically:
[Figure: module diagram showing the driver, parser, lexer, symbol table, and error handler]
where an arrow means ‘‘using.’’ To simplify the picture, I have not represented the fact that every
part relies on error handling. In fact, the calculator was conceived as three parts, with the driver
and error handler added for completeness.
When one module uses another, it doesn’t need to know everything about the module used.
Ideally, most of the details of a module are unknown to its users. Consequently, we make a distinction between a module and its interface. For example, the parser directly relies on the lexer’s interface (only), rather than on the complete lexer. The lexer simply implements the services advertised
in its interface. This can be presented graphically like this:
[Figure: module diagram showing the driver, the parser interface and implementation, the lexer interface and implementation, the symbol table interface and implementation, and the error handler]
Dashed lines mean ‘‘implements.’’ I consider this to be the real structure of the program, and our
job as programmers is to represent this faithfully in code. That done, the code will be simple, efficient, comprehensible, maintainable, etc., because it will directly reflect our fundamental design.
The following sections show how the logical structure of the desk calculator program can be
made clear, and §9.3 shows how the program source text can be physically organized to take advantage of it. The calculator is a tiny program, so in ‘‘real life’’ I wouldn’t bother using namespaces
and separate compilation (§2.4.1, §9.1) to the extent I do here. It is simply used to present techniques useful for larger programs without our drowning in code. In real programs, each ‘‘module’’
represented by a separate namespace will often have hundreds of functions, classes, templates, etc.
To demonstrate a variety of techniques and language features, I develop the modularization of
the calculator in stages. In ‘‘real life,’’ a program is unlikely to grow through all of these stages.
An experienced programmer might pick a design that is ‘‘about right’’ from the start. However, as
a program evolves over the years, dramatic structural changes are not uncommon.
Error handling permeates the structure of a program. When breaking up a program into modules or (conversely) when composing a program out of modules, we must take care to minimize
dependencies between modules caused by error handling. C++ provides exceptions to decouple the
detection and reporting of errors from the handling of errors. Therefore, the discussion of how to
represent modules as namespaces (§8.2) is followed by a demonstration of how we can use exceptions to further improve modularity (§8.3).
There are many more notions of modularity than the ones discussed in this chapter and the next.
For example, we might use concurrently executing and communicating processes to represent
important aspects of modularity. Similarly, the use of separate address spaces and the communication of information between address spaces are important topics not discussed here. I consider
these notions of modularity largely independent and orthogonal. Interestingly, in each case, separating a system into modules is easy. The hard problem is to provide safe, convenient, and efficient
communication across module boundaries.
8.2 Namespaces [name.namespace]
A namespace is a mechanism for expressing logical grouping. That is, if some declarations logically belong together according to some criteria, they can be put in a common namespace to
express that fact. For example, the declarations of the parser from the desk calculator (§6.1.1) may
be placed in a namespace Parser:
    namespace Parser {
        double expr(bool);
        double prim(bool get) { /* ... */ }
        double term(bool get) { /* ... */ }
        double expr(bool get) { /* ... */ }
    }
The function eexxpprr() must be declared first and then later defined to break the dependency loop
described in §6.1.1.
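The same declare-first, define-later pattern applies to any pair of mutually dependent functions in a namespace. Here is a minimal sketch; the namespace Parity and its two functions are hypothetical stand-ins for the parser functions, chosen because their bodies fit on a line.

```cpp
#include <cassert>

namespace Parity {
    bool is_even(unsigned n);       // declared first, so that is_odd() can call it

    bool is_odd(unsigned n)  { return n==0 ? false : is_even(n-1); }
    bool is_even(unsigned n) { return n==0 ? true  : is_odd(n-1); }    // defined later
}

void check_parity()
{
    assert(Parity::is_even(10));
    assert(Parity::is_odd(7));
    assert(!Parity::is_odd(0));
}
```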
The input part of the desk calculator could also be placed in its own namespace:
    namespace Lexer {
        enum Token_value {
            NAME,       NUMBER,     END,
            PLUS='+',   MINUS='-',  MUL='*',    DIV='/',
            PRINT=';',  ASSIGN='=', LP='(',     RP=')'
        };
        Token_value curr_tok;
        double number_value;
        string string_value;
        Token_value get_token() { /* ... */ }
    }
This use of namespaces makes it reasonably obvious what the lexer and the parser provide to a
user. However, had I included the source code for the functions, this structure would have been
obscured. If function bodies are included in the declaration of a realistically-sized namespace, you
typically have to wade through pages or screenfuls of information to find what services are offered,
that is, to find the interface.
An alternative to relying on separately specified interfaces is to provide a tool that extracts an
interface from a module that includes implementation details. I don’t consider that a good solution.
Specifying interfaces is a fundamental design activity (see §23.4.3.4), a module can provide different interfaces to different users, and often an interface is designed long before the implementation
details are made concrete.
Here is a version of the Parser with the interface separated from the implementation:
    namespace Parser {
        double prim(bool);
        double term(bool);
        double expr(bool);
    }
    double Parser::prim(bool get) { /* ... */ }
    double Parser::term(bool get) { /* ... */ }
    double Parser::expr(bool get) { /* ... */ }
Note that as a result of separating the implementation from the interface, each function now has
exactly one declaration and one definition. Users will see only the interface containing declarations.
The implementation – in this case, the function bodies – will be placed ‘‘somewhere else’’ where a
user need not look.
As shown, a member can be declared within a namespace definition and defined later using the
namespace-name::member-name notation.
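The notation is not specific to the parser. The sketch below applies it to a deliberately trivial module; the namespace Rect and its functions are hypothetical and serve only to show declarations in the namespace definition and definitions given later with qualified names.

```cpp
#include <cassert>

namespace Rect {                    // interface: declarations only
    int area(int w, int h);
    int perimeter(int w, int h);
}

// implementation: each member is defined using namespace-name::member-name
int Rect::area(int w, int h)      { return w*h; }
int Rect::perimeter(int w, int h) { return 2*(w+h); }

// int Rect::volume(int, int, int);    // would be an error: no volume() declared in Rect

void check_rect()
{
    assert(Rect::area(2,3) == 6);
    assert(Rect::perimeter(2,3) == 10);
}
```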
Members of a namespace must be introduced using this notation:
    namespace namespace-name {
        // declaration and definitions
    }
We cannot declare a new member of a namespace outside a namespace definition using the qualifier syntax. For example:
    void Parser::logical(bool);     // error: no logical() in Parser
The idea is to make it reasonably easy to find all names in a namespace declaration and also to
catch errors such as misspellings and type mismatches. For example:
    double Parser::trem(bool);      // error: no trem() in Parser
    double Parser::prim(int);       // error: Parser::prim() takes a bool argument
A namespace is a scope. Thus, ‘‘namespace’’ is a very fundamental and relatively simple concept.
The larger a program is, the more useful namespaces are to express logical separations of its parts.
Ordinary local scopes, global scopes, and classes are namespaces (§C.10.3).
Ideally, every entity in a program belongs to some recognizable logical unit (‘‘module’’).
Therefore, every declaration in a nontrivial program should ideally be in some namespace named to
indicate its logical role in the program. The exception is main(), which must be global in order
for the run-time environment to recognize it as special (§8.3.3).
8.2.1 Qualified Names [name.qualified]
A namespace is a scope. The usual scope rules hold for namespaces, so if a name is previously
declared in the namespace or in an enclosing scope, it can be used without further fuss. A name
from another namespace can be used when qualified by the name of its namespace. For example:
    double Parser::term(bool get)           // note Parser:: qualification
    {
        double left = prim(get);            // no qualification needed
        for (;;)
            switch (Lexer::curr_tok) {      // note Lexer:: qualification
            case Lexer::MUL:                // note Lexer:: qualification
                left *= prim(true);         // no qualification needed
            // ...
            }
        // ...
    }
The Parser qualifier is necessary to state that this term() is the one declared in Parser and not
some unrelated global function. Because term() is a member of Parser, it need not use a qualifier
for prim(). However, had the Lexer qualifier not been present, curr_tok would have been considered undeclared because the members of namespace Lexer are not in scope from within the Parser
namespace.
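The qualification rules can be tried out on something smaller than the calculator. In this sketch the namespaces Tally and Report and all their members are hypothetical; Report's functions must qualify names from Tally but can name each other directly.

```cpp
#include <cassert>

namespace Tally {
    int count = 0;
    void bump() { ++count; }
}

namespace Report {
    int doubled();
    int tripled();
}

int Report::doubled() { return 2*Tally::count; }            // Tally:: qualification required
int Report::tripled() { return doubled() + Tally::count; }  // doubled() is a fellow member

void check_qualified()
{
    Tally::count = 0;
    Tally::bump();                  // outside both namespaces, every name is qualified
    assert(Report::doubled() == 2);
    assert(Report::tripled() == 3);
}
```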
8.2.2 Using Declarations [name.using.dcl]
When a name is frequently used outside its namespace, it can be a bother to repeatedly qualify it
with its namespace name. Consider:
    double Parser::prim(bool get)           // handle primaries
    {
        if (get) Lexer::get_token();
        switch (Lexer::curr_tok) {
        case Lexer::NUMBER:                 // floating-point constant
            Lexer::get_token();
            return Lexer::number_value;
        case Lexer::NAME:
        {   double& v = table[Lexer::string_value];
            if (Lexer::get_token() == Lexer::ASSIGN) v = expr(true);
            return v;
        }
        case Lexer::MINUS:                  // unary minus
            return -prim(true);
        case Lexer::LP:
        {   double e = expr(true);
            if (Lexer::curr_tok != Lexer::RP) return Error::error(") expected");
            Lexer::get_token();             // eat ')'
            return e;
        }
        case Lexer::END:
            return 1;
        default:
            return Error::error("primary expected");
        }
    }
The repeated qualification Lexer is tedious and distracting. This redundancy can be eliminated by
a using-declaration to state in one place that the get_token used in this scope is Lexer’s get_token.
For example:
    double Parser::prim(bool get)           // handle primaries
    {
        using Lexer::get_token;             // use Lexer's get_token
        using Lexer::curr_tok;              // use Lexer's curr_tok
        using Error::error;                 // use Error's error
        if (get) get_token();
        switch (curr_tok) {
        case Lexer::NUMBER:                 // floating-point constant
            get_token();
            return Lexer::number_value;
        case Lexer::NAME:
        {   double& v = table[Lexer::string_value];
            if (get_token() == Lexer::ASSIGN) v = expr(true);
            return v;
        }
        case Lexer::MINUS:                  // unary minus
            return -prim(true);
        case Lexer::LP:
        {   double e = expr(true);
            if (curr_tok != Lexer::RP) return error(") expected");
            get_token();                    // eat ')'
            return e;
        }
        case Lexer::END:
            return 1;
        default:
            return error("primary expected");
        }
    }
A using-declaration introduces a local synonym.
It is often a good idea to keep local synonyms as local as possible to avoid confusion.
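To illustrate how local such a synonym is, consider this sketch. The namespace Units and the three functions are hypothetical; the point is only that the unqualified name is usable inside the one function that contains the using-declaration and nowhere else.

```cpp
#include <cassert>

namespace Units {
    int to_inches(int feet) { return feet*12; }
}

int total_inches(int feet, int inches)
{
    using Units::to_inches;         // local synonym: visible in this function only
    return to_inches(feet) + inches;
}

int yard_in_inches()
{
    return Units::to_inches(3);     // no using-declaration here, so qualify explicitly
}

void check_using_declaration()
{
    assert(total_inches(5,3) == 63);
    assert(yard_in_inches() == 36);
}
```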
However, all parser functions use similar sets of names from other modules. We can therefore
place the using-declarations in the Parser’s namespace definition:
    namespace Parser {
        double prim(bool);
        double term(bool);
        double expr(bool);
        using Lexer::get_token;         // use Lexer's get_token
        using Lexer::curr_tok;          // use Lexer's curr_tok
        using Error::error;             // use Error's error
    }
This allows us to simplify the Parser functions almost to our original version (§6.1.1):
    double Parser::term(bool get)       // multiply and divide
    {
        double left = prim(get);
        for (;;)
            switch (curr_tok) {
            case Lexer::MUL:
                left *= prim(true);
                break;
            case Lexer::DIV:
                if (double d = prim(true)) {
                    left /= d;
                    break;
                }
                return error("divide by 0");
            default:
                return left;
            }
    }
I could have introduced the token names into the Parser’s namespace. However, I left them
explicitly qualified as a reminder of Parser’s dependency on Lexer.
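The effect of a using-declaration placed in a namespace definition can be seen in a reduced setting. In this sketch, Log and Worker are hypothetical stand-ins for Error and Parser; the signature of error() here is an assumption for the illustration and is not the calculator's.

```cpp
#include <cassert>

namespace Log {
    int error_count = 0;
    int error(const char*) { ++error_count; return -1; }    // count the error, return a marker
}

namespace Worker {
    int half(int);
    using Log::error;               // every Worker function may now say plain error()
}

int Worker::half(int x)
{
    if (x%2 != 0) return error("odd argument");     // found through the using-declaration
    return x/2;
}

void check_worker()
{
    Log::error_count = 0;
    assert(Worker::half(8) == 4);
    assert(Worker::half(7) == -1);
    assert(Log::error_count == 1);
}
```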
8.2.3 Using Directives [name.using.dir]
What if our aim were to simplify the Parser functions to be exactly our original versions? This
would be a reasonable aim for a large program that was being converted to using namespaces from
a previous version with less explicit modularity.
A using-directive makes names from a namespace available almost as if they had been declared
outside their namespace (§8.2.8). For example:
    namespace Parser {
        double prim(bool);
        double term(bool);
        double expr(bool);
        using namespace Lexer;      // make all names from Lexer available
        using namespace Error;      // make all names from Error available
    }
This allows us to write Parser’s functions exactly as we originally did (§6.1.1):
    double Parser::term(bool get)       // multiply and divide
    {
        double left = prim(get);
        for (;;)
            switch (curr_tok) {         // Lexer's curr_tok
            case MUL:                   // Lexer's MUL
                left *= prim(true);
                break;
            case DIV:                   // Lexer's DIV
                if (double d = prim(true)) {
                    left /= d;
                    break;
                }
                return error("divide by 0");    // Error's error
            default:
                return left;
            }
    }
Global using-directives are a tool for transition (§8.2.9) and are otherwise best avoided. In a namespace, a using-directive is a tool for namespace composition (§8.2.8). In a function (only), a
using-directive can be safely used as a notational convenience (§8.3.3.1).
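The function-local case might look like the following sketch. The namespace Geometry and its members are hypothetical; the using-directive makes all of them available, but only within the one function.

```cpp
#include <cassert>

namespace Geometry {
    const int width = 3;
    const int height = 4;
    int area() { return width*height; }
}

int area_plus_border()
{
    using namespace Geometry;       // function-local using-directive
    return area() + 2*(width+height);
}

int width_only()
{
    return Geometry::width;         // the directive above does not reach this function
}

void check_using_directive()
{
    assert(area_plus_border() == 26);   // 12 + 2*(3+4)
    assert(width_only() == 3);
}
```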
8.2.4 Multiple Interfaces [name.multi]
It should be clear that the namespace definition we evolved for Parser is not the interface that the
Parser presents to its users. Instead, it is the set of declarations that is needed to write the individual parser functions conveniently. The Parser’s interface to its users should be far simpler:
    namespace Parser {
        double expr(bool);
    }
Fortunately, the two namespace-definitions for Parser can coexist so that each can be used where it
is most appropriate. We see the namespace Parser used to provide two things:
[1] The common environment for the functions implementing the parser
[2] The external interface offered by the parser to its users
Thus, the driver code, main(), should see only:
    namespace Parser {                  // interface for users
        double expr(bool);
    }
The functions implementing the parser should see whichever interface we decided on as the best for
expressing those functions’ shared environment. That is:
    namespace Parser {                  // interface for implementers
        double prim(bool);
        double term(bool);
        double expr(bool);
        using Lexer::get_token;         // use Lexer's get_token
        using Lexer::curr_tok;          // use Lexer's curr_tok
        using Error::error;             // use Error's error
    }
or graphically:
[Figure: dependency diagram with nodes Parser´, Parser, Driver, and Parser implementation]
The arrows represent ‘‘relies on the interface provided by’’ relations.
Parser´ is the small interface offered to users. The name Parser´ (Parser prime) is not a C++
identifier. It was chosen deliberately to indicate that this interface doesn’t have a separate name in
the program. The lack of a separate name need not lead to confusion because programmers naturally invent different and obvious names for the different interfaces and because the physical layout
of the program (see §9.3.2) naturally provides separate (file) names.
The interface offered to implementers is larger than the interface offered to users. Had this
interface been for a realistically-sized module in a real system, it would change more often than the
interface seen by users. It is important that the users of a module (in this case, main() using
Parser) are insulated from such changes.
We don’t need to use two separate namespaces to express the two different interfaces, but if we
wanted to, we could. Designing interfaces is one of the most fundamental design activities and one
in which major benefits can be gained and lost. Consequently, it is worthwhile to consider what we
are really trying to achieve and to discuss a number of alternatives.
Please keep in mind that the solution presented is the simplest of those we consider, and often
the best. Its main weaknesses are that the two interfaces don’t have separate names and that the
compiler doesn’t necessarily have sufficient information to check the consistency of the two definitions of the namespace. However, even though the compiler doesn’t always get the opportunity to
check the consistency, it usually does. Furthermore, the linker catches most errors missed by the
compiler.
The solution presented here is the one I use for the discussion of physical modularity (§9.3) and
the one I recommend in the absence of further logical constraints (see also §8.2.7).
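That two definitions of one namespace can coexist may be easier to accept after seeing it compile. The sketch below puts both into a single translation unit for the sake of a self-contained example; in a real program they would more plausibly live in separate headers. The namespace Calc and the functions eval(), twice(), and driver() are hypothetical.

```cpp
#include <cassert>

// What a user would see (hypothetically through a small header):
namespace Calc {
    int eval(int);
}

// What an implementer would see (hypothetically through a larger header).
// Reopening the namespace and repeating a declaration is legal.
namespace Calc {
    int eval(int);
    int twice(int);
}

int Calc::twice(int x) { return 2*x; }
int Calc::eval(int x)  { return twice(x) + 1; }

int driver(int x) { return Calc::eval(x); }     // depends on eval() only

void check_calc()
{
    assert(driver(5) == 11);
}
```

If the two definitions disagreed about eval(), say in its argument type, a compiler that saw only one of them could not notice; that is the weakness mentioned above.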
8.2.4.1 Interface Design Alternatives [name.alternatives]
The purpose of interfaces is to minimize dependencies between different parts of a program. Minimal interfaces lead to systems that are easier to understand, have better data hiding properties, are
easier to modify, and compile faster.
When dependencies are considered, it is important to remember that compilers and programmers tend to take a somewhat simple-minded approach to them: ‘‘If a definition is in scope at point
X, then anything written at point X depends on anything stated in that definition.’’ Typically,
things are not really that bad because most definitions are irrelevant to most code. Given the definitions we have used, consider:
    namespace Parser {                  // interface for implementers
        // ...
        double expr(bool);
        // ...
    }
    int main()
    {
        // ...
        Parser::expr(false);
        // ...
    }
The function main() depends on Parser::expr() only, but it takes time, brain power, computation, etc., to figure that out. Consequently, for realistically-sized programs people and compilation
systems often play it safe and assume that where there might be a dependency, there is one. This is
typically a perfectly reasonable approach.
Thus, our aim is to express our program so that the set of potential dependencies is reduced to
the set of actual dependencies.
First, we try the obvious: define a user interface to the parser in terms of the implementer interface we already have:
    namespace Parser {                  // interface for implementers
        // ...
        double expr(bool);
        // ...
    }
    namespace Parser_interface {        // interface for users
        using Parser::expr;
    }
Clearly, users of Parser_interface depend only, and indirectly, on Parser::expr(). However, a
crude look at the dependency graph gives us this:
[Figure: dependency diagram with nodes Parser, Parser_interface, Driver, and Parser implementation]
Now the driver appears vulnerable to any change in the Parser interface from which it was supposed to be insulated. Even this appearance of a dependency is undesirable, so we explicitly restrict Parser_interface's dependency on Parser by having only the relevant part of the implementer interface to the parser (that was called Parser' earlier) in scope where we define Parser_interface:
namespace Parser {   // interface for users
    double expr(bool);
}
namespace Parser_interface {   // separately named interface for users
    using Parser::expr;
}
or graphically:
[Figure: dependency graph showing Parser', Parser, Parser_interface, Driver, and the Parser implementation]
To ensure the consistency of Parser and Parser', we again rely on the compilation system as a whole, rather than on just the compiler working on a single compilation unit. This solution differs from the one in §8.2.4 only by the extra namespace Parser_interface. If we wanted to, we could give Parser_interface a concrete representation by giving it its own expr() function:
namespace Parser_interface {
    double expr(bool);
}
Now Parser need not be in scope in order to define Parser_interface. It needs to be in scope only where Parser_interface::expr() is defined:
double Parser_interface::expr(bool get)
{
    return Parser::expr(get);
}
This last variant can be represented graphically like this:
[Figure: dependency graph showing Parser_interface, Parser, the Parser_interface implementation, Driver, and the Parser implementation]
Now all dependencies are minimized. Everything is concrete and properly named. However, for
most problems I face, this solution is also massive overkill.
8.2.5 Avoiding Name Clashes [name.clash]
Namespaces are intended to express logical structure. The simplest such structure is the distinction
between code written by one person vs. code written by someone else. This simple distinction can
be of great practical importance.
When we use only a single global scope, it is unnecessarily difficult to compose a program out
of separate parts. The problem is that the supposedly-separate parts each define the same names.
When combined into the same program, these names clash. Consider:
// my.h:
char f(char);
int f(int);
class String { /* ... */ };
// your.h:
char f(char);
double f(double);
class String { /* ... */ };
Given these definitions, a third party cannot easily use both my.h and your.h. The obvious solution is to wrap each set of declarations in its own namespace:
namespace My {
    char f(char);
    int f(int);
    class String { /* ... */ };
}
namespace Your {
    char f(char);
    double f(double);
    class String { /* ... */ };
}
Now we can use declarations from My and Your through explicit qualification (§8.2.1), using-declarations (§8.2.2), or using-directives (§8.2.3).
8.2.5.1 Unnamed Namespaces [name.unnamed]
It is often useful to wrap a set of declarations in a namespace simply to protect against the possibility of name clashes. That is, the aim is to preserve locality of code rather than to present an interface to users. For example:
#include "header.h"
namespace Mine {
    int a;
    void f() { /* ... */ }
    int g() { /* ... */ }
}
Since we don't want the name Mine to be known outside a local context, it simply becomes a
bother to invent a redundant global name that might accidentally clash with someone else’s names.
In that case, we can simply leave the namespace without a name:
#include "header.h"
namespace {
    int a;
    void f() { /* ... */ }
    int g() { /* ... */ }
}
Clearly, there has to be some way of accessing members of an unnamed namespace from the outside. Consequently, an unnamed namespace has an implied using-directive. The previous declaration is equivalent to
namespace $$$ {
    int a;
    void f() { /* ... */ }
    int g() { /* ... */ }
}
using namespace $$$;
where $$$ is some name unique to the scope in which the namespace is defined. In particular,
unnamed namespaces in different translation units are different. As desired, there is no way of
naming a member of an unnamed namespace from another translation unit.
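As a minimal sketch of this technique, the fragment below keeps a helper function and a counter in an unnamed namespace and exposes only two ordinary functions. The names decorate(), call_count, label(), and labels_made() are hypothetical and chosen purely for illustration; they do not come from the calculator example.

```cpp
#include <string>

namespace {                      // unnamed: members are local to this translation unit
    int call_count = 0;          // hypothetical helper state

    std::string decorate(const std::string& s)   // hypothetical helper
    {
        ++call_count;
        return "[" + s + "]";
    }
}

// Ordinary functions with external linkage. They can name the members of the
// unnamed namespace without qualification because of the implied using-directive.
std::string label(const std::string& s)
{
    return decorate(s);
}

int labels_made()
{
    return call_count;
}
```

Code in another translation unit can call label() and labels_made() once they are declared there, but it has no way of naming decorate() or call_count.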
8.2.6 Name Lookup [name.koenig]
A function taking an argument of type T is more often than not defined in the same namespace as T. Consequently, if a function isn't found in the context of its use, we look in the namespaces of its
arguments. For example:
namespace Chrono {
    class Date { /* ... */ };
    bool operator==(const Date&, const std::string&);
    std::string format(const Date&);   // make string representation
    // ...
}
void f(Chrono::Date d, int i)
{
    std::string s = format(d);   // Chrono::format()
    std::string t = format(i);   // error: no format() in scope
}
This lookup rule saves the programmer a lot of typing compared to using explicit qualification, yet
it doesn’t pollute the namespace the way a using-directive (§8.2.3) can. It is especially useful for
operator operands (§11.2.4) and template arguments (§C.13.8.4), where explicit qualification can
be quite cumbersome.
Note that the namespace itself needs to be in scope and the function must be declared before it
can be found and used.
Naturally, a function can take arguments from more than one namespace. For example:
void f(Chrono::Date d, std::string s)
{
    if (d == s) {
        // ...
    }
    else if (d == "August 4, 1914") {
        // ...
    }
}
In such cases, we look for the function in the scope of the call (as ever) and in the namespaces of every argument (including each argument's class and base classes) and do the usual overload resolution (§7.4) of all functions we find. In particular, for the call d==s, we look for operator== in the scope surrounding f(), in the std namespace (where == is defined for string), and in the Chrono namespace. There is a std::operator==(), but it doesn't take a Date argument, so we use Chrono::operator==(), which does. See also §11.2.4.
When a class member invokes a function, other members of the same class and its base classes
are preferred over functions potentially found based on the argument types (§11.2.4).
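The lookup rule can be tried out with a compilable stand-in for Chrono. This is only a sketch: Date is reduced to three public integers, the output format of format() is an arbitrary choice, and describe() and same() are hypothetical callers, none of which is taken from the real Chrono design.

```cpp
#include <string>

namespace Chrono {
    struct Date { int d, m, y; };   // simplified stand-in for the Date class

    std::string format(const Date& dt)   // make string representation (assumed format)
    {
        return std::to_string(dt.y) + "-" + std::to_string(dt.m) + "-" + std::to_string(dt.d);
    }

    bool operator==(const Date& dt, const std::string& s)
    {
        return format(dt) == s;
    }
}

// Neither function below mentions Chrono:: for the calls it makes.
std::string describe(Chrono::Date d)
{
    return format(d);   // Chrono::format() is found through the argument's namespace
}

bool same(Chrono::Date d, const std::string& s)
{
    return d == s;      // Chrono::operator==() is found the same way
}
```

Passing an int to format() from describe() would still fail to compile, because int has no associated namespace in which a format() could be found.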
8.2.7 Namespace Aliases [name.alias]
If users give their namespaces short names, the names of different namespaces will clash:
namespace A {   // short name, will clash (eventually)
    // ...
}
A::String s1 = "Grieg";
A::String s2 = "Nielsen";
However, long namespace names can be impractical in real code:
namespace American_Telephone_and_Telegraph {   // too long
    // ...
}
American_Telephone_and_Telegraph::String s3 = "Grieg";
American_Telephone_and_Telegraph::String s4 = "Nielsen";
This dilemma can be resolved by providing a short alias for a longer namespace name:
// use namespace alias to shorten names:
namespace ATT = American_Telephone_and_Telegraph;
ATT::String s3 = "Grieg";
ATT::String s4 = "Nielsen";
Namespace aliases also allow a user to refer to ‘‘the library’’ and have a single declaration defining
what library that really is. For example:
namespace Lib = Foundation_library_v2r11;
// ...
Lib::set s;
Lib::String s5 = "Sibelius";
This can immensely simplify the task of replacing one version of a library with another. By using Lib rather than Foundation_library_v2r11 directly, you can update to version ‘‘v3r02’’ by changing the initialization of the alias Lib and recompiling. The recompile will catch source-level incompatibilities. On the other hand, overuse of aliases (of any kind) can lead to confusion.
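A small sketch of such a version switch follows. Both library namespaces and their version() functions are invented for illustration; a real library would of course offer more than a version string.

```cpp
#include <string>

namespace Foundation_library_v2r11 {   // hypothetical old release
    inline std::string version() { return "v2r11"; }
}

namespace Foundation_library_v3r02 {   // hypothetical new release
    inline std::string version() { return "v3r02"; }
}

// The single line to edit when moving from one release to the other.
namespace Lib = Foundation_library_v3r02;

std::string library_in_use()
{
    return Lib::version();   // user code mentions only the alias
}
```

User code written against Lib does not change when the alias is redirected; only a recompilation is needed.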
8.2.8 Namespace Composition [name.compose]
Often, we want to compose an interface out of existing interfaces. For example:
namespace His_string {
    class String { /* ... */ };
    String operator+(const String&, const String&);
    String operator+(const String&, const char*);
    void fill(char);
    // ...
}
namespace Her_vector {
    template<class T> class Vector { /* ... */ };
    // ...
}
namespace My_lib {
    using namespace His_string;
    using namespace Her_vector;
    void my_fct(String&);
}
Given this, we can now write the program in terms of My_lib:
void f()
{
    My_lib::String s = "Byron";   // finds My_lib::His_string::String
    // ...
}
using namespace My_lib;
void g(Vector<String>& vs)
{
    // ...
    my_fct(vs[5]);
    // ...
}
If an explicitly qualified name (such as My_lib::String) isn't declared in the namespace mentioned, the compiler looks in namespaces mentioned in using-directives (such as His_string).
Only if we need to define something, do we need to know the real namespace of an entity:
void My_lib::fill(char c)   // error: no fill() declared in My_lib
{
    // ...
}
void His_string::fill(char c)   // ok: fill() declared in His_string
{
    // ...
}
void My_lib::my_fct(My_lib::String& v)   // ok
{
    // ...
}
Ideally, a namespace should
[1] express a logically coherent set of features,
[2] not give users access to unrelated features, and
[3] not impose a significant notational burden on users.
The composition techniques presented here and in the following subsections – together with the #include mechanism (§9.2.1) – provide strong support for this.
8.2.8.1 Selection [name.select]
Occasionally, we want access to only a few names from a namespace. We could do that by writing
a namespace declaration containing only those names we want. For example, we could declare a
version of His_string that provided the String itself and the concatenation operator only:
namespace His_string {   // part of His_string only
    class String { /* ... */ };
    String operator+(const String&, const String&);
    String operator+(const String&, const char*);
}
However, unless I am the designer or maintainer of His_string, this can easily get messy. A change to the ‘‘real’’ definition of His_string will not be reflected in this declaration. Selection of features from a namespace is more explicitly made with using-declarations:
namespace My_string {
    using His_string::String;
    using His_string::operator+;   // use any + from His_string
}
A using-declaration brings every declaration with a given name into scope. In particular, a single
using-declaration can bring in every variant of an overloaded function.
In this way, if the maintainer of His_string adds a member function to String or an overloaded version of the concatenation operator, that change will automatically become available to users of My_string. Conversely, if a feature is removed from His_string or has its interface changed, affected uses of My_string will be detected by the compiler (see also §15.2.2).
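To see that one using-declaration really does carry every overload, consider the following sketch. String is reduced to a struct holding a std::string, and demo() is a hypothetical test function; the operators are called with explicit qualification so that the result cannot be attributed to argument-based lookup (§8.2.6).

```cpp
#include <string>

namespace His_string {
    struct String { std::string rep; };   // simplified stand-in

    String operator+(const String& a, const String& b) { return String{a.rep + b.rep}; }
    String operator+(const String& a, const char* b)   { return String{a.rep + b}; }
    void fill(char) {}                     // deliberately not selected below
}

namespace My_string {
    using His_string::String;
    using His_string::operator+;   // both overloads of + become members of My_string
}

std::string demo()
{
    My_string::String a{"Hel"};
    My_string::String b{"lo"};
    My_string::String c = My_string::operator+(a, b);     // (String, String) overload
    My_string::String d = My_string::operator+(c, "!");   // (String, const char*) overload
    // My_string::fill('x');   // would not compile: fill() was not selected
    return d.rep;
}
```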
8.2.8.2 Composition and Selection [name.comp]
Combining composition (by using-directives) with selection (by using-declarations) yields the
flexibility needed for most real-world examples. With these mechanisms, we can provide access to
a variety of facilities in such a way that we resolve name clashes and ambiguities arising from their
composition. For example:
namespace His_lib {
    class String { /* ... */ };
    template<class T> class Vector { /* ... */ };
    // ...
}
namespace Her_lib {
    template<class T> class Vector { /* ... */ };
    class String { /* ... */ };
    // ...
}
namespace My_lib {
    using namespace His_lib;   // everything from His_lib
    using namespace Her_lib;   // everything from Her_lib
    using His_lib::String;     // resolve potential clash in favor of His_lib
    using Her_lib::Vector;     // resolve potential clash in favor of Her_lib
    template<class T> class List { /* ... */ };   // additional stuff
    // ...
}
When looking into a namespace, names explicitly declared there (including names declared by using-declarations) take priority over names made accessible in another scope by a using-directive (see also §C.10.1). Consequently, a user of My_lib will see the name clashes for String and Vector resolved in favor of His_lib::String and Her_lib::Vector. Also, My_lib::List will be used by default independently of whether His_lib or Her_lib provides a List.
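The priority rule can be checked with a compilable sketch. To keep it short, Vector is an ordinary struct rather than a template, and each type has a hypothetical origin() member that reports which library it came from; neither simplification is part of the example above.

```cpp
#include <string>

namespace His_lib {
    struct String { static const char* origin() { return "His_lib"; } };
    struct Vector { static const char* origin() { return "His_lib"; } };
}

namespace Her_lib {
    struct String { static const char* origin() { return "Her_lib"; } };
    struct Vector { static const char* origin() { return "Her_lib"; } };
}

namespace My_lib {
    using namespace His_lib;   // everything from His_lib
    using namespace Her_lib;   // everything from Her_lib

    using His_lib::String;     // resolve the clash in favor of His_lib
    using Her_lib::Vector;     // resolve the clash in favor of Her_lib
}

// The using-declarations are found first, so neither name is ambiguous.
std::string string_origin() { return My_lib::String::origin(); }
std::string vector_origin() { return My_lib::Vector::origin(); }
```

Without the two using-declarations, both My_lib::String and My_lib::Vector would be ambiguous and the two functions would fail to compile.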
Usually, I prefer to leave a name unchanged when including it into a new namespace. In that
way, I don’t have to remember two different names for the same entity. However, sometimes a
new name is needed or simply nice to have. For example:
namespace Lib2 {
    using namespace His_lib;   // everything from His_lib
    using namespace Her_lib;   // everything from Her_lib
    using His_lib::String;     // resolve potential clash in favor of His_lib
    using Her_lib::Vector;     // resolve potential clash in favor of Her_lib
    typedef Her_lib::String Her_string;   // rename
    template<class T> class His_vec       // ‘‘rename’’
        : public His_lib::Vector<T> { /* ... */ };
    template<class T> class List { /* ... */ };   // additional stuff
    // ...
}
There is no specific language mechanism for renaming. Instead, the general mechanisms for defining new entities are used.
8.2.9 Namespaces and Old Code [name.get]
Millions of lines of C and C++ code rely on global names and existing libraries. How can we use
namespaces to alleviate problems in such code? Redesigning existing code isn’t always a viable
option. Fortunately, it is possible to use C libraries as if they were defined in a namespace. However, this cannot be done for libraries written in C++ (§9.2.4). On the other hand, namespaces are
designed so that they can be introduced with minimal disruption into an older C++ program.
8.2.9.1 Namespaces and C [name.c]
Consider the canonical first C program:
#include <stdio.h>
int main()
{
    printf("Hello, world!\n");
}
Breaking this program wouldn’t be a good idea. Making standard libraries special cases isn’t a
good idea either. Consequently, the language rules for namespaces are designed to make it relatively easy to take a program written without namespaces and turn it into a more explicitly structured one using namespaces. In fact, the calculator program (§6.1) is an example of this.
The using-directive is the key to achieving this. For example, the declarations of the standard C I/O facilities from the C header stdio.h are wrapped in a namespace like this:
// stdio.h:
namespace std {
    // ...
    int printf(const char* ... );
    // ...
}
using namespace std;
This achieves backwards compatibility. Also, a new header file cstdio is defined for people who don't want the names implicitly available:
// cstdio:
namespace std {
    // ...
    int printf(const char* ... );
    // ...
}
C++ standard library implementers who worry about replication of declarations will, of course, define stdio.h by including cstdio:
// stdio.h:
#include<cstdio>
using namespace std;
I consider nonlocal using-directives primarily a transition tool. Most code referring to names from
other namespaces can be expressed more clearly with explicit qualification and using-declarations.
The relationship between namespaces and linkage is described in §9.2.4.
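For code that prefers explicit qualification, the canonical program can be expressed through cstdio as sketched below. The function name print_greeting() is hypothetical. Note also that an implementation may make printf visible in the global namespace even when only cstdio is included, so the sketch relies only on std::printf being available.

```cpp
#include <cstdio>

// Writes the greeting using the name from namespace std and returns
// whatever std::printf() returns (the number of characters written,
// or a negative value on failure).
int print_greeting()
{
    return std::printf("Hello, world!\n");
}
```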
8.2.9.2 Namespaces and Overloading [name.over]
Overloading (§7.4) works across namespaces. This is essential to allow us to migrate existing
libraries to use namespaces with minimal source code changes. For example:
// old A.h:
void f(int);
// ...
// old B.h:
void f(char);
// ...
// old user.c:
#include "A.h"
#include "B.h"
void g()
{
    f('a');   // calls the f() from B.h
}
This program can be upgraded to a version using namespaces without changing the actual code:
// new A.h:
namespace A {
    void f(int);
    // ...
}
// new B.h:
namespace B {
    void f(char);
    // ...
}
// new user.c:
#include "A.h"
#include "B.h"
using namespace A;
using namespace B;
void g()
{
    f('a');   // calls the f() from B.h
}
Had we wanted to keep user.c completely unchanged, we would have placed the using-directives in the header files.
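The upgraded program can be condensed into a single translation unit to confirm which function is chosen. In this sketch, the two versions of f() return a string naming themselves, and call_with_char() and call_with_int() are hypothetical callers; the original functions return void.

```cpp
#include <string>

namespace A {
    std::string f(int)  { return "A::f(int)"; }
}

namespace B {
    std::string f(char) { return "B::f(char)"; }
}

using namespace A;
using namespace B;

// Both f() functions are visible here, so ordinary overload resolution
// picks the best match for each argument type.
std::string call_with_char() { return f('a'); }
std::string call_with_int()  { return f(1); }
```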
8.2.9.3 Namespaces Are Open [name.open]
A namespace is open; that is, you can add names to it from several namespace declarations. For
example:
namespace A {
    int f();   // now A has member f()
}
namespace A {
    int g();   // now A has two members, f() and g()
}
In this way, we can support large program fragments within a single namespace the way an older
library or application lives within the single global namespace. To do this, we must distribute the
namespace definition over several header and source code files. As shown by the calculator example (§8.2.4), the openness of namespaces allows us to present different interfaces to different kinds
of users by presenting different parts of a namespace. This openness is also an aid to transition.
For example,
// my header:
void f();   // my function
// ...
#include<stdio.h>
int g();   // my function
// ...
can be rewritten without reordering of the declarations:
// my header:
namespace Mine {
    void f();   // my function
    // ...
}
#include<stdio.h>
namespace Mine {
    int g();   // my function
    // ...
}
When writing new code, I prefer to use many smaller namespaces (see §8.2.8) rather than putting
really major pieces of code into a single namespace. However, that is often impractical when converting major pieces of software to use namespaces.
When defining a previously declared member of a namespace, it is safer to use the Mine:: syntax than to re-open Mine. For example:
void Mine::ff()   // error: no ff() declared in Mine
{
    // ...
}
A compiler catches this error. However, because new functions can be defined within a namespace,
a compiler cannot catch the equivalent error in a re-opened namespace:
namespace Mine {   // re-opening Mine to define functions
    void ff()   // oops! no ff() declared in Mine; ff() is added to Mine by this definition
    {
        // ...
    }
    // ...
}
The compiler has no way of knowing that you didn't want that new ff().
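The two styles of definition can be placed side by side in a sketch that compiles. The return values are arbitrary, and here Mine declares functions named f() and g() with int results purely for illustration.

```cpp
namespace Mine {
    int f();   // declared, as it would be in a header
}

// Qualified definition: accepted only because Mine already declares f().
int Mine::f()
{
    return 42;
}

// int Mine::ff() { return 0; }   // would not compile: no ff() declared in Mine

namespace Mine {   // re-opening Mine
    int g()        // accepted without any prior declaration; g() is simply added
    {
        return f() + 1;
    }
}
```

Had g() been a misspelling of an existing member, the re-opened form would have compiled anyway, which is exactly the hazard described above.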
8.3 Exceptions [name.except]
When a program is composed of separate modules, and especially when those modules come from
separately developed libraries, error handling needs to be separated into two distinct parts:
[1] The reporting of error conditions that cannot be resolved locally
[2] The handling of errors detected elsewhere
The author of a library can detect run-time errors but does not in general have any idea what to do
about them. The user of a library may know how to cope with such errors but cannot detect them –
or else they would be handled in the user’s code and not left for the library to find.
In the calculator example, we bypassed this problem by designing the program as a whole. By
doing that, we could fit error handling into our overall framework. However, when we separate the
logical parts of the calculator into separate namespaces, we see that every namespace depends on
namespace Error (§8.2.2) and that the error handling in Error relies on every module behaving appropriately after an error. Let's assume that we don't have the freedom to design the calculator as a whole and don't want the tight coupling between Error and all other modules. Instead, assume that the parser, etc., are written without knowledge of how a driver might like to handle errors.
Even though error() was very simple, it embodied a strategy for error handling:
namespace Error {
    int no_of_errors;
    double error(const char* s)
    {
        std::cerr << "error: " << s << '\n';
        no_of_errors++;
        return 1;
    }
}
The error() function writes out an error message, supplies a default value that allows its caller to continue a computation, and keeps track of a simple error state. Importantly, every part of the program knows that error() exists, how to call it, and what to expect from it. For a program composed of separately-developed libraries, that would be too much to assume.
Exceptions are C++’s means of separating error reporting from error handling. In this section,
exceptions are briefly described in the context of their use in the calculator example. Chapter 14
provides a more extensive discussion of exceptions and their uses.
8.3.1 Throw and Catch [name.throw]
The notion of an exception is provided to help deal with error reporting. For example:
struct Range_error {
    int i;
    Range_error(int ii) { i = ii; }   // constructor (§2.5.2, §10.2.3)
};
char to_char(int i)
{
    if (i<numeric_limits<char>::min() || numeric_limits<char>::max()<i)   // see §22.2
        throw Range_error(i);
    return i;
}
The to_char() function either returns the char with the numeric value i or throws a Range_error. The fundamental idea is that a function that finds a problem it cannot cope with throws an exception, hoping that its (direct or indirect) caller can handle the problem. A function that wants to handle a problem can indicate that it is willing to catch exceptions of the type used to report the problem. For example, to call to_char() and catch the exception it might throw, we could write:
void g(int i)
{
    try {
        char c = to_char(i);
        // ...
    }
    catch (Range_error) {
        cerr << "oops\n";
    }
}
The construct
catch ( /* ... */ ) {
    // ...
}
is called an exception handler. It can be used only immediately after a block prefixed with the keyword try or immediately after another exception handler; catch is also a keyword. The parentheses contain a declaration that is used in a way similar to how a function argument declaration is used. That is, it specifies the type of the objects that can be caught by this handler and optionally names the object caught. For example, if we wanted to know the value of the Range_error thrown, we would provide a name for the argument to catch exactly the way we name function arguments. For example:
void h(int i)
{
    try {
        char c = to_char(i);
        // ...
    }
    catch (Range_error x) {
        cerr << "oops: to_char(" << x.i << ")\n";
    }
}
If any code in a try-block – or called from it – throws an exception, the try-block’s handlers will be
examined. If the exception thrown is of a type specified for a handler, that handler is executed. If
not, the exception handlers are ignored and the try-block acts just like an ordinary block.
Basically, C++ exception handling is a way to transfer control to designated code in a calling
function. Where needed, some information about the error can be passed along to the caller. C
programmers can think of exception handling as a well-behaved mechanism replacing setjmp/longjmp (§16.1.2). The important interaction between exception handling and classes is described in Chapter 14.
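The to_char() example can be assembled into a self-contained sketch. It assumes explicit std:: qualification and an explicit cast on return, and it adds a hypothetical caller, try_convert(), that returns a message instead of writing to cerr so that the effect of the handler is easy to observe.

```cpp
#include <limits>
#include <string>

struct Range_error {
    int i;
    Range_error(int ii) : i(ii) {}   // remember the offending value
};

char to_char(int i)
{
    if (i < std::numeric_limits<char>::min() || std::numeric_limits<char>::max() < i)
        throw Range_error(i);        // cannot be handled locally: report it
    return static_cast<char>(i);
}

// Hypothetical caller: converts i, or describes the failure.
std::string try_convert(int i)
{
    try {
        char c = to_char(i);
        return std::string(1, c);
    }
    catch (const Range_error& x) {   // the named handler argument gives access to x.i
        return "oops: to_char(" + std::to_string(x.i) + ")";
    }
}
```

Catching by const reference rather than by value is a choice made in this sketch; the handlers shown above catch by value, which works equally well for a type this small.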
8.3.2 Discrimination of Exceptions [name.discrimination]
Typically, a program will have several different possible run-time errors. Such errors can be
mapped into exceptions with distinct names. I prefer to define types with no other purpose than
exception handling. This minimizes confusion about their purpose. In particular, I never use a
built-in type, such as int, as an exception. In a large program, I would have no effective way to find unrelated uses of int exceptions. Thus, I could never be sure that such other uses didn't interfere with my use.
Our calculator (§6.1) must handle two kinds of run-time errors: syntax errors and attempts to
divide by zero. No values need to be passed to a handler from the code that detects an attempt to
divide by zero, so zero divide can be represented by a simple empty type:
struct Zero_divide { };
On the other hand, a handler would most likely prefer to get an indication of what kind of syntax
error occurred. Here, we pass a string along:
struct Syntax_error {
    const char* p;
    Syntax_error(const char* q) { p = q; }
};
For notational convenience, I added a constructor (§2.5.2, §10.2.3) to the struct.
A user of the parser can discriminate between the two exceptions by adding handlers for both to a try block. Where needed, the appropriate handler will be entered. If we ‘‘fall through the bottom’’ of a handler, the execution continues at the end of the list of handlers:
try {
    // ...
    expr(false);
    // we get here if and only if expr() didn't cause an exception
    // ...
}
catch (Syntax_error) {
    // handle syntax error
}
catch (Zero_divide) {
    // handle divide by zero
}
// we get here if expr didn't cause an exception or if a Syntax_error
// or Zero_divide exception was caught (and its handler didn't return,
// throw an exception, or in some other way alter the flow of control).
A list of handlers looks a bit like a switch statement, but there is no need for break statements. The
syntax of a list of handlers differs from the syntax of a list of cases partly for that reason and partly
to indicate that each handler is a scope (§4.9.4).
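The flow of control through a list of handlers can be traced with a small sketch. Here divide() and evaluate() are hypothetical stand-ins for the parser, and the well_formed flag merely simulates the detection of a syntax error.

```cpp
#include <string>

struct Zero_divide { };

struct Syntax_error {
    const char* p;
    Syntax_error(const char* q) : p(q) {}
};

double divide(double a, double b)
{
    if (b == 0) throw Zero_divide();
    return a / b;
}

std::string evaluate(double a, double b, bool well_formed)
{
    std::string status = "ok";
    try {
        if (!well_formed) throw Syntax_error("primary expected");
        divide(a, b);
        // reached only if nothing above threw
    }
    catch (const Syntax_error& e) {
        status = std::string("syntax error: ") + e.p;
    }
    catch (const Zero_divide&) {
        status = "divide by zero";
    }
    // reached after normal completion and after either handler falls through
    return status;
}
```

At most one handler runs for a given exception, and no break is needed to keep control from reaching the next one.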
A function need not catch all possible exceptions. For example, the previous try-block didn’t
try to catch exceptions potentially generated by the parser’s input operations. Those exceptions
simply ‘‘pass through,’’ searching for a caller with an appropriate handler.
From the language’s point of view, an exception is considered handled immediately upon entry
into its handler so that any exceptions thrown while executing a handler must be dealt with by the
callers of the try-block. For example, this does not cause an infinite loop:
class input_overflow { /* ... */ };
void f()
{
    try {
        // ...
    }
    catch (input_overflow) {
        // ...
        throw input_overflow();
    }
}
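That the handler is entered only once can be verified with a sketch like the following. The counter handler_entries and the caller escapes_once() are hypothetical additions, and the try-block here throws unconditionally so that the handler is certain to run.

```cpp
struct input_overflow { };

int handler_entries = 0;   // counts how often f()'s handler is entered

void f()
{
    try {
        throw input_overflow();
    }
    catch (input_overflow) {
        ++handler_entries;
        throw input_overflow();   // not caught by this handler; goes to f()'s caller
    }
}

// Returns true if the second exception reached the caller after
// exactly one entry into f()'s handler.
bool escapes_once()
{
    try {
        f();
    }
    catch (input_overflow) {
        return handler_entries == 1;
    }
    return false;
}
```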
Exception handlers can be nested. For example:
class XXII { /* ... */ };
void f()
{
    // ...
    try {
        // ...
    }
    catch (XXII) {
        try {
            // something complicated
        }
        catch (XXII) {
            // complicated handler code failed
        }
    }
    // ...
}
However, such nesting is rare in human-written code and is more often than not an indication of
poor style.
8.3.3 Exceptions in the Calculator [name.calc]
Given the basic exception-handling mechanism, we can rework the calculator example from §6.1 to
separate the handling of errors found at run-time from the main logic of the calculator. This will
result in an organization of the program that more realistically matches what is found in programs
built from separate, loosely connected parts.
First, error() can be eliminated. Instead, the parser functions know only the types used to signal errors:
namespace Error {
    struct Zero_divide { };
    struct Syntax_error {
        const char* p;
        Syntax_error(const char* q) { p = q; }
    };
}
The parser detects three syntax errors:
    Token_value Lexer::get_token()
    {
        using namespace std;    // to use cin, isalpha(), etc.
        // ...
        default:                // NAME, NAME =, or error
            if (isalpha(ch)) {
                cin.putback(ch);
                cin >> string_value;
                return curr_tok=NAME;
            }
            throw Error::Syntax_error("bad token");
        }
    }

    double Parser::prim(bool get)    // handle primaries
    {
        // ...
        case Lexer::LP:
        {
            double e = expr(true);
            if (curr_tok != Lexer::RP) throw Error::Syntax_error("`)' expected");
            get_token();    // eat ')'
            return e;
        }
        case Lexer::END:
            return 1;
        default:
            throw Error::Syntax_error("primary expected");
        }
    }
When a syntax error is detected, throw is used to transfer control to a handler defined in some
(direct or indirect) caller. The throw operator also passes a value to the handler. For example,
    throw Syntax_error("primary expected");
passes a Syntax_error object containing a pointer to the string primary expected to the handler.
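A minimal sketch of how the value travels from the throw point to the handler follows. The functions report_primary_error() and message_from_handler() are assumptions made for illustration; only Error::Syntax_error comes from the text.

```cpp
#include <string>

namespace Error {
    struct Syntax_error {
        const char* p;
        Syntax_error(const char* q) { p = q; }
    };
}

// Hypothetical stand-in for a parser function that always fails.
void report_primary_error()
{
    throw Error::Syntax_error("primary expected");
}

// Hypothetical caller: returns the message carried by the exception.
std::string message_from_handler()
{
    try {
        report_primary_error();
    }
    catch (const Error::Syntax_error& e) {
        return e.p;     // p points to a string literal, so it is still valid here
    }
    return "";
}
```

The handler here catches by reference to const, which avoids copying the exception object; catching by value, as in the calculator, works equally well for a type this small.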
Reporting a divide-by-zero error doesn’t require any data to be passed along:
    double Parser::term(bool get)    // multiply and divide
    {
        // ...
        case Lexer::DIV:
            if (double d = prim(true)) {
                left /= d;
                break;
            }
            throw Error::Zero_divide();
        // ...
    }
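The same pattern can be tried in isolation. In this sketch, divide() and division_fails() are hypothetical helpers that are not part of the calculator; they only mirror the declaration-in-condition idiom used in Parser::term().

```cpp
namespace Error {
    struct Zero_divide { };
}

// Hypothetical helper: divides left by divisor, throwing an empty
// Zero_divide object when the divisor is zero.
double divide(double left, double divisor)
{
    if (double d = divisor) {   // branch taken only when d is nonzero
        return left / d;
    }
    throw Error::Zero_divide();
}

// Hypothetical wrapper: true if the division threw Zero_divide.
bool division_fails(double left, double divisor)
{
    try {
        divide(left, divisor);
    }
    catch (Error::Zero_divide) {
        return true;
    }
    return false;
}
```

Because Zero_divide has no members, the handler needs only the type of the exception to know what went wrong.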
The driver can now be defined to handle Zero_divide and Syntax_error exceptions. For example:
    int main(int argc, char* argv[])
    {
        // ...
        while (*input) {
            try {
                Lexer::get_token();
                if (Lexer::curr_tok == Lexer::END) break;
                if (Lexer::curr_tok == Lexer::PRINT) continue;
                cout << Parser::expr(false) << '\n';
            }
            catch(Error::Zero_divide) {
                cerr << "attempt to divide by zero\n";
                skip();
            }
            catch(Error::Syntax_error e) {
                cerr << "syntax error:" << e.p << "\n";
                skip();
            }
        }
        if (input != &cin) delete input;
        return no_of_errors;
    }
The function skip() tries to bring the parser into a well-defined state after an error by skipping
tokens until it finds an end-of-line or a semicolon. It, no_of_errors, and input are obvious candidates for a Driver namespace:
    namespace Driver {
        int no_of_errors;
        std::istream* input;
        void skip();
    }

    void Driver::skip()
    {
        no_of_errors++;
        while (*input) {
            char ch;
            input->get(ch);
            switch (ch) {
            case '\n':
            case ';':
                input->get(ch);
                return;
            }
        }
    }
The code for sskkiipp() is deliberately written at a lower level of abstraction than the parser code so as
to avoid being caught by exceptions from the parser while handling parser exceptions.
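The character-level approach can be sketched on its own. The function skip_to_separator() below is a hypothetical variant that takes the stream as a parameter instead of using Driver::input, does not count errors, and stops immediately after the separator; rest_after_skip() is a hypothetical helper for trying it out.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical variant of skip(): discards characters up to and
// including the next newline or semicolon.
void skip_to_separator(std::istream& in)
{
    char ch;
    while (in.get(ch)) {
        if (ch == '\n' || ch == ';') return;
    }
}

// Hypothetical helper: returns the rest of the line after skipping.
std::string rest_after_skip(const std::string& text)
{
    std::istringstream in(text);
    skip_to_separator(in);
    std::string rest;
    std::getline(in, rest);
    return rest;
}
```

Since only get() is used, nothing in this code can throw one of the parser's exceptions while recovery is in progress.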
I retained the idea of counting the number of errors and reporting that number as the program’s
return value. It is often useful to know if a program encountered an error even if it was able to
recover from it.
I did not put main() in the Driver namespace. The global main() is the initial function of a
program (§3.2); a main() in another namespace has no special meaning.
8.3.3.1 Alternative Error-Handling Strategies [name.strategy]
The original error-handling code was shorter and more elegant than the version using exceptions.
However, it achieved that elegance by tightly coupling all parts of the program. That approach
doesn’t scale well to programs composed of separately developed libraries.
We could consider eliminating the separate error-handling function skip() by introducing a
state variable in main(). For example:
    int main(int argc, char* argv[])    // example of poor style
    {
        // ...
        bool in_error = false;
        while (*Driver::input) {
            try {
                Lexer::get_token();
                if (Lexer::curr_tok == Lexer::END) break;
                if (Lexer::curr_tok == Lexer::PRINT) {
                    in_error = false;
                    continue;
                }
                if (in_error == false) cout << Parser::expr(false) << '\n';
            }
            catch(Error::Zero_divide) {
                cerr << "attempt to divide by zero\n";
                in_error = true;
            }
            catch(Error::Syntax_error e) {
                cerr << "syntax error:" << e.p << "\n";
                in_error = true;
            }
        }
        if (Driver::input != std::cin) delete Driver::input;
        return Driver::no_of_errors;
    }
I consider this a bad idea for several reasons:
[1] State variables are a common source of confusion and errors, especially if they are allowed
to proliferate and affect larger sections of a program. In particular, I consider the version of
main() using in_error less readable than the version using skip().
[2] It is generally a good strategy to keep error handling and ‘‘normal’’ code separate.
[3] Doing error handling using the same level of abstraction as the code that caused the error is
hazardous; the error-handling code might repeat the same error that triggered the error handling in the first place. I leave it as an exercise to find how that can happen for the version
of main() using in_error (§8.5[7]).
[4] It is more work to modify the ‘‘normal’’ code to add error-handling code than to add separate error-handling routines.
Exception handling is intended for dealing with nonlocal problems. If an error can be handled
locally, it almost always should be. For example, there is no reason to use an exception to handle
the too-many-arguments error:
    int main(int argc, char* argv[])
    {
        using namespace std;
        using namespace Driver;

        switch (argc) {
        case 1:                     // read from standard input
            input = &cin;
            break;
        case 2:                     // read argument string
            input = new istringstream(argv[1]);
            break;
        default:
            cerr << "too many arguments\n";
            return 1;
        }
        // as before
    }
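The same local treatment can be packaged as a function. This is only a sketch: select_input() is a hypothetical helper, not part of the calculator, and it reports the problem to its immediate caller through a null return value rather than by throwing.

```cpp
#include <iostream>
#include <sstream>

// Hypothetical helper: picks the input stream from the argument count.
// The too-many-arguments case is handled locally, without an exception.
std::istream* select_input(int argc, char* argv[])
{
    switch (argc) {
    case 1:
        return &std::cin;                           // read from standard input
    case 2:
        return new std::istringstream(argv[1]);     // read argument string
    default:
        std::cerr << "too many arguments\n";
        return 0;                                   // the caller checks for this
    }
}
```

As in main(), a stream that was allocated with new must be deleted by the caller; a pointer equal to &std::cin must not be.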
Exceptions are discussed further in Chapter 14.
8.4 Advice [name.advice]
[1] Use namespaces to express logical structure; §8.2.
[2] Place every nonlocal name, except main(), in some namespace; §8.2.
[3] Design a namespace so that you can conveniently use it without accidentally gaining access to
unrelated namespaces; §8.2.4.
[4] Avoid very short names for namespaces; §8.2.7.
[5] If necessary, use namespace aliases to abbreviate long namespace names; §8.2.7.
[6] Avoid placing heavy notational burdens on users of your namespaces; §8.2.2, §8.2.3.
[7] Use the Namespace::member notation when defining namespace members; §8.2.8.
[8] Use using namespace only for transition or within a local scope; §8.2.9.
[9] Use exceptions to decouple the treatment of ‘‘errors’’ from the code dealing with the ordinary
processing; §8.3.3.
[10] Use user-defined rather than built-in types as exceptions; §8.3.2.
[11] Don’t use exceptions when local control structures are sufficient; §8.3.3.1.
8.5 Exercises [name.exercises]
1. (∗2.5) Write a doubly-linked list of string module in the style of the Stack module from §2.4.
Exercise it by creating a list of names of programming languages. Provide a sort() function
for that list, and provide a function that reverses the order of the strings in it.
2. (∗2) Take some not-too-large program that uses at least one library that does not use namespaces and modify it to use a namespace for that library. Hint: §8.2.9.
3. (∗2) Modify the desk calculator program into a module in the style of §2.4 using namespaces.
Don’t use any global using-directives. Keep a record of the mistakes you made. Suggest ways
of avoiding such mistakes in the future.
4. (∗1) Write a program that throws an exception in one function and catches it in another.
5. (∗2) Write a program consisting of functions calling each other to a calling depth of 10. Give
each function an argument that determines at which level an exception is thrown. Have
main() catch these exceptions and print out which exception is caught. Don’t forget the case
in which an exception is caught in the function that throws it.
6. (∗2) Modify the program from §8.5[5] to measure if there is a difference in the cost of catching
exceptions depending on where in a call stack the exception is thrown. Add a string object to
each function and measure again.
7. (∗1) Find the error in the first version of main() in §8.3.3.1.
8. (∗2) Write a function that either returns a value or that throws that value based on an argument.
Measure the difference in run-time between the two ways.
9. (∗2) Modify the calculator version from §8.5[3] to use exceptions. Keep a record of the mistakes you make. Suggest ways of avoiding such mistakes in the future.
10. (∗2.5) Write plus(), minus(), multiply(), and divide() functions that check for possible
overflow and underflow and that throw exceptions if such errors happen.
11. (∗2) Modify the calculator to use the functions from §8.5[10].
9
Source Files and Programs
Form must follow function.
– Le Corbusier
Separate compilation; linking; header files; standard library headers; the one-definition rule; linkage to non-C++ code; linkage and pointers to functions; using
headers to express modularity; single-header organization; multiple-header organization; include guards; programs; advice; exercises.
9.1 Separate Compilation [file.separate]
A file is the traditional unit of storage (in a file system) and the traditional unit of compilation.
There are systems that do not store, compile, and present C++ programs to the programmer as sets
of files. However, the discussion here will concentrate on systems that employ the traditional use
of files.
Having a complete program in one file is usually impossible. In particular, the code for the
standard libraries and the operating system is typically not supplied in source form as part of a
user’s program. For realistically-sized applications, even having all of the user’s own code in a single file is both impractical and inconvenient. The way a program is organized into files can help
emphasize its logical structure, help a human reader understand the program, and help the compiler
to enforce that logical structure. Where the unit of compilation is a file, all of a file must be recompiled whenever a change (however small) has been made to it or to something on which it depends.
For even a moderately sized program, the amount of time spent recompiling can be significantly
reduced by partitioning the program into files of suitable size.
A user presents a source file to the compiler. The file is then preprocessed; that is, macro processing (§7.8) is done and #include directives bring in headers (§2.4.1, §9.2.1). The result of preprocessing is called a translation unit. This unit is what the compiler proper works on and what the
C++ language rules describe. In this book, I differentiate between source file and translation unit
only where necessary to distinguish what the programmer sees from what the compiler considers.
To enable separate compilation, the programmer must supply declarations providing the type
information needed to analyze a translation unit in isolation from the rest of the program. The
declarations in a program consisting of many separately compiled parts must be consistent in
exactly the same way the declarations in a program consisting of a single source file must be. Your
system will have tools to help ensure this. In particular, the linker can detect many kinds of inconsistencies. The linker is the program that binds together the separately compiled parts. A linker is
sometimes (confusingly) called a loader. Linking can be done completely before a program starts
to run. Alternatively, new code can be added to the program (‘‘dynamically linked’’) later.
The organization of a program into source files is commonly called the physical structure of a
program. The physical separation of a program into separate files should be guided by the logical
structure of the program. The same dependency concerns that guide the composition of programs
out of namespaces guide its composition into source files. However, the logical and physical structure of a program need not be identical. For example, it can be useful to use several source files to
store the functions from a single namespace, to store a collection of namespace definitions in a single file, and to scatter the definition of a namespace over several files (§8.2.4).
Here, we will first consider some technicalities relating to linking and then discuss two ways of
breaking the desk calculator (§6.1, §8.2) into files.
9.2 Linkage [file.link]
Names of functions, classes, templates, variables, namespaces, enumerations, and enumerators
must be used consistently across all translation units unless they are explicitly specified to be local.
It is the programmer’s task to ensure that every namespace, class, function, etc. is properly
declared in every translation unit in which it appears and that all declarations referring to the same
entity are consistent. For example, consider two files:
    // file1.c:
        int x = 1;
        int f() { /* do something */ }

    // file2.c:
        extern int x;
        int f();
        void g() { x = f(); }
The x and f() used by g() in file2.c are the ones defined in file1.c. The keyword extern indicates that the declaration of x in file2.c is (just) a declaration and not a definition (§4.9). Had x
been initialized, extern would simply be ignored because a declaration with an initializer is always
a definition. An object must be defined exactly once in a program. It may be declared many times,
but the types must agree exactly. For example:
    // file1.c:
        int x = 1;
        int b = 1;
        extern int c;

    // file2.c:
        int x;              // meaning int x = 0;
        extern double b;
        extern int c;
There are three errors here: x is defined twice, b is declared twice with different types, and c is
declared twice but not defined. These kinds of errors (linkage errors) cannot be detected by a compiler that looks at only one file at a time. Most, however, are detectable by the linker. Note that a
variable defined without an initializer in the global or a namespace scope is initialized by default.
This is not the case for local variables (§4.9.5, §10.4.2) or objects created on the free store (§6.2.6).
For example, the following program fragment contains two errors:
    // file1.c:
        int x;
        int f() { return x; }

    // file2.c:
        int x;
        int g() { return f(); }
The call of f() in file2.c is an error because f() has not been declared in file2.c. Also, the program will not link because x is defined twice. Note that these are not errors in C (§B.2.2).
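Returning to the note that a variable defined without an initializer in the global or a namespace scope is initialized by default, a minimal single-file sketch can confirm it. The names global_counter, Settings::scale, and globals_start_at_zero() are assumptions made for illustration.

```cpp
int global_counter;             // no initializer: initialized to 0 by default

namespace Settings {            // hypothetical namespace, for illustration only
    double scale;               // also initialized to 0 by default
}

// Hypothetical check: true if both variables hold zero.
// It is meaningful only if called before anything assigns to them.
bool globals_start_at_zero()
{
    return global_counter == 0 && Settings::scale == 0.0;
}
```

The same check on an uninitialized local variable would read an indeterminate value, which is why the distinction matters.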
A name that can be used in translation units different from the one in which it was defined is
said to have external linkage. All the names in the previous examples have external linkage. A
name that can be referred to only in the translation unit in which it is defined is said to have
internal linkage.
An inline function (§7.1.1, §10.2.9) must be defined, by identical definitions (§9.2.3), in
every translation unit in which it is used. Consequently, the following example isn’t just bad taste;
it is illegal:
    // file1.c:
        inline int f(int i) { return i; }

    // file2.c:
        inline int f(int i) { return i+1; }
Unfortunately, this error is hard for an implementation to catch, and the following – otherwise perfectly logical – combination of external linkage and inlining is banned to make life simpler for
compiler writers:
    // file1.c:
        extern inline int g(int i);
        int h(int i) { return g(i); }    // error: g() undefined in this translation unit

    // file2.c:
        extern inline int g(int i) { return i+1; }
By default, consts (§5.4) and typedefs (§4.9.7) have internal linkage. Consequently, this example
is legal (although potentially confusing):
    // file1.c:
        typedef int T;
        const int x = 7;

    // file2.c:
        typedef void T;
        const int x = 8;
Global variables that are local to a single compilation unit are a common source of confusion and
are best avoided. To ensure consistency, you should usually place global consts and inlines in
header files only (§9.2.1).
A const can be given external linkage by an explicit declaration:
    // file1.c:
        extern const int a = 77;

    // file2.c:
        extern const int a;

        void g()
        {
            cout << a << '\n';
        }
Here, g() will print 77.
An unnamed namespace (§8.2.5) can be used to make names local to a compilation unit. The
effect of an unnamed namespace is very similar to that of internal linkage. For example:
    // file1.c:
        namespace {
            class X { /* ... */ };
            void f();
            int i;
            // ...
        }

    // file2.c:
        class X { /* ... */ };
        void f();
        int i;
        // ...
The function f() in file1.c is not the same function as the f() in file2.c. Having a name local to
a translation unit and also using that same name elsewhere for an entity with external linkage is
asking for trouble.
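A common use of an unnamed namespace is to hide helper state behind a function that has external linkage. The following is a minimal sketch for a single translation unit; call_count, next_id(), and make_id() are hypothetical names.

```cpp
namespace {                     // names here are local to this translation unit
    int call_count = 0;         // hypothetical helper state
    int next_id() { return ++call_count; }
}

// Has external linkage: other translation units can declare and call it,
// but they cannot name call_count or next_id().
int make_id()
{
    return next_id();
}
```

Another file could define its own call_count or next_id() without clashing with these, though, as noted above, reusing such names invites confusion.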
In C and older C++ programs, the keyword static is (confusingly) used to mean ‘‘use internal
linkage’’ (§B.2.3). Don’t use static except inside functions (§7.1.2) and classes (§10.2.4).
9.2.1 Header Files [file.header]
The types in all declarations of the same object, function, class, etc., must be consistent. Consequently, the source code submitted to the compiler and later linked together must be consistent.
One imperfect but simple method of achieving consistency for declarations in different translation
units is to #include header files containing interface information in source files containing executable code and/or data definitions.
The #include mechanism is a text manipulation facility for gathering source program fragments
together into a single unit (file) for compilation. The directive
    #include "to_be_included"
replaces the line in which the #include appears with the contents of the file to_be_included. The
content should be C++ source text because the compiler will proceed to read it.
To include standard library headers, use the angle brackets < and > around the name instead of
quotes. For example:
    #include <iostream>         // from standard include directory
    #include "myheader.h"       // from current directory
Unfortunately, spaces are significant within the < > or " " of an include directive:
    #include < iostream >       // will not find <iostream>
It may seem extravagant to recompile a file each time it is included somewhere, but the included
files typically contain only declarations and not code needing extensive analysis by the compiler.
Furthermore, most modern C++ implementations provide some form of precompiling of header
files to minimize the work needed to handle repeated compilation of the same header.
As a rule of thumb, a header may contain:
________________________________________________________________________
Named namespaces                      namespace N { /* ... */ }
Type definitions                      struct Point { int x, y; };
Template declarations                 template<class T> class Z;
Template definitions                  template<class T> class V { /* ... */ };
Function declarations                 extern int strlen(const char*);
Inline function definitions           inline char get(char* p) { return *p++; }
Data declarations                     extern int a;
Constant definitions                  const float pi = 3.141593;
Enumerations                          enum Light { red, yellow, green };
Name declarations                     class Matrix;
Include directives                    #include <algorithm>
Macro definitions                     #define VERSION 12
Conditional compilation directives    #ifdef __cplusplus
Comments                              /* check for end of file */
________________________________________________________________________
This rule of thumb for what may be placed in a header is not a language requirement. It is simply a
reasonable way of using the #include mechanism to express the physical structure of a program.
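As an illustration of the rule of thumb, here is a sketch of what a small header might look like. The file name geometry.h and every name inside it are invented for this example; each line is annotated with the table entry it corresponds to.

```cpp
// geometry.h (hypothetical header; all names are invented for illustration)
namespace Geometry {                                    // named namespace
    struct Point { int x, y; };                         // type definition
    const int dimensions = 2;                           // constant definition
    enum Axis { x_axis, y_axis };                       // enumeration
    class Shape;                                        // name declaration
    int distance_squared(Point a, Point b);             // function declaration
    inline int sum(Point p) { return p.x + p.y; }       // inline function definition
}
```

The definition of distance_squared() would go in exactly one .c file that includes this header, so that the compiler can check the definition against the declaration.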
Conversely, a header should never contain:
______________________________________________________________________
Ordinary function definitions    char get(char* p) { return *p++; }
Data definitions                 int a;
Aggregate definitions            short tbl[] = { 1, 2, 3 };
Unnamed namespaces               namespace { /* ... */ }
Exported template definitions    export template<class T> f(T t) { /* ... */ }
______________________________________________________________________
Header files are conventionally suffixed by .h, and files containing function or data definitions are
suffixed by .c. They are therefore often referred to as ‘‘.h files’’ and ‘‘.c files,’’ respectively.
Other conventions, such as .C, .cxx, .cpp, and .cc, are also found. The manual for your compiler will be quite specific about this issue.
The reason for recommending that the definition of simple constants, but not the definition of
aggregates, be placed in header files is that it is hard for implementations to avoid replication of
aggregates presented in several translation units. Furthermore, the simple cases are far more common and therefore more important for generating good code.
It is wise not to be too clever about the use of #include. My recommendation is to #include
only complete declarations and definitions and to do so only in the global scope, in linkage specification blocks, and in namespace definitions when converting old code (§9.2.2). As usual, it is wise
to avoid macro magic. One of my least favorite activities is tracking down an error caused by a
name being macro-substituted into something completely different by a macro defined in an indirectly #included header that I have never even heard of.
9.2.2 Standard Library Headers [file.std.header]
The facilities of the standard library are presented through a set of standard headers (§16.1.2). No
suffix is needed for standard library headers; they are known to be headers because they are
included using the #include<...> syntax rather than #include"...". The absence of a .h suffix does not imply anything about how the header is stored. A header such as <map> may be
stored as a text file called map.h in a standard directory. On the other hand, standard headers are
not required to be stored in a conventional manner. An implementation is allowed to take advantage of knowledge of the standard library definition to optimize the standard library implementation
and the way standard headers are handled. For example, an implementation might have knowledge
of the standard math library (§22.3) built in and treat #include<cmath> as a switch that makes the
standard math functions available without reading any file.
For each C standard-library header <X.h>, there is a corresponding standard C++ header <cX>.
For example, #include<cstdio> provides what #include<stdio.h> does. A typical stdio.h will
look something like this:
    #ifdef __cplusplus      // for C++ compilers only (§9.2.4)
    namespace std {         // the standard library is defined in namespace std (§8.2.9)
    extern "C" {            // stdio functions have C linkage (§9.2.4)
    #endif
        // ...
        int printf(const char* ...);
        // ...
    #ifdef __cplusplus
    }
    }
    using namespace std;    // make stdio available in global namespace
    #endif
That is, the actual declarations are (most likely) shared, but linkage and namespace issues must be
addressed to allow C and C++ to share a header.
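User code can apply the same conditional wrapping to its own headers, minus the namespace std part, which is reserved for the standard library. The sketch below is an assumption-laden illustration: add_ints() is a hypothetical function, and its definition is placed in the same file only to keep the example self-contained.

```cpp
// Hypothetical contents of a header shared by C and C++ code.
#ifdef __cplusplus
extern "C" {                        // seen by C++ compilers only
#endif

int add_ints(int a, int b);         // hypothetical function with C linkage

#ifdef __cplusplus
}
#endif

// Hypothetical definition; in a real project it would live in a .c file
// compiled by a C compiler or in one C++ source file.
extern "C" int add_ints(int a, int b)
{
    return a + b;
}
```

A C compiler never sees the extern "C" lines because __cplusplus is not defined there, so the declarations in between are shared unchanged by both languages.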
9.2.3 The One-Definition Rule [file.odr]
A given class, enumeration, and template, etc., must be defined exactly once in a program.
From a practical point of view, this means that there must be exactly one definition of, say, a
class residing in a single file somewhere. Unfortunately, the language rule cannot be that simple.
For example, the definition of a class may be composed through macro expansion (ugh!), while a
definition of a class may be textually included in two source files by #include directives (§9.2.1).
Worse, a ‘‘file’’ isn’t a concept that is part of the C and C++ language definitions; there exist implementations that do not store programs in source files.
Consequently, the rule in the standard that says that there must be a unique definition of a class,
template, etc., is phrased in a somewhat more complicated and subtle manner. This rule is commonly referred to as ‘‘the one-definition rule,’’ the ODR. That is, two definitions of a class, template, or inline function are accepted as examples of the same unique definition if and only if
[1] they appear in different translation units, and
[2] they are token-for-token identical, and
[3] the meanings of those tokens are the same in both translation units.
For example:
    // file1.c:
        struct S { int a; char b; };
        void f(S*);

    // file2.c:
        struct S { int a; char b; };
        void f(S* p) { /* ... */ }
The ODR says that this example is valid and that S refers to the same class in both source files.
However, it is unwise to write out a definition twice like that. Someone maintaining file2.c will
naturally assume that the definition of S in file2.c is the only definition of S and so feel free to
change it. This could introduce a hard-to-detect error.
The intent of the ODR is to allow inclusion of a class definition in different translation units
from a common source file. For example:
    // file s.h:
        struct S { int a; char b; };
        void f(S*);
    // file1.c:
        #include "s.h"
        // use f() here

    // file2.c:
        #include "s.h"
        void f(S* p) { /* ... */ }
or graphically:
[Figure: s.h, containing struct S { int a; char b; }; and void f(S*);, is #included by both file1.c, which uses f(), and file2.c, which defines f(S* p).]
Here are examples of the three ways of violating the ODR:
    // file1.c:
        struct S1 { int a; char b; };

        struct S1 { int a; char b; };    // error: double definition
This is an error because a struct may not be defined twice in a single translation unit.
    // file1.c:
        struct S2 { int a; char b; };

    // file2.c:
        struct S2 { int a; char bb; };   // error
This is an error because S2 is used to name classes that differ in a member name.
    // file1.c:
        typedef int X;
        struct S3 { X a; char b; };

    // file2.c:
        typedef char X;
        struct S3 { X a; char b; };      // error
Here the two definitions of S3 are token-for-token identical, but the example is an error because the
meaning of the name X has sneakily been made to differ in the two files.
Checking against inconsistent class definitions in separate translation units is beyond the ability
of most C++ implementations. Consequently, declarations that violate the ODR can be a source of
subtle errors. Unfortunately, the technique of placing shared definitions in headers and #including
them doesn’t protect against this last form of ODR violation. Local typedefs and macros can
change the meaning of #included declarations:
// file s.h:
struct S { Point a; char b; };

// file1.c:
#define Point int
#include "s.h"
// ...

// file2.c:
class Point { /* ... */ };
#include "s.h"
// ...
The best defense against this kind of hackery is to make headers as self-contained as possible. For
example, if class Point had been declared in the s.h header the error would have been detected.
A template definition can be #included in several translation units as long as the ODR is
adhered to. In addition, an exported template can be used given only a declaration:
// file1.c:
export template<class T> T twice(T t) { return t+t; }

// file2.c:
template<class T> T twice(T t);    // declaration
int g(int i) { return twice(i); }
The keyword export means ‘‘accessible from another translation unit’’ (§13.7).
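The first alternative, including the template definition wherever it is used, can be sketched as follows. The header name twice.h is an assumption made for illustration. Note also that C++11 and later standards removed export for templates, so this is the form that current compilers accept.

```cpp
// twice.h (hypothetical header name): the template definition is placed
// here so that every translation unit that includes it sees the same tokens.
template<class T> T twice(T t) { return t + t; }

// file2.c would #include "twice.h" and then simply use the template:
int g(int i) { return twice(i); }
```

Because every translation unit sees the identical definition, the ODR is satisfied even though the definition is compiled more than once.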
9.2.4 Linkage to Non-C++ Code [file.c]
Typically, a C++ program contains parts written in other languages. Similarly, it is common for
C++ code fragments to be used as parts of programs written mainly in some other language. Cooperation can be difficult between program fragments written in different languages and even between
fragments written in the same language but compiled with different compilers. For example, different languages and different implementations of the same language may differ in their use of
machine registers to hold arguments, the layout of arguments put on a stack, the layout of built-in
types such as strings and integers, the form of names passed by the compiler to the linker, and the
amount of type checking required from the linker. To help, one can specify a linkage convention to
be used in an extern declaration. For example, this declares the C and C++ standard library function strcpy() and specifies that it should be linked according to the C linkage conventions:
extern "C" char* strcpy(char*, const char*);
The effect of this declaration differs from the effect of the ‘‘plain’’ declaration
extern char* strcpy(char*, const char*);
only in the linkage convention used for calling strcpy().
The extern "C" directive is particularly useful because of the close relationship between C and
C++. Note that the C in extern "C" names a linkage convention and not a language. Often, extern "C"
is used to link to Fortran and assembler routines that happen to conform to the conventions of a
C implementation.
An extern "C" directive specifies the linkage convention (only) and does not affect the semantics of calls to the function. In particular, a function declared extern "C" still obeys the C++ type
checking and argument conversion rules and not the weaker C rules. For example:
extern "C" int f();

int g()
{
    return f(1);    // error: no argument expected
}
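A function with C linkage can also be defined in C++, and calls to it are still checked by the C++ rules. The following is a minimal sketch; the names add_c and sum_example are hypothetical and chosen only for illustration.

```cpp
// A function with C linkage, defined in C++ (the name add_c is hypothetical).
extern "C" int add_c(int a, int b)
{
    return a + b;
}

int sum_example()
{
    // add_c(1);        // error in C++: too few arguments, despite C linkage
    return add_c(2, 3); // checked and converted according to the C++ rules
}
```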
Adding extern "C" to a lot of declarations can be a nuisance. Consequently, there is a mechanism
to specify linkage to a group of declarations. For example:
extern "C" {
    char* strcpy(char*, const char*);
    int strcmp(const char*, const char*);
    int strlen(const char*);
    // ...
}
This construct, commonly called a linkage block, can be used to enclose a complete C header to
make a header suitable for C++ use. For example:
extern "C" {
#include <string.h>
}
This technique is commonly used to produce a C++ header from a C header. Alternatively, conditional compilation (§7.8.1) can be used to create a common C and C++ header:
#ifdef __cplusplus
extern "C" {
#endif
    char* strcpy(char*, const char*);
    int strcmp(const char*, const char*);
    int strlen(const char*);
    // ...
#ifdef __cplusplus
}
#endif
The predefined macro name __cplusplus is used to ensure that the C++ constructs are edited out
when the file is used as a C header.
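The same pattern applies to a header of one's own. The sketch below shows a declaration wrapped for use from both C and C++, followed by a matching definition. The header name mylen.h and the function my_strlen are hypothetical, and the header and source parts are shown in one file only to keep the example self-contained.

```cpp
// mylen.h (hypothetical header usable from both C and C++):
#ifdef __cplusplus
extern "C" {
#endif

int my_strlen(const char* s);   /* C linkage when compiled as C++ */

#ifdef __cplusplus
}
#endif

// Source file part: the definition matches the declaration above and so
// keeps the C linkage specified there.
int my_strlen(const char* s)
{
    int n = 0;
    while (s[n] != '\0') ++n;   // count characters up to the terminator
    return n;
}
```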
Any declaration can appear within a linkage block:
extern "C" {    // any declaration here, for example:
    int g1;           // definition
    extern int g2;    // declaration, not definition
}
In particular, the scope and storage class of variables are not affected, so g1 is still a global variable
and is still defined rather than just declared. To declare but not define a variable, you must apply
the keyword extern directly in the declaration. For example:
extern "C" int g3;    // declaration, not definition
This looks odd at first glance. However, it is a simple consequence of keeping the meaning
unchanged when adding "C" to an extern declaration and the meaning of a file unchanged when
enclosing it in a linkage block.
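The distinction can be made concrete in a short sketch. The initial values and the function sum_of_globals are assumptions added for illustration; the definitions of g2 and g3 would normally live in some other source file and appear here only to keep the example self-contained.

```cpp
extern "C" {
    int g1;             // definition: g1 is zero-initialized
    extern int g2;      // declaration only
}

extern "C" int g3;      // declaration only, because extern applies directly

// Definitions for g2 and g3 (values chosen arbitrarily for the sketch).
extern "C" {
    int g2 = 2;
    int g3 = 3;
}

// Hypothetical helper that uses all three variables.
int sum_of_globals() { return g1 + g2 + g3; }
```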
A name with C linkage can be declared in a namespace. The namespace will affect the way the
name is accessed in the C++ program, but not the way a linker sees it. The printf() from std is a
typical example:
#include<cstdio>

void f()
{
    std::printf("Hello, ");    // ok
    printf("world!\n");        // error: no global printf()
}
Even when called std::printf, it is still the same old C printf() (§21.8).
Note that this allows us to include libraries with C linkage into a namespace of our choice rather
than polluting the global namespace. Unfortunately, the same flexibility is not available to us for
headers defining functions with C++ linkage in the global namespace. The reason is that linkage of
C++ entities must take namespaces into account so that the object files generated will reflect the use
or lack of use of namespaces.
9.2.5 Linkage and Pointers to Functions [file.ptof]
When mixing C and C++ code fragments in one program, we sometimes want to pass pointers to
functions defined in one language to functions defined in the other. If the two implementations of
the two languages share linkage conventions and function-call mechanisms, such passing of pointers to functions is trivial. However, such commonality cannot in general be assumed, so care must
be taken to ensure that a function is called the way it expects to be called.
When linkage is specified for a declaration, the specified linkage applies to all function types,
function names, and variable names introduced by the declaration(s). This makes all kinds of
strange – and occasionally essential – combinations of linkage possible. For example:
typedef int (*FT)(const void*, const void*);    // FT has C++ linkage

extern "C" {
    typedef int (*CFT)(const void*, const void*);         // CFT has C linkage
    void qsort(void* p, size_t n, size_t sz, CFT cmp);    // cmp has C linkage
}

void isort(void* p, size_t n, size_t sz, FT cmp);     // cmp has C++ linkage
void xsort(void* p, size_t n, size_t sz, CFT cmp);    // cmp has C linkage
extern "C" void ysort(void* p, size_t n, size_t sz, FT cmp);    // cmp has C++ linkage

int compare(const void*, const void*);            // compare() has C++ linkage
extern "C" int ccmp(const void*, const void*);    // ccmp() has C linkage
void f(char* v, int sz)
{
    qsort(v,sz,1,&compare);    // error
    qsort(v,sz,1,&ccmp);       // ok

    isort(v,sz,1,&compare);    // ok
    isort(v,sz,1,&ccmp);       // error
}
An implementation in which C and C++ use the same calling conventions might accept the cases
marked error as a language extension.
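In practice, the portable way to call the C library's qsort() is to give the comparison function C linkage. The following is a minimal sketch of that case; the names ccmp_int and sort_ints are hypothetical.

```cpp
#include <cstdlib>

// Comparison function with C linkage, matching what the C library's
// qsort() expects (the name ccmp_int is hypothetical).
extern "C" int ccmp_int(const void* p, const void* q)
{
    int a = *static_cast<const int*>(p);
    int b = *static_cast<const int*>(q);
    return (a > b) - (a < b);   // negative, zero, or positive
}

// Sorts n ints in place using the standard library qsort().
void sort_ints(int* v, std::size_t n)
{
    std::qsort(v, n, sizeof(int), &ccmp_int);
}
```

Writing the comparison as (a > b) - (a < b) rather than a - b avoids overflow for values of large magnitude.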
9.3 Using Header Files [file.using]
To illustrate the use of headers, I present a few alternative ways of expressing the physical structure
of the calculator program (§6.1, §8.2).
9.3.1 Single Header File [file.single]
The simplest solution to the problem of partitioning a program into several files is to put the definitions in a suitable number of .c files and to declare the types needed for them to communicate in a
single .h file that each .c file #includes. For the calculator program, we might use five .c files
(lexer.c, parser.c, table.c, error.c, and main.c) to hold function and data definitions, plus the
header dc.h to hold the declarations of every name used in more than one .c file.
The header dc.h would look like this:
// dc.h:

namespace Error {
    struct Zero_divide { };

    struct Syntax_error {
        const char* p;
        Syntax_error(const char* q) { p = q; }
    };
}

#include <string>

namespace Lexer {
    enum Token_value {
        NAME,         NUMBER,       END,
        PLUS='+',     MINUS='-',    MUL='*',    DIV='/',
        PRINT=';',    ASSIGN='=',   LP='(',     RP=')'
    };

    extern Token_value curr_tok;
    extern double number_value;
    extern std::string string_value;

    Token_value get_token();
}

namespace Parser {
    double prim(bool get);    // handle primaries
    double term(bool get);    // multiply and divide
    double expr(bool get);    // add and subtract

    using Lexer::get_token;
    using Lexer::curr_tok;
}

#include <map>

extern std::map<std::string,double> table;

namespace Driver {
    extern int no_of_errors;
    extern std::istream* input;
    void skip();
}
The keyword extern is used for every declaration of a variable to ensure that multiple definitions do
not occur as we #include dc.h in the various .c files. The corresponding definitions are found in
the appropriate .c files.
Leaving out the actual code, lexer.c will look something like this:
// lexer.c:

#include "dc.h"
#include <iostream>
#include <cctype>

Lexer::Token_value Lexer::curr_tok;
double Lexer::number_value;
std::string Lexer::string_value;

Lexer::Token_value Lexer::get_token() { /* ... */ }
Using headers in this manner ensures that every declaration in a header will at some point be
included in the file containing its definition. For example, when compiling lexer.c the compiler
will be presented with:
namespace Lexer {    // from dc.h
    // ...
    Token_value get_token();
}

// ...

Lexer::Token_value Lexer::get_token() { /* ... */ }
This ensures that the compiler will detect any inconsistencies in the types specified for a name. For
example, had get_token() been declared to return a Token_value, but defined to return an int, the
compilation of lexer.c would have failed with a type-mismatch error. If a definition is missing,
the linker will catch the problem. If a declaration is missing, some .c file will fail to compile.
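The consistency check can be seen in a heavily abridged sketch that places the header part and the source part in one file. The reduced enumeration and the placeholder body of get_token() are assumptions made to keep the example short; they are not the calculator's real lexer.

```cpp
// dc.h part (heavily abridged):
namespace Lexer {
    enum Token_value { NAME, NUMBER, END };
    Token_value get_token();            // declaration
}

// lexer.c part: because the declaration above is visible here, the
// compiler checks this definition against it.
Lexer::Token_value Lexer::get_token()
{
    return END;     // placeholder body, not the calculator's real lexer
}

// int Lexer::get_token() { return 0; }   // would not compile: the return
//                                        // type differs from the declaration
```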
File parser.c will look like this:
// parser.c:

#include "dc.h"

double Parser::prim(bool get) { /* ... */ }
double Parser::term(bool get) { /* ... */ }
double Parser::expr(bool get) { /* ... */ }
File table.c will look like this:
// table.c:

#include "dc.h"

std::map<std::string,double> table;
The symbol table is simply a variable of the standard library map type. This defines table to be
global. In a realistically-sized program, this kind of minor pollution of the global namespace builds
up and eventually causes problems. I left this sloppiness here simply to get an opportunity to warn
against it.
Finally, file main.c will look like this:
// main.c:

#include "dc.h"
#include <sstream>

int Driver::no_of_errors = 0;
std::istream* Driver::input = 0;

void Driver::skip() { /* ... */ }

int main(int argc, char* argv[]) { /* ... */ }
To be recognized as the main() of the program, main() must be a global function, so no namespace is used here.
The physical structure of the system can be presented like this:
[Figure: include dependencies among <sstream>, <map>, <string>, <cctype>, <iostream>, dc.h, driver.c, parser.c, table.c, and lexer.c. The standard headers appear at the top, dc.h in the middle, and the four .c files, each of which includes dc.h, at the bottom.]
Note that the headers on the top are all headers for standard library facilities. For many forms of
program analysis, these libraries can be ignored because they are well known and stable. For tiny
programs, the structure can be simplified by moving all #include directives to the common header.
This single-header style of physical partitioning is most useful when the program is small and
its parts are not intended to be used separately. Note that when namespaces are used, the logical
structure of the program is still represented within dc.h. If namespaces are not used, the structure
is obscured, although comments can be a help.
For larger programs, the single header file approach is unworkable in a conventional file-based
development environment. A change to the common header forces recompilation of the whole program, and updates of that single header by several programmers are error-prone. Unless strong
emphasis is placed on programming styles relying heavily on namespaces and classes, the logical
structure deteriorates as the program grows.
9.3.2 Multiple Header Files [file.multi]
An alternative physical organization lets each logical module have its own header defining the
facilities it provides. Each .c file then has a corresponding .h file specifying what it provides (its
interface). Each .c file includes its own .h file and usually also other .h files that specify what it
needs from other modules in order to implement the services advertised in the interface. This physical organization corresponds to the logical organization of a module. The interface for users is put
into its .h file, the interface for implementers is put into a file suffixed _impl.h, and the module’s
definitions of functions, variables, etc. are placed in .c files. In this way, the parser is represented
by three files. The parser’s user interface is provided by parser.h:
// parser.h:

namespace Parser {    // interface for users
    double expr(bool get);
}
The shared environment for the functions implementing the parser is presented by parser_impl.h:
// parser_impl.h:

#include "parser.h"
#include "error.h"
#include "lexer.h"

namespace Parser {    // interface for implementers
    double prim(bool get);
    double term(bool get);
    double expr(bool get);

    using Lexer::get_token;
    using Lexer::curr_tok;
}
The user’s header parser.h is #included to give the compiler a chance to check consistency
(§9.3.1).
The functions implementing the parser are stored in parser.c together with #include directives
for the headers that the Parser functions need:
// parser.c:

#include "parser_impl.h"
#include "table.h"

double Parser::prim(bool get) { /* ... */ }
double Parser::term(bool get) { /* ... */ }
double Parser::expr(bool get) { /* ... */ }
Graphically, the parser and the driver’s use of it look like this:
[Figure: include dependencies among parser.h, lexer.h, error.h, table.h, parser_impl.h, driver.c, and parser.c. parser_impl.h includes parser.h, lexer.h, and error.h; parser.c includes parser_impl.h and table.h.]
As intended, this is a rather close match to the logical structure described in §8.3.3. To simplify
this structure, we could have #included table.h in parser_impl.h rather than in parser.c. However, table.h is an example of something that is not necessary to express the shared context of the
parser functions; it is needed only by their implementation. In fact, it is used by just one function,
expr(), so if we were really keen on minimizing dependencies we could place expr() in its own
.c file and #include table.h there only:
[Figure: the same include dependencies with expr() moved to its own file. parser.c and expr.c both include parser_impl.h; only expr.c includes table.h.]
Such elaboration is not appropriate except for larger modules. For realistically-sized modules, it is
common to #include extra files where needed for individual functions. Furthermore, it is not
uncommon to have more than one _impl.h, since different subsets of the module’s functions need
different shared contexts.
Please note that the _impl.h notation is not a standard or even a common convention; it is simply the way I like to name things.
Why bother with this more complicated scheme of multiple header files? It clearly requires far
less thought simply to throw every declaration into a single header, as was done for dc.h.
The multiple-header organization scales to modules several magnitudes larger than our toy
parser and to programs several magnitudes larger than our calculator. The fundamental reason for
using this type of organization is that it provides a better localization of concerns. When analyzing
and modifying a large program, it is essential for a programmer to focus on a relatively small chunk
of code. The multiple-header organization makes it easy to determine exactly what the parser code
depends on and to ignore the rest of the program. The single-header approach forces us to look at
every declaration used by any module and decide if it is relevant. The simple fact is that maintenance of code is invariably done with incomplete information and from a local perspective. The
multiple-header organization allows us to work successfully ‘‘from the inside out’’ with only a
local perspective. The single-header approach – like every other organization centered around a
global repository of information – requires a top-down approach and will forever leave us wondering exactly what depends on what.
The better localization leads to less information needed to compile a module, and thus to faster
compiles. The effect can be dramatic. I have seen compile times drop by a factor of ten as the
result of a simple dependency analysis leading to a better use of headers.
9.3.2.1 Other Calculator Modules [file.multi.etc]
The remaining calculator modules can be organized similarly to the parser. However, those modules are so small that they don’t require their own _impl.h files. Such files are needed only where
a logical module consists of many functions that need a shared context.
The error handler was reduced to the set of exception types so that no error.c was needed:
// error.h:

namespace Error {
    struct Zero_divide { };

    struct Syntax_error {
        const char* p;
        Syntax_error(const char* q) { p = q; }
    };
}
The lexer provides a rather large and messy interface:
// lexer.h:

#include <string>

namespace Lexer {
    enum Token_value {
        NAME,         NUMBER,       END,
        PLUS='+',     MINUS='-',    MUL='*',    DIV='/',
        PRINT=';',    ASSIGN='=',   LP='(',     RP=')'
    };

    extern Token_value curr_tok;
    extern double number_value;
    extern std::string string_value;

    Token_value get_token();
}
In addition to lexer.h, the implementation of the lexer depends on error.h, <iostream>, and the
functions determining the kinds of characters declared in <cctype>:
// lexer.c:

#include "lexer.h"
#include "error.h"
#include <iostream>
#include <cctype>

Lexer::Token_value Lexer::curr_tok;
double Lexer::number_value;
std::string Lexer::string_value;

Lexer::Token_value Lexer::get_token() { /* ... */ }
We could have factored out the #include statements for error.h as the Lexer’s _impl.h file.
However, I considered that excessive for this tiny program.
As usual, we #include the interface offered by the module (in this case, lexer.h) in the
module’s implementation to give the compiler a chance to check consistency.
The symbol table is essentially self-contained, although the standard library header <map>
could drag in all kinds of interesting stuff to implement an efficient map template class:
// table.h:

#include <map>
#include <string>

extern std::map<std::string,double> table;
Because we assume that every header may be #included in several .c files, we must separate the
declaration of table from its definition, even though the difference between table.c and table.h is
the single keyword extern:
// table.c:

#include "table.h"

std::map<std::string,double> table;
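The split between declaration and definition can be sketched in a single file as follows. The helper lookup_or_zero() is hypothetical and stands for any code that needs only the declaration from table.h.

```cpp
#include <map>
#include <string>

// table.h part: a declaration that may be seen by many translation units.
extern std::map<std::string,double> table;

// table.c part: the one definition.
std::map<std::string,double> table;

// A user of the table (hypothetical helper) needs only the declaration.
double lookup_or_zero(const std::string& name)
{
    std::map<std::string,double>::const_iterator i = table.find(name);
    return i == table.end() ? 0 : i->second;
}
```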
Basically, the driver depends on everything:
// main.c:

#include "parser.h"
#include "lexer.h"
#include "error.h"
#include "table.h"

namespace Driver {
    int no_of_errors;
    std::istream* input;
    void skip();
}

#include <sstream>

int main(int argc, char* argv[]) { /* ... */ }
Because the Driver namespace is used exclusively by main(), I placed it in main.c. Alternatively, I could have factored it out as driver.h and #included it.
For a larger system, it is usually worthwhile organizing things so that the driver has fewer direct
dependencies. Often, it is also worth minimizing what is done in main() by having main() call a
driver function placed in a separate source file. This is particularly important for code intended to
be used as a library. Then, we cannot rely on code in main() and must be prepared to be called
from a variety of functions (§9.6[8]).
9.3.2.2 Use of Headers [file.multi.use]
The number of headers to use for a program is a function of many factors. Many of these factors
have more to do with the way files are handled on your system than with C++. For example, if your
editor does not have facilities for looking at several files at the same time, then using many headers
becomes less attractive. Similarly, if opening and reading 20 files of 50 lines each is noticeably
more time-consuming than reading a single file of 1000 lines, you might think twice before using
the multiple-header style for a small project.
A word of caution: a dozen headers plus the standard headers for the program’s execution environment (which can often be counted in the hundreds) are usually manageable. However, if you
partition the declarations of a large program into the logically minimal-sized headers (putting each
structure declaration in its own file, etc.), you can easily get an unmanageable mess of hundreds of
files even for minor projects. I find that excessive.
For large projects, multiple headers are unavoidable. In such projects, hundreds of files (not
counting standard headers) are the norm. The real confusion starts when they start to be counted in
the thousands. At that scale, the basic techniques discussed here still apply, but their management
becomes a Herculean task. Remember that for realistically-sized programs, the single-header style
is not an option. Such programs will have multiple headers. The choice between the two styles of
organization occurs (repeatedly) for the parts that make up the program.
The single-header style and the multiple-header style are not really alternatives to each other.
They are complementary techniques that must be considered whenever a significant module is
designed and must be reconsidered as a system evolves. It’s crucial to remember that one interface
doesn’t serve all equally well. It is usually worthwhile to distinguish between the implementers’
interface and the users’ interface. In addition, many larger systems are structured so that providing
a simple interface for the majority of users and a more extensive interface for expert users is a good
idea. The expert users’ interfaces (‘‘complete interfaces’’) tend to #include many more features
than the average user would ever want to know about. In fact, the average users’ interface can
often be identified by eliminating features that require the inclusion of headers that define facilities
that would be unknown to the average user. The term ‘‘average user’’ is not derogatory. In the
fields in which I don’t have to be an expert, I strongly prefer to be an average user. In that way, I
minimize hassles.
9.3.3 Include Guards [file.guards]
The idea of the multiple-header approach is to represent each logical module as a consistent, self-contained unit. Viewed from the program as a whole, many of the declarations needed to make
each logical module complete are redundant. For larger programs, such redundancy can lead to
errors, as a header containing class definitions or inline functions gets #included twice in the same
compilation unit (§9.2.3).
We have two choices. We can
[1] reorganize our program to remove the redundancy, or
[2] find a way to allow repeated inclusion of headers.
The first approach – which led to the final version of the calculator – is tedious and impractical for
realistically-sized programs. We also need that redundancy to make the individual parts of the program comprehensible in isolation.
The benefits of an analysis of redundant #includes and the resulting simplifications of the program can be significant both from a logical point of view and by reducing compile times. However, it can rarely be complete, so some method of allowing redundant #includes must be applied.
Preferably, it must be applied systematically, since there is no way of knowing how thorough an
analysis a user will find worthwhile.
The traditional solution is to insert include guards in headers. For example:
// error.h:

#ifndef CALC_ERROR_H
#define CALC_ERROR_H

namespace Error {
    // ...
}

#endif    // CALC_ERROR_H
The contents of the file between the #ifndef and #endif are ignored by the compiler if
CALC_ERROR_H is defined. Thus, the first time error.h is seen during a compilation, its contents are read and CALC_ERROR_H is given a value. Should the compiler be presented with
error.h again during the compilation, the contents are ignored. This is a piece of macro hackery,
but it works and it is pervasive in the C and C++ worlds. The standard headers all have include
guards.
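The effect of a guard can be imitated in one file by writing the guarded text twice, which is what the compiler sees when error.h is #included twice. This is a sketch only: the single exception type shown and the helper throws_zero_divide() are assumptions added so that the example is complete.

```cpp
// First "inclusion" of error.h: CALC_ERROR_H is not yet defined, so the
// contents are processed and the guard macro is defined.
#ifndef CALC_ERROR_H
#define CALC_ERROR_H
namespace Error {
    struct Zero_divide { };
}
#endif // CALC_ERROR_H

// Second "inclusion": the guard is now defined, so the preprocessor skips
// the body and Zero_divide is not defined twice.
#ifndef CALC_ERROR_H
#define CALC_ERROR_H
namespace Error {
    struct Zero_divide { };
}
#endif // CALC_ERROR_H

// Hypothetical helper showing that the type is usable after both inclusions.
bool throws_zero_divide()
{
    try { throw Error::Zero_divide(); }
    catch (Error::Zero_divide) { return true; }
    return false;
}
```

Without the guards, the second copy of struct Zero_divide would be a double definition in a single translation unit, which violates the ODR (§9.2.3).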
Header files are included in essentially arbitrary contexts, and there is no namespace protection
against macro name clashes. Consequently, I choose rather long and ugly names as my include
guards.
Once people get used to headers and include guards, they tend to include lots of headers directly
and indirectly. Even with C++ implementations that optimize the processing of headers, this can be
undesirable. It can cause unnecessarily long compile time, and it can bring lots of declarations and
macros into scope. The latter might affect the meaning of the program in unpredictable and adverse
ways. Headers should be included only when necessary.
9.4 Programs [file.programs]
A program is a collection of separately compiled units combined by a linker. Every function,
object, type, etc., used in this collection must have a unique definition (§4.9, §9.2.3). The program
must contain exactly one function called main() (§3.2). The main computation performed by the
program starts with the invocation of main() and ends with a return from main(). The int
returned by main() is passed to whatever system invoked main() as the result of the program.
This simple story must be elaborated on for programs that contain global variables (§10.4.9) or
that throw an uncaught exception (§14.7).
9.4.1 Initialization of Nonlocal Variables [file.nonlocal]
In principle, a variable defined outside any function (that is, global, namespace, and class static
variables) is initialized before main() is invoked. Such nonlocal variables in a translation unit are
initialized in their declaration order (§10.4.9). If such a variable has no explicit initializer, it is by
default initialized to the default for its type (§10.4.2). The default initializer value for built-in types
and enumerations is 0. For example:
double x = 2;    // nonlocal variables
double y;
double sqx = sqrt(x+y);
Here, x and y are initialized before sqx, so sqrt(2) is called.
There is no guaranteed order of initialization of global variables in different translation units.
Consequently, it is unwise to create order dependencies between initializers of global variables in
different compilation units. In addition, it is not possible to catch an exception thrown by the initializer of a global variable (§14.7). It is generally best to minimize the use of global variables and
in particular to limit the use of global variables requiring complicated initialization.
Several techniques exist for enforcing an order of initialization of global variables in different
translation units. However, none are both portable and efficient. In particular, dynamically linked
libraries do not coexist happily with global variables that have complicated dependencies.
Often, a function returning a reference is a good alternative to a global variable. For example:
    int& use_count()
    {
        static int uc = 0;
        return uc;
    }
A call use_count() now acts as a global variable except that it is initialized at its first use (§5.5).
For example:
    void f()
    {
        cout << ++use_count();    // read and increment
        // ...
    }
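The same first-use technique works for objects of class type. The following is a minimal sketch, not a definitive implementation; Registry, registry(), register_name(), and the constructions counter are hypothetical names introduced only for illustration. The function-local static is constructed the first time registry() is called, and every later call returns a reference to that same object.

```cpp
#include <string>
#include <vector>

// Hypothetical example type: records names registered by various parts of a program.
struct Registry {
    std::vector<std::string> names;
    Registry() { ++constructions; }
    static int constructions;    // counts how many Registry objects were constructed
};

int Registry::constructions = 0;

// Accessor used instead of a global Registry variable.
// The local static is initialized the first time control passes through its definition.
Registry& registry()
{
    static Registry r;
    return r;
}

// Code in any translation unit can call this without depending on the
// initialization order of globals in other translation units.
void register_name(const std::string& n)
{
    registry().names.push_back(n);
}
```

Because the object is reached only through registry(), no initializer in another translation unit can observe it before it has been constructed.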
The initialization of nonlocal static variables is controlled by whatever mechanism an
implementation uses to start up a C++ program. This mechanism is guaranteed to work properly
only if main() is executed. Consequently, one should avoid nonlocal variables that require run-time initialization in C++ code intended for execution as a fragment of a non-C++ program.
Note that variables initialized by constant expressions (§C.5) cannot depend on the value of
objects from other translation units and do not require run-time initialization. Such variables are
therefore safe to use in all cases.
9.4.1.1 Program Termination [file.termination]
A program can terminate in several ways:
– By returning from main()
– By calling exit()
– By calling abort()
– By throwing an uncaught exception
In addition, there are a variety of ill-behaved and implementation-dependent ways of making a program crash.
If a program is terminated using the standard library function exit(), the destructors for constructed static objects are called (§10.4.9, §10.2.4). However, if the program is terminated using
the standard library function abort(), they are not. Note that this implies that exit() does not terminate a program immediately. Calling exit() in a destructor may cause an infinite recursion. The
type of exit() is
    void exit(int);
Like the return value of main() (§3.2), exit()’s argument is returned to ‘‘the system’’ as the value
of the program. Zero indicates successful completion.
Calling exit() means that the local variables of the calling function and its callers will not have
their destructors invoked. Throwing an exception and catching it ensures that local objects are
properly destroyed (§14.4.7). Also, a call of exit() terminates the program without giving the
caller of the function that called exit() a chance to deal with the problem. It is therefore often best
to leave a context by throwing an exception and letting a handler decide what to do next.
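The difference can be observed with a small sketch. Guard, trace, worker(), and run() are hypothetical names used only for illustration; Guard appends to trace in its destructor so that the order of destruction becomes visible. When worker() throws, its local Guard and the one in run() are destroyed before the handler runs, and the handler then chooses the result that main() would return.

```cpp
#include <stdexcept>
#include <string>

std::string trace;    // hypothetical log used to observe the order of events

struct Guard {        // hypothetical resource holder
    const char* name;
    explicit Guard(const char* n) : name(n) {}
    ~Guard() { trace += std::string("~") + name + ";"; }
};

void worker()
{
    Guard g("worker");
    throw std::runtime_error("cannot continue");    // leave the context by throwing
}

// Returns a value suitable for returning from main().
int run()
{
    try {
        Guard g("run");
        worker();
    }
    catch (const std::exception&) {
        trace += "handled;";    // both Guards have already been destroyed here
        return 1;               // the handler decides what to do next
    }
    return 0;
}
```

Had worker() called exit() instead, neither destructor would have been invoked and run() would have had no chance to respond.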
The C (and C++) standard library function atexit() offers the possibility to have code executed
at program termination. For example:
    void my_cleanup();

    void somewhere()
    {
        if (atexit(&my_cleanup)==0) {
            // my_cleanup will be called at normal termination
        }
        else {
            // oops: too many atexit functions
        }
    }
This strongly resembles the automatic invocation of destructors for global variables at program termination (§10.4.9, §10.2.4). Note that an argument to atexit() cannot take arguments or return a
result. Also, there is an implementation-defined limit to the number of atexit functions; atexit()
indicates when that limit is reached by returning a nonzero value. These limitations make atexit()
less useful than it appears at first glance.
The destructor of an object created before a call of atexit(f) will be invoked after f is invoked.
The destructor of an object created after a call of atexit(f) will be invoked before f is invoked.
The exit(), abort(), and atexit() functions are declared in <cstdlib>.
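As a small sketch of registering more than one function, the following uses the hypothetical names close_files(), flush_log(), and register_cleanup(). Functions registered with atexit() are called in the reverse order of their registration, so at normal termination flush_log() would run before close_files().

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical cleanup functions: an atexit function takes no arguments and returns nothing.
void close_files() { std::puts("close_files"); }
void flush_log()   { std::puts("flush_log"); }

// Registers both functions; returns false if the implementation's limit was reached.
bool register_cleanup()
{
    if (std::atexit(close_files) != 0) return false;
    if (std::atexit(flush_log) != 0) return false;    // registered last, so called first
    return true;
}
```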
9.5 Advice [file.advice]
[1] Use header files to represent interfaces and to emphasize logical structure; §9.1, §9.3.2.
[2] #include a header in the source file that implements its functions; §9.3.1.
[3] Don’t define global entities with the same name and similar-but-different meanings in different translation units; §9.2.
[4] Avoid non-inline function definitions in headers; §9.2.1.
[5] Use #include only at global scope and in namespaces; §9.2.1.
[6] #include only complete declarations; §9.2.1.
[7] Use include guards; §9.3.3.
[8] #include C headers in namespaces to avoid global names; §9.3.2.
[9] Make headers self-contained; §9.2.3.
[10] Distinguish between users’ interfaces and implementers’ interfaces; §9.3.2.
[11] Distinguish between average users’ interfaces and expert users’ interfaces; §9.3.2.
[12] Avoid nonlocal objects that require run-time initialization in code intended for use as part of
non-C++ programs; §9.4.1.
9.6 Exercises [file.exercises]
1. (∗2) Find where the standard library headers are kept on your system. List their names. Are
any nonstandard headers kept together with the standard ones? Can any nonstandard headers be
#included using the <> notation?
2. (∗2) Where are the headers for nonstandard library ‘‘foundation’’ libraries kept?
3. (∗2.5) Write a program that reads a source file and writes out the names of files #included.
Indent file names to show files #included by included files. Try this program on some real
source files (to get an idea of the amount of information included).
4. (∗3) Modify the program from the previous exercise to print the number of comment lines, the
number of non-comment lines, and the number of non-comment, whitespace-separated words
for each file #included.
5. (∗2.5) An external include guard is a construct that tests outside the file it is guarding and
includes only once per compilation. Define such a construct, devise a way of testing it, and discuss its advantages and disadvantages compared to the include guards described in §9.3.3. Is
there any significant run-time advantage to external include guards on your system?
6. (∗3) How is dynamic linking achieved on your system? What restrictions are placed on dynamically linked code? What requirements are placed on code for it to be dynamically linked?
7. (∗3) Open and read 100 files containing 1500 characters each. Open and read one file containing 150,000 characters. Hint: See example in §21.5.1. Is there a performance difference?
What is the highest number of files that can be simultaneously open on your system? Consider
these questions in relation to the use of #iinncclluuddee files.
8. (∗2) Modify the desk calculator so that it can be invoked from main() or from other functions
as a simple function call.
9. (∗2) Draw the ‘‘module dependency diagrams’’ (§9.3.2) for the version of the calculator that
used error() instead of exceptions (§8.2.2).
Part II
Abstraction Mechanisms
This part describes C++’s facilities for defining and using new types. Techniques commonly called object-oriented programming and generic programming are presented.
Chapters
10 Classes
11 Operator Overloading
12 Derived Classes
13 Templates
14 Exception Handling
15 Class Hierarchies
‘‘... there is nothing more difficult to carry out, nor more doubtful of success, nor more
dangerous to handle, than to initiate a new order of things. For the reformer makes
enemies of all those who profit by the old order, and only lukewarm defenders in all
those who would profit by the new order...’’
— Niccolò Machiavelli (‘‘The Prince’’ §vi)
10
Classes
Those types are not "abstract";
they are as real as int and float.
– Doug McIlroy
Concepts and classes — class members — access control — constructors — static
members — default copy — const member functions — this — structs — in-class function definition — concrete classes — member functions and helper functions — overloaded operators — use of concrete classes — destructors — default construction —
local variables — user-defined copy — new and delete — member objects — arrays —
static storage — temporary variables — unions — advice — exercises.
10.1 Introduction [class.intro]
The aim of the C++ class concept is to provide the programmer with a tool for creating new types
that can be used as conveniently as the built-in types. In addition, derived classes (Chapter 12) and
templates (Chapter 13) provide ways of organizing related classes that allow the programmer to
take advantage of their relationships.
A type is a concrete representation of a concept. For example, the C++ built-in type float with
its operations +, -, *, etc., provides a concrete approximation of the mathematical concept of a real
number. A class is a user-defined type. We design a new type to provide a definition of a concept
that has no direct counterpart among the built-in types. For example, we might provide a type
Trunk_line in a program dealing with telephony, a type Explosion for a videogame, or a type
list<Paragraph> for a text-processing program. A program that provides types that closely match
the concepts of the application tends to be easier to understand and easier to modify than a program
that does not. A well-chosen set of user-defined types makes a program more concise. In addition,
it makes many sorts of code analysis feasible. In particular, it enables the compiler to detect illegal
uses of objects that would otherwise remain undetected until the program is thoroughly tested.
The fundamental idea in defining a new type is to separate the incidental details of the implementation (e.g., the layout of the data used to store an object of the type) from the properties essential to the correct use of it (e.g., the complete list of functions that can access the data). Such a separation is best expressed by channeling all uses of the data structure and internal housekeeping routines through a specific interface.
This chapter focuses on relatively simple ‘‘concrete’’ user-defined types that logically don’t differ much from built-in types. Ideally, such types should not differ from built-in types in the way
they are used, only in the way they are created.
10.2 Classes [class.class]
A class is a user-defined type. This section introduces the basic facilities for defining a class, creating objects of a class, and manipulating such objects.
10.2.1 Member Functions [class.member]
Consider implementing the concept of a date using a struct to define the representation of a Date
and a set of functions for manipulating variables of this type:
    struct Date {          // representation
        int d, m, y;
    };

    void init_date(Date& d, int, int, int);    // initialize d
    void add_year(Date& d, int n);             // add n years to d
    void add_month(Date& d, int n);            // add n months to d
    void add_day(Date& d, int n);              // add n days to d
There is no explicit connection between the data type and these functions. Such a connection can
be established by declaring the functions as members:
    struct Date {
        int d, m, y;

        void init(int dd, int mm, int yy);    // initialize
        void add_year(int n);                 // add n years
        void add_month(int n);                // add n months
        void add_day(int n);                  // add n days
    };
Functions declared within a class definition (a struct is a kind of class; §10.2.8) are called member
functions and can be invoked only for a specific variable of the appropriate type using the standard
syntax for structure member access. For example:
    Date my_birthday;

    void f()
    {
        Date today;

        today.init(16,10,1996);
        my_birthday.init(30,12,1950);

        Date tomorrow = today;
        tomorrow.add_day(1);
        // ...
    }
Because different structures can have member functions with the same name, we must specify the
structure name when defining a member function:
    void Date::init(int dd, int mm, int yy)
    {
        d = dd;
        m = mm;
        y = yy;
    }
In a member function, member names can be used without explicit reference to an object. In that
case, the name refers to that member of the object for which the function was invoked. For example, when Date::init() is invoked for today, m=mm assigns to today.m. On the other hand,
when Date::init() is invoked for my_birthday, m=mm assigns to my_birthday.m. A class
member function always ‘‘knows’’ for which object it was invoked.
The construct
    class X { ... };
is called a class definition because it defines a new type. For historical reasons, a class definition is
often referred to as a class declaration. Also, like declarations that are not definitions, a class definition can be replicated in different source files using #include without violating the one-definition
rule (§9.2.3).
10.2.2 Access Control [class.access]
The declaration of Date in the previous subsection provides a set of functions for manipulating a
Date. However, it does not specify that those functions should be the only ones to depend directly
on Date’s representation and the only ones to directly access objects of class Date. This restriction
can be expressed by using a class instead of a struct:
    class Date {
        int d, m, y;
    public:
        void init(int dd, int mm, int yy);    // initialize
        void add_year(int n);                 // add n years
        void add_month(int n);                // add n months
        void add_day(int n);                  // add n days
    };
The public label separates the class body into two parts. The names in the first, private, part can be
used only by member functions. The second, public, part constitutes the public interface to objects
of the class. A struct is simply a class whose members are public by default (§10.2.8); member
functions can be defined and used exactly as before. For example:
    inline void Date::add_year(int n)
    {
        y += n;
    }
However, nonmember functions are barred from using private members. For example:
    void timewarp(Date& d)
    {
        d.y -= 200;    // error: Date::y is private
    }
There are several benefits to be obtained from restricting access to a data structure to an explicitly
declared list of functions. For example, any error causing a Date to take on an illegal value (for
example, December 36, 1985) must be caused by code in a member function. This implies that the
first stage of debugging – localization – is completed before the program is even run. This is a
special case of the general observation that any change to the behavior of the type Date can and
must be effected by changes to its members. In particular, if we change the representation of a
class, we need only change the member functions to take advantage of the new representation.
User code directly depends only on the public interface and need not be rewritten (although it may
need to be recompiled). Another advantage is that a potential user need examine only the definition
of the member functions in order to learn to use a class.
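As a sketch of the localization argument, suppose init() is extended to reject impossible values. The bool result, the accessor functions, and the deliberately simplified check (months 1 to 12, days 1 to 31, with no regard for month lengths or leap years) are assumptions made for illustration; they are not part of the Date class as presented here. Because d, m, and y are private, init() is the only place where an illegal value could enter a Date.

```cpp
class Date {
    int d, m, y;                          // private: only member functions can touch these
public:
    bool init(int dd, int mm, int yy);    // returns false and leaves the object unchanged if invalid
    int day() const { return d; }
    int month() const { return m; }
    int year() const { return y; }
};

bool Date::init(int dd, int mm, int yy)
{
    // Simplified check (assumption): ignores month lengths and leap years.
    if (mm < 1 || 12 < mm || dd < 1 || 31 < dd) return false;
    d = dd;
    m = mm;
    y = yy;
    return true;
}
```

With this definition, a call such as init(36,12,1985) reports failure instead of silently storing December 36.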
The protection of private data relies on restriction of the use of the class member names. It can
therefore be circumvented by address manipulation and explicit type conversion. But this, of
course, is cheating. C++ protects against accident rather than deliberate circumvention (fraud).
Only hardware can protect against malicious use of a general-purpose language, and even that is
hard to do in realistic systems.
The init() function was added partly because it is generally useful to have a function that
sets the value of an object and partly because making the data private forces us to provide it.
10.2.3 Constructors [class.ctor]
The use of functions such as init() to provide initialization for class objects is inelegant and error-prone. Because it is nowhere stated that an object must be initialized, a programmer can forget to
do so – or do so twice (often with equally disastrous results). A better approach is to allow the programmer to declare a function with the explicit purpose of initializing objects. Because such a
function constructs values of a given type, it is called a constructor. A constructor is recognized by
having the same name as the class itself. For example:
    class Date {
        // ...
        Date(int, int, int);    // constructor
    };
When a class has a constructor, all objects of that class will be initialized. If the constructor
requires arguments, these arguments must be supplied:
    Date today = Date(23,6,1983);
    Date xmas(25,12,1990);      // abbreviated form
    Date my_birthday;           // error: initializer missing
    Date release1_0(10,12);     // error: 3rd argument missing
It is often nice to provide several ways of initializing a class object. This can be done by providing
several constructors. For example:
    class Date {
        int d, m, y;
    public:
        // ...
        Date(int, int, int);    // day, month, year
        Date(int, int);         // day, month, today’s year
        Date(int);              // day, today’s month and year
        Date();                 // default Date: today
        Date(const char*);      // date in string representation
    };
Constructors obey the same overloading rules as do other functions (§7.4). As long as the constructors differ sufficiently in their argument types, the compiler can select the correct one for each use:
    Date today(4);
    Date july4("July 4, 1983");
    Date guy("5 Nov");
    Date now;                   // default initialized as today
The proliferation of constructors in the Date example is typical. When designing a class, a programmer is always tempted to add features just because somebody might want them. It takes more
thought to carefully decide what features are really needed and to include only those. However,
that extra thought typically leads to smaller and more comprehensible programs. One way of
reducing the number of related functions is to use default arguments (§7.5). In the Date, each argument can be given a default value interpreted as ‘‘pick the default: today.’’
    class Date {
        int d, m, y;
    public:
        Date(int dd =0, int mm =0, int yy =0);
        // ...
    };

    Date::Date(int dd, int mm, int yy)
    {
        d = dd ? dd : today.d;
        m = mm ? mm : today.m;
        y = yy ? yy : today.y;

        // check that the Date is valid
    }
When an argument value is used to indicate ‘‘pick the default,’’ the value chosen must be outside
the set of possible values for the argument. For day and month, this is clearly so, but for year, zero
may not be an obvious choice. Fortunately, there is no year zero on the European calendar; 1AD
(year==1) comes immediately after 1BC (year==-1).
10.2.4 Static Members [class.static]
The convenience of a default value for Dates was bought at the cost of a significant hidden problem. Our Date class became dependent on the global variable today. This Date class can be used
only in a context in which today is defined and correctly used by every piece of code. This is the
kind of constraint that causes a class to be useless outside the context in which it was first written.
Users get too many unpleasant surprises trying to use such context-dependent classes, and maintenance becomes messy. Maybe ‘‘just one little global variable’’ isn’t too unmanageable, but that
style leads to code that is useless except to its original programmer. It should be avoided.
Fortunately, we can get the convenience without the encumbrance of a publicly accessible global variable. A variable that is part of a class, yet is not part of an object of that class, is called a
static member. There is exactly one copy of a static member instead of one copy per object, as for
ordinary non-static members. Similarly, a function that needs access to members of a class, yet
doesn’t need to be invoked for a particular object, is called a static member function.
Here is a redesign that preserves the semantics of default constructor values for Date without
the problems stemming from reliance on a global:
    class Date {
        int d, m, y;
        static Date default_date;
    public:
        Date(int dd =0, int mm =0, int yy =0);
        // ...
        static void set_default(int, int, int);
    };
We can now define the Date constructor like this:
    Date::Date(int dd, int mm, int yy)
    {
        d = dd ? dd : default_date.d;
        m = mm ? mm : default_date.m;
        y = yy ? yy : default_date.y;

        // check that the Date is valid
    }
We can change the default date when appropriate. A static member can be referred to like any
other member. In addition, a static member can be referred to without mentioning an object.
Instead, its name is qualified by the name of its class. For example:
    void f()
    {
        Date::set_default(4,5,1945);
    }
Static members – both function and data members – must be defined somewhere. For example:
    Date Date::default_date(16,12,1770);

    void Date::set_default(int d, int m, int y)
    {
        Date::default_date = Date(d,m,y);
    }
Now the default value is Beethoven’s birth date – until someone decides otherwise.
Note that Date() serves as a notation for the value of Date::default_date. For example:
    Date copy_of_default_date = Date();
Consequently, we don’t need a separate function for reading the default date.
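Collected into one compilable unit, the pieces of this section might look as follows. This is a sketch assembled from the fragments above; the accessors day(), month(), and year() are additions made here (anticipating §10.2.6) so that the effect of the default can be observed, and the validity check is omitted.

```cpp
class Date {
    int d, m, y;
    static Date default_date;                 // one copy shared by all Dates
public:
    Date(int dd =0, int mm =0, int yy =0);    // 0 means ‘‘pick the default’’
    static void set_default(int dd, int mm, int yy);

    // Accessors added for this sketch only.
    int day() const { return d; }
    int month() const { return m; }
    int year() const { return y; }
};

Date Date::default_date(16,12,1770);          // definition of the static member

Date::Date(int dd, int mm, int yy)
{
    d = dd ? dd : default_date.d;
    m = mm ? mm : default_date.m;
    y = yy ? yy : default_date.y;
}

void Date::set_default(int dd, int mm, int yy)
{
    default_date = Date(dd,mm,yy);
}
```

After Date::set_default(4,5,1945), a Date constructed without arguments has the value 4/5/1945, and Date(1) means the first day of that month and year; Dates constructed earlier keep their values.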
10.2.5 Copying Class Objects [class.default.copy]
By default, class objects can be copied. In particular, a class object can be initialized with a copy
of another object of the same class. This can be done even where constructors have been declared.
For example:
    Date d = today;    // initialization by copy
By default, the copy of a class object is a copy of each member. If that default is not the behavior
wanted for a class X, a more appropriate behavior can be provided by defining a copy constructor,
X::X(const X&). This is discussed further in §10.4.4.1.
Similarly, class objects can by default be copied by assignment. For example:
    void f(Date& d)
    {
        d = today;
    }
Again, the default semantics is memberwise copy. If that is not the right choice for a class X, the
user can define an appropriate assignment operator (§10.4.4.1).
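A short sketch may make the memberwise semantics concrete. It uses the plain struct version of Date so that the members can be inspected directly; years_later() is a hypothetical function introduced for illustration. Passing a Date by value initializes the parameter by copy, so the caller's object is not affected by changes made to the parameter.

```cpp
struct Date {
    int d, m, y;
};

// Returns a copy of its argument advanced by n years.
// (February 29 is ignored in this sketch.)
Date years_later(Date date, int n)    // date is initialized by copy of the argument
{
    date.y += n;                      // modifies only the local copy
    return date;
}
```

Assignment behaves the same way: after other = today, the two objects hold equal member values but remain independent, so a later change to today does not affect other.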
10.2.6 Constant Member Functions [class.constmem]
The Date defined so far provides member functions for giving a Date a value and changing it.
Unfortunately, we didn’t provide a way of examining the value of a Date. This problem can easily
be remedied by adding functions for reading the day, month, and year:
    class Date {
        int d, m, y;
    public:
        int day() const { return d; }
        int month() const { return m; }
        int year() const;
        // ...
    };
Note the const after the (empty) argument list in the function declarations. It indicates that these
functions do not modify the state of a Date.
Naturally, the compiler will catch accidental attempts to violate this promise. For example:
    inline int Date::year() const
    {
        return y++;    // error: attempt to change member value in const function
    }
When a const member function is defined outside its class, the const suffix is required:
    inline int Date::year() const    // correct
    {
        return y;
    }

    inline int Date::year()          // error: const missing in member function type
    {
        return y;
    }
In other words, the const is part of the type of Date::day() and Date::year().
A const member function can be invoked for both const and non-const objects, whereas a non-const member function can be invoked only for non-const objects. For example:
    void f(Date& d, const Date& cd)
    {
        int i = d.year();     // ok
        d.add_year(1);        // ok

        int j = cd.year();    // ok
        cd.add_year(1);       // error: cannot change value of const cd
    }
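The practical consequence is that a function taking a const Date& can rely on the const members only. The following sketch assumes a three-argument constructor and a simplified add_year() that ignores February 29; years_between() is a hypothetical helper introduced for illustration.

```cpp
class Date {
    int d, m, y;
public:
    Date(int dd, int mm, int yy) { d = dd; m = mm; y = yy; }
    int day() const { return d; }
    int month() const { return m; }
    int year() const;
    void add_year(int n) { y += n; }    // non-const: modifies the object (simplified)
};

inline int Date::year() const           // const repeated in the out-of-class definition
{
    return y;
}

// Only const member functions can be called through the const references.
int years_between(const Date& from, const Date& to)
{
    return to.year() - from.year();
    // from.add_year(1) would not compile here: add_year() is not a const member
}
```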
10.2.7 Self-Reference [class.this]
The state update functions add_year(), add_month(), and add_day() were defined not to return
values. For such a set of related update functions, it is often useful to return a reference to the
updated object so that the operations can be chained. For example, we would like to write
    void f(Date& d)
    {
        // ...
        d.add_day(1).add_month(1).add_year(1);
        // ...
    }
to add a day, a month, and a year to d. To do this, each function must be declared to return a reference to a Date:
    class Date {
        // ...

        Date& add_year(int n);     // add n years
        Date& add_month(int n);    // add n months
        Date& add_day(int n);      // add n days
    };
Each (nonstatic) member function knows what object it was invoked for and can explicitly refer to
it. For example:
    Date& Date::add_year(int n)
    {
        if (d==29 && m==2 && !leapyear(y+n)) {    // beware of February 29
            d = 1;
            m = 3;
        }
        y += n;
        return *this;
    }
The expression *this refers to the object for which a member function is invoked. It is equivalent
to Simula’s THIS and Smalltalk’s self.
In a nonstatic member function, the keyword this is a pointer to the object for which the function was invoked. In a non-const member function of class X, the type of this is X *const. The
const makes it clear that the user is not supposed to change the value of this. In a const member
function of class X, the type of this is const X *const to prevent modification of the object itself
(see also §5.4.1).
Most uses of this are implicit. In particular, every reference to a nonstatic member from within
a class relies on an implicit use of this to get the member of the appropriate object. For example,
the add_year function could equivalently, but tediously, have been defined like this:
    Date& Date::add_year(int n)
    {
        if (this->d==29 && this->m==2 && !leapyear(this->y+n)) {
            this->d = 1;
            this->m = 3;
        }
        this->y += n;
        return *this;
    }
One common explicit use of this is in linked-list manipulation (e.g., §24.3.7.4).
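To see chaining at work, the three update functions can be completed in a simplified form. In this sketch, add_year() follows the definition given above, whereas leapyear(), days_in_month(), and the bodies of add_month() and add_day() are assumptions made for illustration: they handle only non-negative n, and add_month() clamps the day to the end of a shorter month.

```cpp
class Date {
    int d, m, y;
public:
    Date(int dd, int mm, int yy) { d = dd; m = mm; y = yy; }
    int day() const { return d; }
    int month() const { return m; }
    int year() const { return y; }

    Date& add_year(int n);     // add n years
    Date& add_month(int n);    // add n months
    Date& add_day(int n);      // add n days
};

bool leapyear(int y)           // Gregorian rule
{
    return (y%4==0 && y%100!=0) || y%400==0;
}

int days_in_month(int m, int y)    // hypothetical helper
{
    static const int days[] = { 31,28,31,30,31,30,31,31,30,31,30,31 };
    return (m==2 && leapyear(y)) ? 29 : days[m-1];
}

Date& Date::add_year(int n)
{
    if (d==29 && m==2 && !leapyear(y+n)) {    // beware of February 29
        d = 1;
        m = 3;
    }
    y += n;
    return *this;
}

Date& Date::add_month(int n)    // sketch: assumes n >= 0
{
    int months = (m-1) + n;     // zero-based month count from the start of year y
    y += months/12;
    m = months%12 + 1;
    if (d > days_in_month(m,y)) d = days_in_month(m,y);    // assumption: clamp to end of month
    return *this;
}

Date& Date::add_day(int n)      // sketch: assumes n >= 0
{
    d += n;
    while (d > days_in_month(m,y)) {    // carry whole months until d fits
        d -= days_in_month(m,y);
        if (++m > 12) { m = 1; ++y; }
    }
    return *this;
}
```

Each call returns a reference to the object it was invoked for, so d.add_day(1).add_month(1).add_year(1) applies the three updates to d in left-to-right order.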
10.2.7.1 Physical and Logical Constness [class.const]
Occasionally, a member function is logically const, but it still needs to change the value of a member. To a user, the function appears not to change the state of its object. However, some detail that
the user cannot directly observe is updated. This is often called logical constness. For example,
the Date class might have a function returning a string representation that a user could use for output. Constructing this representation could be a relatively expensive operation. Therefore, it would
make sense to keep a copy so that repeated requests would simply return the copy, unless the
Date’s value had been changed. Caching values like that is more common for more complicated
data structures, but let’s see how it can be achieved for a Date:
    class Date {
        bool cache_valid;
        string cache;
        void compute_cache_value();    // fill cache
        // ...
    public:
        // ...
        string string_rep() const;     // string representation
    };
From a user’s point of view, string_rep doesn’t change the state of its Date, so it clearly should be
a const member function. On the other hand, the cache needs to be filled before it can be used.
This can be achieved through brute force:
    string Date::string_rep() const
    {
        if (cache_valid == false) {
            Date* th = const_cast<Date*>(this);    // cast away const
            th->compute_cache_value();
            th->cache_valid = true;
        }
        return cache;
    }
That is, the ccoonnsstt__ccaasstt operator (§15.4.2.1) is used to obtain a pointer of type D
Daattee* to tthhiiss. This
is hardly elegant, and it is not guaranteed to work when applied to an object that was originally
declared as a ccoonnsstt. For example:
D
Daattee dd11;
ccoonnsstt D
Daattee dd22;
ssttrriinngg ss11 = dd11.ssttrriinngg__rreepp();
ssttrriinngg ss22 = dd22.ssttrriinngg__rreepp();
// undefined behavior
In the case of dd11, ssttrriinngg__rreepp() simply casts back to dd11’s original type so that the call will work.
However, dd22 was defined as a ccoonnsstt and the implementation could have applied some form of
memory protection to ensure that its value wasn’t corrupted. Consequently, dd22.ssttrriinngg__rreepp() is
not guaranteed to give a single predictable result on all implementations.
10.2.7.2 Mutable [class.mutable]
The explicit type conversion ‘‘casting away const’’ and its consequent implementation-dependent behavior can be avoided by declaring the data involved in the cache management to be mutable:
    class Date {
        mutable bool cache_valid;
        mutable string cache;
        void compute_cache_value() const;    // fill (mutable) cache
        // ...
    public:
        // ...
        string string_rep() const;           // string representation
    };
The storage specifier mutable specifies that a member should be stored in a way that allows updating – even when it is a member of a const object. In other words, mutable means ‘‘can never be const.’’ This can be used to simplify the definition of string_rep():
    string Date::string_rep() const
    {
        if (!cache_valid) {
            compute_cache_value();
            cache_valid = true;
        }
        return cache;
    }
and makes reasonable uses of string_rep() valid. For example:
    Date d3;
    const Date d4;
    string s3 = d3.string_rep();
    string s4 = d4.string_rep();    // ok!
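The mutable cache can be tried out as a complete fragment. What follows is a minimal sketch, not the definitive Date: the three-argument constructor, the day/month/year output format, the fills counter, and the function rep_of_const_date() are assumptions added only to make the caching observable, and add_year() is simplified to ignore February 29.

```cpp
#include <cassert>
#include <sstream>
#include <string>

class Date {
    int d, m, y;
    mutable bool cache_valid;     // may change even in a const Date
    mutable std::string cache;    // cached string representation
    mutable int fills;            // hypothetical counter: times the cache was filled

    void compute_cache_value() const    // fill (mutable) cache
    {
        std::ostringstream os;
        os << d << '/' << m << '/' << y;    // assumed output format
        cache = os.str();
        ++fills;
    }
public:
    Date(int dd, int mm, int yy)
        : d(dd), m(mm), y(yy), cache_valid(false), fills(0) { }

    Date& add_year(int n)    // simplified: ignores February 29
    {
        y += n;
        cache_valid = false;    // a change to the value invalidates the cache
        return *this;
    }

    std::string string_rep() const
    {
        if (!cache_valid) {
            compute_cache_value();
            cache_valid = true;
        }
        return cache;
    }

    int cache_fills() const { return fills; }
};

// Usage: repeated requests on a const Date fill the cache only once.
inline std::string rep_of_const_date()
{
    const Date d4(16, 12, 1997);
    std::string s = d4.string_rep();
    s = d4.string_rep();
    assert(d4.cache_fills() == 1);
    return s;
}
```

Note that every operation that changes the value, here add_year(), has to reset cache_valid; forgetting to do so is the usual way for such a cache to go stale.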
Declaring members mutable is most appropriate when (only) part of a representation is allowed to change. If most of an object changes while the object remains logically const, it is often better to place the changing data in a separate object and access it indirectly. If that technique is used, the string-with-cache example becomes:
    struct cache {
        bool valid;
        string rep;
    };

    class Date {
        cache* c;                            // initialize in constructor (§10.4.6)
        void compute_cache_value() const;    // fill what cache refers to
        // ...
    public:
        // ...
        string string_rep() const;           // string representation
    };

    string Date::string_rep() const
    {
        if (!c->valid) {
            compute_cache_value();
            c->valid = true;
        }
        return c->rep;
    }
The programming techniques that support a cache generalize to various forms of lazy evaluation.
10.2.8 Structures and Classes [class.struct]
By definition, a struct is a class in which members are by default public; that is,
    struct s { ...
is simply shorthand for
    class s { public: ...
The access specifier private: can be used to say that the members following are private, just as public: says that the members following are public. Except for the different names, the following declarations are equivalent:
    class Date1 {
        int d, m, y;
    public:
        Date1(int dd, int mm, int yy);
        void add_year(int n);    // add n years
    };

    struct Date2 {
    private:
        int d, m, y;
    public:
        Date2(int dd, int mm, int yy);
        void add_year(int n);    // add n years
    };
Which style you use depends on circumstances and taste. I usually prefer to use struct for classes that have all data public. I think of such classes as ‘‘not quite proper types, just data structures.’’ Constructors and access functions can be quite useful even for such structures, but as a shorthand rather than guarantors of properties of the type (invariants, see §24.3.7.1).
It is not a requirement to declare data first in a class. In fact, it often makes sense to place data members last to emphasize the functions providing the public user interface. For example:
    class Date3 {
    public:
        Date3(int dd, int mm, int yy);
        void add_year(int n);    // add n years
    private:
        int d, m, y;
    };
In real code, where both the public interface and the implementation details typically are more extensive than in tutorial examples, I usually prefer the style used for Date3.
Access specifiers can be used many times in a single class declaration. For example:
    class Date4 {
    public:
        Date4(int dd, int mm, int yy);
    private:
        int d, m, y;
    public:
        void add_year(int n);    // add n years
    };
Having more than one public section, as in Date4, tends to be messy. So does having more than one private section. However, allowing many access specifiers in a class is useful for machine-generated code.
10.2.9 In-Class Function Definitions [class.inline]
A member function defined within the class definition – rather than simply declared there – is
taken to be an inline member function. That is, in-class definition of member functions is for small,
frequently-used functions. Like the class definition it is part of, a member function defined in-class
can be replicated in several translation units using #include. Like the class itself, its meaning must
be the same wherever it is used (§9.2.3).
The style of placing the definition of data members last in a class can lead to a minor problem
with public inline functions that refer to the representation. Consider:
    class Date {    // potentially confusing
    public:
        int day() const { return d; }    // return Date::d
        // ...
    private:
        int d, m, y;
    };
This is perfectly good C++ code because a member function declared within a class can refer to
every member of the class as if the class were completely defined before the member function bodies were considered. However, this can confuse human readers.
Consequently, I usually either place the data first or define the inline member functions after the
class itself. For example:
    class Date {
    public:
        int day() const;
        // ...
    private:
        int d, m, y;
    };

    inline int Date::day() const { return d; }
10.3 Efficient User-Defined Types [class.concrete]
The previous section discussed bits and pieces of the design of a Date class in the context of introducing the basic language features for defining classes. Here, I reverse the emphasis and discuss the design of a simple and efficient Date class and show how the language features support this design.
Small, heavily-used abstractions are common in many applications. Examples are Latin characters, Chinese characters, integers, floating-point numbers, complex numbers, points, pointers, coordinates, transforms, (pointer,offset) pairs, dates, times, ranges, links, associations, nodes, (value,unit) pairs, disk locations, source code locations, BCD characters, currencies, lines, rectangles, scaled fixed-point numbers, numbers with fractions, character strings, vectors, and arrays. Every application uses several of these. Often, a few of these simple concrete types are used heavily. A typical application uses a few directly and many more indirectly from libraries.
C++ and other programming languages directly support a few of these abstractions. However,
most are not, and cannot be, supported directly because there are too many of them. Furthermore,
the designer of a general-purpose programming language cannot foresee the detailed needs of every
application. Consequently, mechanisms must be provided for the user to define small concrete
types. Such types are called concrete types or concrete classes to distinguish them from abstract
classes (§12.3) and classes in class hierarchies (§12.2.4, §12.4).
It was an explicit aim of C++ to support the definition and efficient use of such user-defined
data types very well. They are a foundation of elegant programming. As usual, the simple and
mundane is statistically far more significant than the complicated and sophisticated.
In this light, let us build a better date class:
    class Date {
    public:    // public interface:
        enum Month { jan=1, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec };

        class Bad_date { };    // exception class

        Date(int dd =0, Month mm =Month(0), int yy =0);    // 0 means ‘‘pick a default’’

    // functions for examining the Date:
        int day() const;
        Month month() const;
        int year() const;

        string string_rep() const;         // string representation
        void char_rep(char s[]) const;     // C-style string representation

        static void set_default(int, Month, int);

    // functions for changing the Date:
        Date& add_year(int n);     // add n years
        Date& add_month(int n);    // add n months
        Date& add_day(int n);      // add n days
    private:
        int d, m, y;               // representation
        static Date default_date;
    };
This set of operations is fairly typical for a user-defined type:
[1] A constructor specifying how objects/variables of the type are to be initialized.
[2] A set of functions allowing a user to examine a Date. These functions are marked const to indicate that they don’t modify the state of the object/variable for which they are called.
[3] A set of functions allowing the user to manipulate Dates without actually having to know the details of the representation or fiddle with the intricacies of the semantics.
[4] A set of implicitly defined operations to allow Dates to be freely copied.
[5] A class, Bad_date, to be used for reporting errors as exceptions.
I defined a Month type to cope with the problem of remembering, for example, whether the 7th of June is written Date(6,7) (American style) or Date(7,6) (European style). I also added a mechanism for dealing with default arguments.
I considered introducing separate types Day and Year to cope with possible confusion of Date(1995,jul,27) and Date(27,jul,1995). However, these types would not be as useful as the Month type. Almost all such errors are caught at run-time anyway – the 26th of July year 27 is not a common date in my work. How to deal with historical dates before year 1800 or so is a tricky issue best left to expert historians. Furthermore, the day of the month can’t be properly checked in isolation from its month and year. See §11.7.1 for a way of defining a convenient Year type.
The default date must be defined as a valid Date somewhere. For example:
    Date Date::default_date(22,jan,1901);
I omitted the cache technique from §10.2.7.1 as unnecessary for a type this simple. If needed, it can be added as an implementation detail without affecting the user interface.
Here is a small – and contrived – example of how Dates can be used:
    void f(Date& d)
    {
        Date lvb_day = Date(16,Date::dec,d.year());

        if (d.day()==29 && d.month()==Date::feb) {
            // ...
        }

        if (midnight()) d.add_day(1);

        cout << "day after:" << d+1 << '\n';
    }
This assumes that the output operator << and the addition operator + have been declared for Dates. I do that in §10.3.3.
Note the Date::feb notation. The function f() is not a member of Date, so it must specify that it is referring to Date’s feb and not to some other entity.
Why is it worthwhile to define a specific type for something as simple as a date? After all, we
could define a structure:
    struct Date {
        int day, month, year;
    };
and let programmers decide what to do with it. If we did that, though, every user would either have to manipulate the components of Dates directly or provide separate functions for doing so. In effect, the notion of a date would be scattered throughout the system, which would make it hard to understand, document, or change. Inevitably, providing a concept as only a simple structure causes extra work for every user of the structure.
Also, even though the Date type seems simple, it takes some thought to get right. For example, incrementing a Date must deal with leap years, with the fact that months are of different lengths, and so on (note: §10.6[1]). Also, the day-month-and-year representation is rather poor for many applications. If we decided to change it, we would need to modify only a designated set of functions. For example, to represent a Date as the number of days before or after January 1, 1970, we would need to change only Date’s member functions (§10.6[2]).
10.3.1 Member Functions [class.memfct]
Naturally, an implementation for each member function must be provided somewhere. For example, here is the definition of Date’s constructor:
    Date::Date(int dd, Month mm, int yy)
    {
        if (yy == 0) yy = default_date.year();
        if (mm == 0) mm = default_date.month();
        if (dd == 0) dd = default_date.day();

        int max;

        switch (mm) {
        case feb:
            max = 28+leapyear(yy);
            break;
        case apr: case jun: case sep: case nov:
            max = 30;
            break;
        case jan: case mar: case may: case jul: case aug: case oct: case dec:
            max = 31;
            break;
        default:
            throw Bad_date();    // someone cheated
        }

        if (dd<1 || max<dd) throw Bad_date();

        y = yy;
        m = mm;
        d = dd;
    }
The constructor checks that the data supplied denotes a valid Date. If not, say for Date(30,Date::feb,1994), it throws an exception (§8.3, Chapter 14), which indicates that something went wrong in a way that cannot be ignored. If the data supplied is acceptable, the obvious initialization is done. Initialization is a relatively complicated operation because it involves data validation. This is fairly typical. On the other hand, once a Date has been created, it can be
used and copied without further checking. In other words, the constructor establishes the invariant
for the class (in this case, that it denotes a valid date). Other member functions can rely on that
invariant and must maintain it. This design technique can simplify code immensely (see §24.3.7.1).
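The way a validating constructor establishes the invariant can be sketched as a self-contained fragment. This is a simplified version under stated assumptions: the default_date mechanism is left out, the Gregorian definition of leapyear() is supplied here because the text only declares that function, and constructs() is a hypothetical helper used only to show the effect.

```cpp
#include <cassert>

class Date {
public:
    enum Month { jan=1, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec };
    class Bad_date { };    // exception class

    Date(int dd, Month mm, int yy);    // throws Bad_date for an invalid date

    int day() const { return d; }
    Month month() const { return Month(m); }
    int year() const { return y; }
private:
    int d, m, y;
};

// Assumed definition: the Gregorian leap year rule.
bool leapyear(int y)
{
    return (y%4==0 && y%100!=0) || y%400==0;
}

Date::Date(int dd, Month mm, int yy)
{
    int max;    // number of days in month mm of year yy

    switch (mm) {
    case feb:
        max = 28+leapyear(yy);
        break;
    case apr: case jun: case sep: case nov:
        max = 30;
        break;
    case jan: case mar: case may: case jul: case aug: case oct: case dec:
        max = 31;
        break;
    default:
        throw Bad_date();    // not a month
    }

    if (dd<1 || max<dd) throw Bad_date();

    y = yy;
    m = mm;
    d = dd;
}

// Hypothetical helper for the usage example: does Date(dd,mm,yy) construct?
bool constructs(int dd, Date::Month mm, int yy)
{
    try {
        Date d(dd, mm, yy);
        assert(d.day()==dd && d.month()==mm && d.year()==yy);    // invariant holds
        return true;
    }
    catch (const Date::Bad_date&) {
        return false;
    }
}
```

With these definitions, constructs(30,Date::feb,1994) is false, whereas constructs(29,Date::feb,1996) is true. No Date object ever exists in an invalid state, so the examining functions need no checks of their own.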
I’m using the value Month(0) – which doesn’t represent a month – to represent ‘‘pick the default month.’’ I could have defined an enumerator in Month specifically to represent that. But I decided that it was better to use an obviously anomalous value to represent ‘‘pick the default month’’ rather than give the appearance that there were 13 months in a year. Note that 0 can be used because it is within the range guaranteed for the enumeration Month (§4.8).
I considered factoring out the data validation in a separate function is_date(). However, I found the resulting user code more complicated and less robust than code relying on catching the exception. For example, assuming that >> is defined for Date:
    void fill(vector<Date>& aa)
    {
        while (cin) {
            Date d;
            try {
                cin >> d;
            }
            catch (Date::Bad_date) {
                // my error handling
                continue;
            }
            aa.push_back(d);    // see §3.7.3
        }
    }
As is common for such simple concrete types, the definitions of member functions vary between the trivial and the not-too-complicated. For example:
    inline int Date::day() const
    {
        return d;
    }

    Date& Date::add_month(int n)
    {
        if (n==0) return *this;

        if (n>0) {
            int delta_y = n/12;
            int mm = m+n%12;
            if (12 < mm) {    // note: int(dec)==12
                delta_y++;
                mm -= 12;
            }

            // handle the cases where Month(mm) doesn’t have day d

            y += delta_y;
            m = Month(mm);
            return *this;
        }

        // handle negative n

        return *this;
    }
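The two cases that add_month() leaves open can be handled in several ways. The following sketch shows one of them. It assumes a policy that the text does not prescribe: a day that does not exist in the target month is clamped to that month's last day. Ymd, days_in_month(), and the free-function form of add_month() are hypothetical names used to keep the fragment self-contained, and leapyear() is again assumed to follow the Gregorian rule.

```cpp
#include <cassert>

struct Ymd { int d, m, y; };    // hypothetical stand-in for Date's representation

bool leapyear(int y)            // assumed: Gregorian rule
{
    return (y%4==0 && y%100!=0) || y%400==0;
}

int days_in_month(int m, int y)
{
    switch (m) {
    case 2:
        return 28+leapyear(y);
    case 4: case 6: case 9: case 11:
        return 30;
    default:
        return 31;
    }
}

Ymd add_month(Ymd date, int n)    // n may be negative
{
    assert(1<=date.m && date.m<=12);

    // Count months from 0 so that one division handles both directions.
    int total = date.y*12 + (date.m-1) + n;
    int yy = total/12;
    int mm = total%12;
    if (mm<0) {    // a negative remainder can occur only when total is negative
        mm += 12;
        --yy;
    }

    date.y = yy;
    date.m = mm+1;

    // Assumed policy: clamp to the last day of the target month.
    int max = days_in_month(date.m, date.y);
    if (max<date.d) date.d = max;
    return date;
}
```

Under this policy, adding one month to January 31, 1997 gives February 28, 1997. A different design could throw Bad_date or roll over into March instead; the choice is part of the semantics of the type and should be documented.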
10.3.2 Helper Functions [class.helper]
Typically, a class has a number of functions associated with it that need not be defined in the class
itself because they don’t need direct access to the representation. For example:
    int diff(Date a, Date b);    // number of days in the range [a,b) or [b,a)
    bool leapyear(int y);
    Date next_weekday(Date d);
    Date next_saturday(Date d);
Defining such functions in the class itself would complicate the class interface and increase the
number of functions that would potentially need to be examined when a change to the representation was considered.
How are such functions ‘‘associated’’ with class Date? Traditionally, their declarations were simply placed in the same file as the declaration of class Date, and users who needed Dates would make them all available by including the file that defined the interface (§9.2.1). For example:
    #include "Date.h"
In addition to using a specific Date.h header, or as an alternative, we can make the association explicit by enclosing the class and its helper functions in a namespace (§8.2):
    namespace Chrono {    // facilities for dealing with time

        class Date { /* ... */};

        int diff(Date a, Date b);
        bool leapyear(int y);
        Date next_weekday(Date d);
        Date next_saturday(Date d);
        // ...
    }
The Chrono namespace would naturally also contain related classes, such as Time and Stopwatch, and their helper functions. Using a namespace to hold a single class is usually an over-elaboration that leads to inconvenience.
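Two of the helpers, leapyear() and diff(), can be sketched inside such a namespace. The fragment below is an illustration only: Date is reduced to a plain struct, day_number() and days_in_month() are hypothetical helpers, the Gregorian rule is assumed for all years from year 1 onward, and the day count uses simple loops for clarity rather than speed.

```cpp
#include <cassert>

namespace Chrono {    // facilities for dealing with time

    struct Date { int d, m, y; };    // simplified stand-in for class Date

    bool leapyear(int y)             // assumed: Gregorian rule
    {
        return (y%4==0 && y%100!=0) || y%400==0;
    }

    int days_in_month(int m, int y)  // hypothetical helper
    {
        switch (m) {
        case 2:
            return 28+leapyear(y);
        case 4: case 6: case 9: case 11:
            return 30;
        default:
            return 31;
        }
    }

    // Hypothetical helper: days from January 1 of year 1 to dt.
    long day_number(Date dt)
    {
        assert(1<=dt.m && dt.m<=12);
        long n = 0;
        for (int yy = 1; yy<dt.y; ++yy) n += 365+leapyear(yy);
        for (int mm = 1; mm<dt.m; ++mm) n += days_in_month(mm, dt.y);
        return n + dt.d - 1;
    }

    int diff(Date a, Date b)    // number of days in the range [a,b) or [b,a)
    {
        long n = day_number(b) - day_number(a);
        return int(n<0 ? -n : n);
    }
}

// Usage from outside the namespace: qualify the names explicitly.
int days_in_1996()
{
    Chrono::Date a = { 1, 1, 1996 };
    Chrono::Date b = { 1, 1, 1997 };
    return Chrono::diff(a, b);
}
```

Neither helper touches anything but the values it is given, which is why they can live beside the class rather than in it.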
10.3.3 Overloaded Operators [class.over]
It is often useful to add functions to enable conventional notation. For example, the operator== function defines the equality operator == to work for Dates:
    inline bool operator==(Date a, Date b)    // equality
    {
        return a.day()==b.day() && a.month()==b.month() && a.year()==b.year();
    }
Other obvious candidates are:
    bool operator!=(Date, Date);    // inequality
    bool operator<(Date, Date);     // less than
    bool operator>(Date, Date);     // greater than
    // ...

    Date& operator++(Date& d);    // increase Date by one day
    Date& operator--(Date& d);    // decrease Date by one day

    Date& operator+=(Date& d, int n);    // add n days
    Date& operator-=(Date& d, int n);    // subtract n days

    Date operator+(Date d, int n);    // add n days
    Date operator-(Date d, int n);    // subtract n days

    ostream& operator<<(ostream&, Date d);     // output d
    istream& operator>>(istream&, Date& d);    // read into d
For Date, these operators can be seen as mere conveniences. However, for many types – such as complex numbers (§11.3), vectors (§3.7.1), and function-like objects (§18.4) – the use of conventional operators is so firmly entrenched in people’s minds that their definition is almost mandatory. Operator overloading is discussed in Chapter 11.
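A few of these operators can be written using only the public examining functions. The sketch below assumes a stripped-down Date with an unchecked constructor and an int month; the day/month/year output format and the helpers format() and show_later() are likewise assumptions made for illustration.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

class Date {    // stripped-down stand-in: no validation
    int d, m, y;
public:
    Date(int dd, int mm, int yy) : d(dd), m(mm), y(yy) { }
    int day() const { return d; }
    int month() const { return m; }
    int year() const { return y; }
};

inline bool operator==(Date a, Date b)    // equality
{
    return a.day()==b.day() && a.month()==b.month() && a.year()==b.year();
}

inline bool operator!=(Date a, Date b) { return !(a==b); }    // inequality

inline bool operator<(Date a, Date b)    // less than: compare year, month, day
{
    if (a.year()!=b.year()) return a.year()<b.year();
    if (a.month()!=b.month()) return a.month()<b.month();
    return a.day()<b.day();
}

inline bool operator>(Date a, Date b) { return b<a; }    // greater than

std::ostream& operator<<(std::ostream& os, Date d)    // output d (assumed format)
{
    return os << d.day() << '/' << d.month() << '/' << d.year();
}

// Hypothetical helper: the text produced by << as a string.
std::string format(Date d)
{
    std::ostringstream os;
    os << d;
    return os.str();
}

// Usage: the operators read like those for built-in types.
inline std::string show_later(Date a, Date b)
{
    assert(a==a && !(a!=a));
    return format(a<b ? b : a);
}
```

Defining != and > in terms of == and < keeps the operators consistent with each other when the representation changes.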
10.3.4 The Significance of Concrete Classes [class.significance]
I call simple user-defined types, such as Date, concrete types to distinguish them from abstract classes (§2.5.4) and class hierarchies (§12.3) and also to emphasize their similarity to built-in types such as int and char. They have also been called value types, and their use value-oriented programming. Their model of use and the ‘‘philosophy’’ behind their design are quite different from what is often advertised as object-oriented programming (§2.6.2).
The intent of a concrete type is to do a single, relatively small thing well and efficiently. It is
not usually the aim to provide the user with facilities to modify the behavior of a concrete type. In
particular, concrete types are not intended to display polymorphic behavior (see §2.5.5, §12.2.6).
If you don’t like some detail of a concrete type, you build a new one with the desired behavior.
If you want to ‘‘reuse’’ a concrete type, you use it in the implementation of your new type exactly
as you would have used an int. For example:
    class Date_and_time {
    private:
        Date d;
        Time t;
    public:
        Date_and_time(Date d, Time t);
        Date_and_time(int d, Date::Month m, int y, Time t);
        // ...
    };
The derived class mechanism discussed in Chapter 12 can be used to define new types from a concrete class by describing the desired differences. The definition of Vec from vector (§3.7.2) is an example of this.
With a reasonably good compiler, a concrete class such as Date incurs no hidden overhead in
time or space. The size of a concrete type is known at compile time so that objects can be allocated
on the run-time stack (that is, without free-store operations). The layout of each object is known at
compile time so that inlining of operations is trivially achieved. Similarly, layout compatibility
with other languages, such as C and Fortran, comes without special effort.
A good set of such types can provide a foundation for applications. Lack of suitable ‘‘small
efficient types’’ in an application can lead to gross run-time and space inefficiencies when overly
general and expensive classes are used. Alternatively, lack of concrete types can lead to obscure
programs and time wasted when each programmer writes code to directly manipulate ‘‘simple and
frequently used’’ data structures.
10.4 Objects [class.objects]
Objects can be created in several ways. Some are local variables, some are global variables, some
are members of classes, etc. This section discusses these alternatives, the rules that govern them,
the constructors used to initialize objects, and the destructors used to clean up objects before they
become unusable.
10.4.1 Destructors [class.dtor]
A constructor initializes an object. In other words, it creates the environment in which the member
functions operate. Sometimes, creating that environment involves acquiring a resource – such as a
file, a lock, or some memory – that must be released after use (§14.4.7). Thus, some classes need a
function that is guaranteed to be invoked when an object is destroyed in a manner similar to the
way a constructor is guaranteed to be invoked when an object is created. Inevitably, such functions
are called destructors. They typically clean up and release resources. Destructors are called
implicitly when an automatic variable goes out of scope, an object on the free store is deleted, etc.
Only in very unusual circumstances does the user need to call a destructor explicitly (§10.4.11).
The most common use of a destructor is to release memory acquired in a constructor. Consider
a simple table of elements of some type Name. The constructor for Table must allocate memory to hold the elements. When the table is somehow deleted, we must ensure that this memory is reclaimed for further use elsewhere. We do this by providing a special function to complement the constructor:
    class Name {
        const char* s;
        // ...
    };

    class Table {
        Name* p;
        size_t sz;
    public:
        Table(size_t s = 15) { p = new Name[sz = s]; }    // constructor
        ~Table() { delete[] p; }                          // destructor

        Name* lookup(const char *);
        bool insert(Name*);
    };
The destructor notation ~Table() uses the complement symbol ~ to hint at the destructor’s relation to the Table() constructor.
A matching constructor/destructor pair is the usual mechanism for implementing the notion of a variably-sized object in C++. Standard library containers, such as map, use a variant of this technique for providing storage for their elements, so the following discussion illustrates techniques you rely on every time you use a standard container (including a standard string). The discussion applies to types without a destructor, also. Such types are seen simply as having a destructor that does nothing.
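That every allocation made by a constructor is matched by a release in the destructor can be observed with a little instrumentation. In this sketch the counter Table::live, the size() function, and live_inside_scope() are hypothetical additions; lookup() and insert() are omitted, and copying is disabled by declaring the copy operations private, because memberwise copying of a Table would be an error.

```cpp
#include <cassert>
#include <cstddef>

class Name {
    const char* s;
public:
    Name() : s(0) { }
};

class Table {
    Name* p;
    std::size_t sz;

    Table(const Table&);               // not defined: copying is disabled here
    Table& operator=(const Table&);    // not defined
public:
    static int live;    // hypothetical instrumentation: arrays currently allocated

    Table(std::size_t s = 15) { p = new Name[sz = s]; ++live; }    // constructor
    ~Table() { delete[] p; --live; }                               // destructor

    std::size_t size() const { return sz; }
};

int Table::live = 0;

// Usage: both tables are released automatically when the function returns.
int live_inside_scope()
{
    Table t1;         // 15 elements
    Table t2(100);
    assert(t1.size()==15 && t2.size()==100);
    return Table::live;
}
```

Inside the function two arrays are alive; after it returns, Table::live is back to zero without any explicit cleanup code in the caller.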
10.4.2 Default Constructors [class.default]
Similarly, most types can be considered to have a default constructor. A default constructor is a
constructor that can be called without supplying an argument. Because of the default argument 15, Table::Table(size_t) is a default constructor. If a user has declared a default constructor, that one will be used; otherwise, the compiler will try to generate one if needed and if the user hasn’t declared other constructors. A compiler-generated default constructor implicitly calls the default constructors for a class’ members of class type and bases (§12.2.2). For example:
    struct Tables {
        int i;
        int vi[10];
        Table t1;
        Table vt[10];
    };

    Tables tt;
Here, tt will be initialized using a generated default constructor that calls Table(15) for tt.t1 and each element of tt.vt. On the other hand, tt.i and the elements of tt.vi are not initialized because those objects are not of a class type. The reasons for the dissimilar treatment of classes and built-in types are C compatibility and fear of run-time overhead.
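The calls made by the generated default constructor can be counted. The sketch below uses a stand-in Table that records its size and counts constructions instead of allocating memory; the counter constructed and the function count_default_constructions() are assumptions for illustration. The built-in members i and vi are deliberately never read, since their values are indeterminate.

```cpp
#include <cassert>
#include <cstddef>

struct Table {    // stand-in: counts constructions, allocates nothing
    static int constructed;    // hypothetical instrumentation
    std::size_t sz;
    Table(std::size_t s = 15) : sz(s) { ++constructed; }
};

int Table::constructed = 0;

struct Tables {
    int i;           // not initialized by the generated constructor
    int vi[10];      // not initialized by the generated constructor
    Table t1;
    Table vt[10];
};

// Usage: one call for t1 plus one for each element of vt.
int count_default_constructions()
{
    Table::constructed = 0;
    Tables tt;
    assert(tt.t1.sz==15 && tt.vt[9].sz==15);
    return Table::constructed;
}
```

The count is 11: one Table(15) for tt.t1 and ten for the elements of tt.vt.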
Because consts and references must be initialized (§5.5, §5.4), a class containing const or reference members cannot be default-constructed unless the programmer explicitly supplies a constructor (§10.4.6.1). For example:
    struct X {
        const int a;
        const int& r;
    };

    X x;    // error: no default constructor for X
Default constructors can be invoked explicitly (§10.4.10). Built-in types also have default constructors (§6.2.8).
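For comparison, here is a sketch of an X that can be constructed because the programmer supplies a constructor that initializes both members. The constructor's parameters and the function sum_through_x() are assumptions for illustration; member initializers of this kind are the subject of §10.4.6.1.

```cpp
#include <cassert>

struct X {
    const int a;
    const int& r;
    X(int aa, const int& rr) : a(aa), r(rr) { }    // both members initialized here
};

int sum_through_x()
{
    int i = 7;
    X x(1, i);     // ok: uses the user-supplied constructor
    // X y;        // would still be an error: X has no default constructor
    i = 9;         // x.r refers to i, so it sees the new value
    assert(&x.r == &i);
    return x.a + x.r;
}
```

The reference member must refer to an object that outlives the X; binding it to a temporary argument would leave it dangling.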
10.4.3 Construction and Destruction [class.ctor.dtor]
Consider the different ways an object can be created and how it gets destroyed afterwards. An
object can be created as:
§10.4.4 A named automatic object, which is created each time its declaration is encountered
in the execution of the program and destroyed each time the program exits the block
in which it occurs
§10.4.5 A free-store object, which is created using the new operator and destroyed using the
delete operator
§10.4.6 A nonstatic member object, which is created as a member of another class object and
created and destroyed when the object of which it is a member is created and
destroyed
§10.4.7 An array element, which is created and destroyed when the array of which it is an element is created and destroyed
§10.4.8 A local static object, which is created the first time its declaration is encountered in
the execution of the program and destroyed once at the termination of the program
§10.4.9 A global, namespace, or class static object, which is created once ‘‘at the start of the
program’’ and destroyed once at the termination of the program
§10.4.10 A temporary object, which is created as part of the evaluation of an expression and
destroyed at the end of the full expression in which it occurs
§10.4.11 An object placed in memory obtained from a user-supplied function guided by arguments supplied in the allocation operation
§10.4.12 A union member, which may not have a constructor or a destructor
This list is roughly sorted in order of importance. The following subsections explain these various
ways of creating objects and their uses.
10.4.4 Local Variables [class.local]
The constructor for a local variable is executed each time the thread of control passes through the
declaration of the local variable. The destructor for a local variable is executed each time the local
variable’s block is exited. Destructors for local variables are executed in reverse order of their construction. For example:
void f(int i)
{
    Table aa;
    Table bb;
    if (i>0) {
        Table cc;
        // ...
    }
    Table dd;
    // ...
}
Here, aa, bb, and dd are constructed (in that order) each time f() is called, and dd, bb, and aa are
destroyed (in that order) each time we return from f(). If i>0 for a call, cc will be constructed after
bb and destroyed before dd is constructed.
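The ordering described above can be observed with a small tracing sketch. The Table used here is only a stand-in: the name argument and the global string trace are assumptions made for illustration and are not part of the Table class developed in this chapter.

```cpp
#include <cassert>
#include <string>

std::string trace;   // hypothetical log of constructor and destructor calls

struct Table {       // stand-in for the book's Table: it only records events
    std::string name;
    explicit Table(const std::string& n) : name(n) { trace += "+" + name + " "; }
    ~Table() { trace += "-" + name + " "; }
};

void f(int i)
{
    Table aa("aa");
    Table bb("bb");
    if (i>0) {
        Table cc("cc");   // destroyed when this block is exited
    }
    Table dd("dd");
}

std::string trace_of_f(int i)   // helper: run f(i) and return what happened
{
    trace.clear();
    f(i);
    return trace;
}

void check_local_order()
{
    // '+' marks a construction, '-' a destruction
    assert(trace_of_f(1) == "+aa +bb +cc -cc +dd -dd -bb -aa ");
    assert(trace_of_f(0) == "+aa +bb +dd -dd -bb -aa ");
}
```

Calling check_local_order() from a test driver confirms that cc is gone before dd is constructed and that the remaining locals are destroyed in reverse order of construction.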
10.4.4.1 Copying Objects [class.copy]
If t1 and t2 are objects of a class Table, t2=t1 by default means a memberwise copy of t1 into t2
(§10.2.5). Having assignment interpreted this way can cause a surprising (and usually undesired)
effect when used on objects of a class with pointer members. Memberwise copy is usually the
wrong semantics for copying objects containing resources managed by a constructor/destructor
pair. For example:
void h()
{
    Table t1;
    Table t2 = t1;   // copy initialization: trouble
    Table t3;
    t3 = t2;         // copy assignment: trouble
}
Here, the Table default constructor is called twice: once each for t1 and t3. It is not called for t2
because that variable was initialized by copying. However, the Table destructor is called three
times: once each for t1, t2, and t3! The default interpretation of assignment is memberwise copy, so
t1, t2, and t3 will, at the end of h(), each contain a pointer to the array of names allocated on the
free store when t1 was created. No pointer to the array of names allocated when t3 was created
remains because it was overwritten by the t3=t2 assignment. Thus, in the absence of automatic
garbage collection (§10.4.5), its storage will be lost to the program forever. On the other hand, the
array created for t1 appears in t1, t2, and t3, so it will be deleted thrice. The result of that is undefined and probably disastrous.
Such anomalies can be avoided by defining what it means to copy a Table:
class Table {
    // ...
    Table(const Table&);              // copy constructor
    Table& operator=(const Table&);   // copy assignment
};
The programmer can define any suitable meaning for these copy operations, but the traditional one
for this kind of container is to copy the contained elements (or at least to give the user of the container the appearance that a copy has been done; see §11.12). For example:
Table::Table(const Table& t)   // copy constructor
{
    p = new Name[sz=t.sz];
    for (int i = 0; i<sz; i++) p[i] = t.p[i];
}

Table& Table::operator=(const Table& t)   // assignment
{
    if (this != &t) {   // beware of self-assignment: t = t
        delete[] p;
        p = new Name[sz=t.sz];
        for (int i = 0; i<sz; i++) p[i] = t.p[i];
    }
    return *this;
}
As is almost always the case, the copy constructor and the copy assignment differ considerably.
The fundamental reason is that a copy constructor initializes uninitialized memory, whereas the
copy assignment operator must correctly deal with a well-constructed object.
Assignment can be optimized in some cases, but the general strategy for an assignment operator
is simple: protect against self-assignment, delete old elements, initialize, and copy in new elements.
Usually every nonstatic member must be copied (§10.4.6.3).
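To see these copy operations at work, here is a minimal self-contained sketch. It assumes that Name is simply a string and adds a subscript operator and a size() function so that the effect of copying can be observed; those additions are illustrative assumptions, not part of the Table interface shown in this chapter.

```cpp
#include <cassert>
#include <string>

typedef std::string Name;   // assumption: Name's real definition is not shown here

class Table {
    Name* p;
    int sz;
public:
    Table(int s = 15) : p(new Name[s]), sz(s) { }
    ~Table() { delete[] p; }
    Table(const Table&);                       // copy constructor
    Table& operator=(const Table&);            // copy assignment
    Name& operator[](int i) { return p[i]; }   // hypothetical unchecked access
    int size() const { return sz; }            // hypothetical
};

Table::Table(const Table& t) : p(new Name[t.sz]), sz(t.sz)
{
    for (int i = 0; i<sz; i++) p[i] = t.p[i];
}

Table& Table::operator=(const Table& t)
{
    if (this != &t) {   // beware of self-assignment
        delete[] p;
        p = new Name[sz=t.sz];
        for (int i = 0; i<sz; i++) p[i] = t.p[i];
    }
    return *this;
}

bool copies_are_independent()
{
    Table t1(3);
    t1[0] = "alpha";
    Table t2 = t1;       // copy initialization: t2 gets its own array
    Table t3;
    t3 = t2;             // copy assignment: old array deleted, new one filled
    Table& alias = t3;
    t3 = alias;          // self-assignment leaves t3 intact
    t1[0] = "changed";   // must not affect the copies
    assert(t2.size()==3 && t3.size()==3);
    return t2[0]=="alpha" && t3[0]=="alpha";
}
```

Each Table now owns exactly one array, so the three destructor calls at the end of copies_are_independent() delete three different arrays.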
10.4.5 Free Store [class.free]
An object created on the free store has its constructor invoked by the new operator and exists until
the delete operator is applied to a pointer to it. Consider:
int main()
{
    Table* p = new Table;
    Table* q = new Table;
    delete p;
    delete p;   // probably causes run-time error
}
The constructor Table::Table() is called twice. So is the destructor Table::~Table(). Unfortunately, the news and the deletes in this example don’t match, so the object pointed to by p is
deleted twice and the object pointed to by q not at all. Not deleting an object is typically not an
error as far as the language is concerned; it is only a waste of space. However, in a program that is
meant to run for a long time, such a memory leak is a serious and hard-to-find error. There are
tools available for detecting such leaks. Deleting p twice is a serious error; the behavior is undefined and most likely disastrous.
Some C++ implementations automatically recycle the storage occupied by unreachable objects
(garbage collecting implementations), but their behavior is not standardized. Even when a garbage
collector is running, delete will invoke a destructor if one is defined, so it is still a serious error to
delete an object twice. In many cases, that is only a minor inconvenience. In particular, where a
garbage collector is known to exist, destructors that do memory management only can be eliminated. This simplification comes at the cost of portability and for some programs, a possible
increase in run time and a loss of predictability of run-time behavior (§C.9.1).
After delete has been applied to an object, it is an error to access that object in any way. Unfortunately, implementations cannot reliably detect such errors.
The user can specify how new does allocation and how delete does deallocation (see §6.2.6.2
and §15.6). It is also possible to specify the way an allocation, initialization (construction), and
exceptions interact (see §14.4.5 and §19.4.5). Arrays on the free store are discussed in §10.4.7.
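For contrast with the mismatched example, the following sketch matches every new with exactly one delete. The counter live and the stand-in Table are assumptions introduced for illustration. Setting a pointer to zero after deleting it is one common convention, not something this section prescribes; it relies on the fact that applying delete to a zero pointer has no effect.

```cpp
#include <cassert>

int live = 0;    // hypothetical counter: Tables constructed but not yet destroyed

struct Table {   // stand-in for the book's Table
    Table() { ++live; }
    ~Table() { --live; }
};

int balanced()   // returns the number of Tables still alive afterwards
{
    int before = live;
    Table* p = new Table;
    Table* q = new Table;
    assert(live == before+2);   // the constructor ran twice

    delete p;
    p = 0;       // applying delete to a zero pointer has no effect,
    delete p;    // so this second delete is harmless
    delete q;    // q is deleted exactly once
    return live - before;
}
```

The destructor runs once for each object, so balanced() returns 0.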
10.4.6 Class Objects as Members [class.m]
Consider a class that might be used to hold information for a small organization:
class Club {
    string name;
    Table members;
    Table officers;
    Date founded;
    // ...
    Club(const string& n, Date fd);
};
The Club’s constructor takes the name of the club and its founding date as arguments. Arguments
for a member’s constructor are specified in a member initializer list in the definition of the constructor of the containing class. For example:
Club::Club(const string& n, Date fd)
    : name(n), members(), officers(), founded(fd)
{
    // ...
}
The member initializers are preceded by a colon and the individual member initializers are separated by commas.
The members’ constructors are called before the body of the containing class’ own constructor
is executed. The constructors are called in the order in which they are declared in the class rather
than the order in which they appear in the initializer list. To avoid confusion, it is best to specify
the initializers in declaration order. The member destructors are called in the reverse order of construction.
If a member constructor needs no arguments, the member need not be mentioned in the member
initializer list, so
Club::Club(const string& n, Date fd)
    : name(n), founded(fd)
{
    // ...
}
is equivalent to the previous version. In each case, Club::officers is constructed by Table::Table
with the default argument 15.
When a class object containing class objects is destroyed, the body of that object’s own
destructor (if one is specified) is executed first and then the members’ destructors are executed in
reverse order of declaration. A constructor assembles the execution environment for the member
functions for a class from the bottom up (members first). The destructor disassembles it from the
top down (members last).
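The bottom-up construction and top-down destruction can be made visible with a tracing sketch. Part, the log trace, and the reduced Club below are hypothetical stand-ins chosen for illustration; the real Club members are a string, two Tables, and a Date.

```cpp
#include <cassert>
#include <string>

std::string trace;   // hypothetical event log

struct Part {        // hypothetical member type that reports its lifetime
    std::string id;
    explicit Part(const std::string& s) : id(s) { trace += "+" + id + " "; }
    ~Part() { trace += "-" + id + " "; }
};

class Club {         // reduced stand-in for the book's Club
    Part name;
    Part members;
    Part founded;
public:
    Club() : name("name"), members("members"), founded("founded")
    {
        trace += "+Club ";           // runs after all members are constructed
    }
    ~Club() { trace += "-Club "; }   // runs before any member is destroyed
};

std::string club_lifetime()
{
    trace.clear();
    {
        Club c;
    }                // c is destroyed here
    return trace;
}

void check_club()
{
    assert(club_lifetime() ==
        "+name +members +founded +Club -Club -founded -members -name ");
}
```

The initializers are written in declaration order, as recommended above, so the log reads the same as the class declaration.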
10.4.6.1 Necessary Member Initialization [class.ref.init]
Member initializers are essential for types for which initialization differs from assignment – that is,
for member objects of classes without default constructors, for const members, and for reference
members. For example:
class X {
    const int i;
    Club c;
    Club& pc;
    // ...
    X(int ii, const string& n, Date d, Club& c) : i(ii), c(n,d), pc(c) { }
};
There isn’t any other way to initialize such members, and it is an error not to initialize objects of
those types. For most types, however, the programmer has a choice between using an initializer
and using an assignment. In that case, I usually prefer to use the member initializer syntax, thus
making explicit the fact that initialization is being done. Often, there also is an efficiency advantage to using the initializer syntax. For example:
class Person {
    string name;
    string address;
    // ...
    Person(const Person&);
    Person(const string& n, const string& a);
};

Person::Person(const string& n, const string& a)
    : name(n)
{
    address = a;
}
Here name is initialized with a copy of n. On the other hand, address is first initialized to the
empty string and then a copy of a is assigned.
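The difference in cost can be counted. The sketch below replaces string with a hypothetical type Text that tallies its default constructions, copy constructions, and assignments; Text, Tally, Person2, and cost_of() are all assumptions introduced for this illustration.

```cpp
#include <cassert>

struct Tally { int defaults, copies, assignments; };
Tally tally = { 0, 0, 0 };   // hypothetical global tally of Text operations

struct Text {                // hypothetical stand-in for string
    Text() { ++tally.defaults; }
    Text(const Text&) { ++tally.copies; }
    Text& operator=(const Text&) { ++tally.assignments; return *this; }
};

class Person {               // mixes initialization and assignment, as in the text
    Text name;
    Text address;
public:
    Person(const Text& n, const Text& a) : name(n) { address = a; }
};

class Person2 {              // initializes both members
    Text name;
    Text address;
public:
    Person2(const Text& n, const Text& a) : name(n), address(a) { }
};

template<class P> Tally cost_of()   // operations needed to construct one P
{
    Text n, a;
    tally.defaults = tally.copies = tally.assignments = 0;
    P p(n, a);
    return tally;
}

void check_costs()
{
    Tally mixed = cost_of<Person>();
    assert(mixed.defaults==1 && mixed.copies==1 && mixed.assignments==1);
    Tally init = cost_of<Person2>();
    assert(init.defaults==0 && init.copies==2 && init.assignments==0);
}
```

Assigning in the constructor body costs a default construction plus an assignment where a single copy construction would do.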
10.4.6.2 Member Constants [class.memconst]
It is also possible to initialize a static integral constant member by adding a constant-expression initializer to its member declaration. For example:
class Curious {
public:
    static const int c1 = 7;        // ok, but remember definition
    static int c2 = 11;             // error: not const
    const int c3 = 13;              // error: not static
    static const int c4 = f(17);    // error: in-class initializer not constant
    static const float c5 = 7.0;    // error: in-class not integral
    // ...
};
If (and only if) you use an initialized member in a way that requires it to be stored as an object in
memory, the member must be (uniquely) defined somewhere. The initializer may not be repeated:
const int Curious::c1;         // necessary, but don’t repeat initializer here

const int* p = &Curious::c1;   // ok: Curious::c1 has been defined
Alternatively, you can use an enumerator (§4.8, §14.4.6, §15.3) as a symbolic constant within a
class declaration. For example:
class X {
    enum { c1 = 7, c2 = 11, c3 = 13, c4 = 17 };
    // ...
};
In that way, you are not tempted to initialize variables, floating-point numbers, etc. within a class.
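Both kinds of member constant can be combined in one compilable sketch. The array members a and b and the helper address_of_c1() are assumptions added to show where a definition is and is not needed.

```cpp
#include <cassert>

class Curious {
public:
    static const int c1 = 7;   // static, const, integral: in-class initializer allowed
    enum { c2 = 11 };          // enumerator: a constant that needs no separate definition
    int a[c1];                 // both are usable where a constant expression is required
    int b[c2];
};

const int Curious::c1;         // the one definition; the initializer is not repeated

const int* address_of_c1()     // taking the address requires c1 to be an object in memory
{
    return &Curious::c1;
}

void check_constants()
{
    Curious c;
    assert(sizeof(c.a)/sizeof(c.a[0]) == 7);
    assert(sizeof(c.b)/sizeof(c.b[0]) == 11);
    assert(*address_of_c1() == 7);
}
```

Only address_of_c1() makes the out-of-class definition of c1 necessary; the array bounds use the values alone.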
10.4.6.3 Copying Members [class.mem.copy]
A default copy constructor or default copy assignment (§10.4.4.1) copies all elements of a class. If
this copy cannot be done, it is an error to try to copy an object of such a class. For example:
class Unique_handle {
private:
    // copy operations are private to prevent copying (§11.2.2)
    Unique_handle(const Unique_handle&);
    Unique_handle& operator=(const Unique_handle&);
public:
    // ...
};

struct Y {
    // ...
    Unique_handle a;   // requires explicit initialization
};

Y y1;
Y y2 = y1;   // error: cannot copy Y::a
In addition, a default assignment cannot be generated if a nonstatic member is a reference, a const,
or a user-defined type without a copy assignment.
Note that the default copy constructor leaves a reference member referring to the same object in
both the original and the copied object. This can be a problem if the object referred to is supposed
to be deleted.
When writing a copy constructor, we must take care to copy every element that needs to be
copied. By default, elements are default-initialized, but that is often not what is desired in a copy
constructor. For example:
Person::Person(const Person& a) : name(a.name) { }   // beware!
Here, I forgot to copy the address, so address is initialized to the empty string by default. When
adding a new member to a class, always check if there are user-defined constructors that need to be
updated in order to initialize and copy the new member.
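The effect of the forgotten member is easy to demonstrate. In this sketch the accessor get_address() and the corrected class Person_ok are hypothetical additions made only so that the two copy constructors can be compared.

```cpp
#include <cassert>
#include <string>
using std::string;

class Person {
    string name;
    string address;
public:
    Person(const string& n, const string& a) : name(n), address(a) { }
    Person(const Person& a) : name(a.name) { }   // beware: address is default-initialized
    const string& get_address() const { return address; }   // hypothetical accessor
};

class Person_ok {   // hypothetical corrected version
    string name;
    string address;
public:
    Person_ok(const string& n, const string& a) : name(n), address(a) { }
    Person_ok(const Person_ok& a) : name(a.name), address(a.address) { }
    const string& get_address() const { return address; }
};

template<class P> string copied_address()   // address seen in a copy of a P
{
    P original("Ada", "1 Main St");
    P copy = original;
    return copy.get_address();
}

void check_member_copy()
{
    assert(copied_address<Person>() == "");             // the address was lost
    assert(copied_address<Person_ok>() == "1 Main St");
}
```

The compiler accepts both classes without complaint, which is why such an omission usually has to be found by inspection.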
10.4.7 Arrays [class.array]
If an object of a class can be constructed without supplying an explicit initializer, then arrays of that
class can be defined. For example:
Table tbl[10];
This will create an array of 10 Tables and initialize each Table by a call of Table::Table() with
the default argument 15.
There is no way to specify explicit arguments for a constructor in an array declaration. If you
absolutely must initialize members of an array with different values, you can write a default constructor that directly or indirectly reads and writes nonlocal data. For example:
class Ibuffer {
    string buf;
public:
    Ibuffer() { cin>>buf; }
    // ...
};

void f()
{
    Ibuffer words[100];   // each word initialized from cin
    // ...
}
It is usually best to avoid such subtleties.
The destructor for each constructed element of an array is invoked when that array is destroyed.
This is done implicitly for arrays that are not allocated using new. Like C, C++ doesn’t distinguish
between a pointer to an individual object and a pointer to the initial element of an array (§5.3).
Consequently, the programmer must state whether an array or an individual object is being deleted.
For example:
void f(int sz)
{
    Table* t1 = new Table;
    Table* t2 = new Table[sz];
    Table* t3 = new Table;
    Table* t4 = new Table[sz];

    delete t1;     // right
    delete[] t2;   // right
    delete[] t3;   // wrong: trouble
    delete t4;     // wrong: trouble
}
Exactly how arrays and individual objects are allocated is implementation-dependent. Therefore,
different implementations will react differently to incorrect uses of the delete and delete[] operators. In simple and uninteresting cases like the previous one, a compiler can detect the problem, but
generally something nasty will happen at run time.
The special destruction operator for arrays, delete[], isn’t logically necessary. However, suppose the implementation of the free store had been required to hold sufficient information for every
object to tell if it was an individual or an array. The user could have been relieved of a burden, but
that obligation would have imposed significant time and space overheads on some C++ implementations.
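The two correct cases can be checked by counting constructor and destructor calls. The counter live and the stand-in Table are assumptions for illustration. The two incorrect cases are deliberately left out, because their behavior is undefined.

```cpp
#include <cassert>

int live = 0;    // hypothetical counter: Tables constructed but not yet destroyed

struct Table {   // stand-in for the book's Table
    Table() { ++live; }
    ~Table() { --live; }
};

int array_lifetime(int sz)   // returns how many Tables were constructed
{
    int before = live;
    Table* t1 = new Table;       // individual object
    Table* t2 = new Table[sz];   // array: the default constructor runs sz times
    int constructed = live - before;

    delete t1;                   // individual object: delete
    delete[] t2;                 // array: delete[] runs the destructor sz times
    assert(live == before);
    return constructed;
}
```

For example, array_lifetime(4) constructs five Tables and destroys all five.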
As always, if you find C-style arrays too cumbersome, use a class such as vector (§3.7.1, §16.3)
instead. For example:
void g()
{
    vector<Table>* p1 = new vector<Table>(10);
    Table* p2 = new Table;

    delete p1;
    delete p2;
}
10.4.8 Local Static Store [class.obj.static]
The constructor for a local static object (§7.1.2) is called the first time the thread of control passes
through the object’s definition. Consider this:
void f(int i)
{
    static Table tbl;
    // ...
    if (i) {
        static Table tbl2;
        // ...
    }
}

int main()
{
    f(0);
    f(1);
    f(2);
    // ...
}
Here, the constructor is called for tbl once the first time f() is called. Because tbl is declared
static, it does not get destroyed on return from f() and it does not get constructed a second time
when f() is called again. Because the block containing the declaration of tbl2 doesn’t get executed
for the call f(0), tbl2 doesn’t get constructed until the call f(1). It does not get constructed again
when its block is entered a second time.
The destructors for local static objects are invoked in the reverse order of their construction
when the program terminates (§9.4.1.1). Exactly when is unspecified.
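The once-only construction can be verified by letting each local static object bump a counter. The counters tbl_made and tbl2_made and the constructor argument are assumptions added for illustration; the book's Table takes no such argument.

```cpp
#include <cassert>

int tbl_made = 0;    // hypothetical counters of constructor calls
int tbl2_made = 0;

struct Table {       // stand-in: the constructor increments the counter it is given
    explicit Table(int& counter) { ++counter; }
};

void f(int i)
{
    static Table tbl(tbl_made);         // constructed during the first call only
    if (i) {
        static Table tbl2(tbl2_made);   // constructed the first time i is nonzero
    }
}

void check_local_statics()   // meant to be called once, before any other use of f()
{
    f(0);
    assert(tbl_made==1 && tbl2_made==0);
    f(1);
    assert(tbl_made==1 && tbl2_made==1);
    f(2);
    assert(tbl_made==1 && tbl2_made==1);
}
```

However many further calls of f() follow, both counters stay at 1.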
10.4.9 Nonlocal Store [class.global]
A variable defined outside any function (that is, global, namespace, and class static variables) is
initialized (constructed) before main() is invoked, and any such variable that has been constructed
will have its destructor invoked after exit from main(). Dynamic linking complicates this picture
slightly by delaying the initialization until the code is linked into the running program.
Constructors for nonlocal objects in a translation unit are executed in the order their definitions
occur. Consider:
class X {
    // ...
    static Table memtbl;
};

Table tbl;

Table X::memtbl;

namespace Z {
    Table tbl2;
}
The order of construction is tbl, then X::memtbl, and then Z::tbl2. Note that a declaration (as
opposed to a definition), such as the declaration of memtbl in X, doesn’t affect the order of construction. The destructors are called in the reverse order of construction: Z::tbl2, then
X::memtbl, and then tbl.
No implementation-independent guarantees are made about the order of construction of nonlocal objects in different compilation units. For example:
// file1.c:
Table tbl1;

// file2.c:
Table tbl2;
Whether tbl1 is constructed before tbl2 or vice versa is implementation-dependent. The order isn’t
even guaranteed to be fixed in every particular implementation. Dynamic linking, or even a small
change in the compilation process, can alter the sequence. The order of destruction is similarly
implementation-dependent.
Sometimes when you design a library, it is necessary, or simply convenient, to invent a type
with a constructor and a destructor with the sole purpose of initialization and cleanup. Such a type
would be used once only: to allocate a static object so that the constructor and the destructor are
called. For example:
class Zlib_init {
    Zlib_init();    // get Zlib ready for use
    ~Zlib_init();   // clean up after Zlib
};

class Zlib {
    static Zlib_init x;
    // ...
};
Unfortunately, it is not guaranteed that such an object is initialized before its first use and destroyed
after its last use in a program consisting of separately compiled units. A particular C++ implementation may provide such a guarantee, but most don’t. A programmer may ensure proper initialization by implementing the strategy that the implementations usually employ for local static
objects: a first-time switch. For example:
class Zlib {
    static bool initialized;
    static void initialize() { /* initialize */ initialized = true; }
public:
    // no constructor

    void f()
    {
        if (initialized == false) initialize();
        // ...
    }
    // ...
};
If there are many functions that need to test the first-time switch, this can be tedious, but it is often
manageable. This technique relies on the fact that statically allocated objects without constructors
are initialized to 0. The really difficult case is the one in which the first operation may be time-critical so that the overhead of testing and possible initialization can be serious. In that case, further
trickery is required (§21.5.2).
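A compilable version of the first-time switch might look as follows. The member init_count and the return value of f() are assumptions added so that the number of initializations can be observed; the real initialization work is left out.

```cpp
#include <cassert>

class Zlib {
    static bool initialized;
    static int init_count;   // hypothetical: how many times initialize() has run
    static void initialize() { ++init_count; initialized = true; }
public:
    // no constructor

    int f()                  // returns init_count so that the effect can be observed
    {
        if (initialized == false) initialize();
        return init_count;
    }
};

bool Zlib::initialized;      // no initializer: statically allocated, so it starts as false
int Zlib::init_count;        // likewise starts as 0

void check_first_time_switch()
{
    Zlib a;
    Zlib b;
    assert(a.f() == 1);      // the first call initializes
    assert(b.f() == 1);      // later calls, on any object, do not
    assert(a.f() == 1);
}
```

Note that neither static member is given an initializer; the sketch depends on the zero initialization of statically allocated objects mentioned above.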
An alternative approach for a simple object is to present it as a function (§9.4.1):
int& obj() { static int x = 0; return x; }   // initialized upon first use
First-time switches do not handle every conceivable situation. For example, it is possible to create
objects that refer to each other during construction. Such examples are best avoided. If such
objects are necessary, they must be constructed carefully in stages. Also, there is no similarly simple last-time switch construct. Instead, see §9.4.1.1 and §21.5.2.
10.4.10 Temporary Objects [class.temp]
Temporary objects most often are the result of arithmetic expressions. For example, at some point
in the evaluation of x*y+z the partial result x*y must exist somewhere. Except when performance
is the issue (§11.6), temporary objects rarely become the concern of the programmer. However, it
happens (§11.6, §22.4.7).
Unless bound to a reference or used to initialize a named object, a temporary object is destroyed
at the end of the full expression in which it was created. A full expression is an expression that is
not a subexpression of some other expression.
The standard string class has a member function c_str() that returns a C-style, zero-terminated
array of characters (§3.5.1, §20.4.1). Also, the operator + is defined to mean string concatenation.
These are very useful facilities for strings. However, in combination they can cause obscure problems. For example:
void f(string& s1, string& s2, string& s3)
{
    const char* cs = (s1+s2).c_str();
    cout << cs;
    if (strlen(cs=(s2+s3).c_str())<8 && cs[0]=='a') {
        // cs used here
    }
}
Probably, your first reaction is ‘‘but don’t do that,’’ and I agree. However, such code does get written, so it is worth knowing how it is interpreted.
A temporary object of class string is created to hold s1+s2. Next, a pointer to a C-style string
is extracted from that object. Then, at the end of the expression, the temporary object is deleted.
Now, where was the C-style string allocated? Probably as part of the temporary object holding
s1+s2, and that storage is not guaranteed to exist after that temporary is destroyed. Consequently,
cs points to deallocated storage. The output operation cout<<cs might work as expected, but that
would be sheer luck. A compiler can detect and warn against many variants of this problem.
The example with the if-statement is a bit more subtle. The condition will work as expected
because the full expression in which the temporary holding s2+s3 is created is the condition itself.
However, that temporary is destroyed before the controlled statement is entered, so any use of cs
there is not guaranteed to work.
Please note that in this case, as in many others, the problems with temporaries arose from using
a high-level data type in a low-level way. A cleaner programming style would have not only
yielded a more understandable program fragment, but also avoided the problems with temporaries
completely. For example:
void f(string& s1, string& s2, string& s3)
{
    cout << s1+s2;
    string s = s2+s3;

    if (s.length()<8 && s[0]=='a') {
        // use s here
    }
}
A temporary can be used as an initializer for a const reference or a named object. For example:
void g(const string&, const string&);

void h(string& s1, string& s2)
{
    const string& s = s1+s2;
    string ss = s1+s2;

    g(s,ss);   // we can use s and ss here
}
This is fine. The temporary is destroyed when ‘‘its’’ reference or named object goes out of scope.
Remember that returning a reference to a local variable is an error (§7.3) and that a temporary
object cannot be bound to a non-const reference (§5.5).
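The two lifetimes can be compared by counting live objects. Probe, make(), and the counter live are hypothetical names introduced for this sketch. The counts hold whether or not the compiler elides the copy made by the return statement, because the copy constructor is counted too.

```cpp
#include <cassert>

int live = 0;   // hypothetical count of Probe objects currently alive

struct Probe {
    Probe() { ++live; }
    Probe(const Probe&) { ++live; }
    ~Probe() { --live; }
    int alive_now() const { return live; }
};

Probe make() { return Probe(); }   // returns a temporary

void check_temporaries()
{
    int before = live;

    int during = make().alive_now();   // the temporary exists while the expression is evaluated
    assert(during == before+1);
    assert(live == before);            // and is destroyed at the end of the full expression

    {
        const Probe& r = make();       // bound to a const reference:
        assert(live == before+1);      // the temporary lives as long as r does
        (void)r;
    }
    assert(live == before);            // r went out of scope, so the temporary is gone
}
```

The same counting applied to a pointer obtained from a temporary, as in the c_str() example, would show the object already destroyed by the time the pointer is used.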
A temporary object can also be created by explicitly invoking a constructor. For example:
void f(Shape& s, int x, int y)
{
    s.move(Point(x,y));   // construct Point to pass to Shape::move()
    // ...
}
Such temporaries are destroyed in exactly the same way as the implicitly generated temporaries.
10.4.11 Placement of Objects [class.placement]
Operator new creates its object on the free store by default. What if we wanted the object allocated
elsewhere? Consider a simple class:
class X {
public:
    X(int);
    // ...
};
We can place objects anywhere by providing an allocator function with extra arguments and then
supplying such extra arguments when using new:
void* operator new(size_t, void* p) { return p; }   // explicit placement operator

void* buf = reinterpret_cast<void*>(0xF00F);        // significant address

X* p2 = new(buf)X;   // construct an X at ‘buf’; invokes: operator new(sizeof(X),buf)
Because of this usage, the new(buf)X syntax for supplying extra arguments to operator new() is
known as the placement syntax. Note that every operator new() takes a size as its first argument
and that the size of the object allocated is implicitly supplied (§15.6). The operator new() used
by the new operator is chosen by the usual argument matching rules (§7.4); every operator new()
has a size_t as its first argument.
The ‘‘placement’’ operator new() is the simplest such allocator. It is defined in the standard
header <new>.
The reinterpret_cast is the crudest and potentially nastiest of the type conversion operators
(§6.2.7). In most cases, it simply yields a value with the same bit pattern as its argument with the
type required. Thus, it can be used for the inherently implementation-dependent, dangerous, and
occasionally absolutely necessary activity of converting integer values to pointers and vice versa.
The placement new construct can also be used to allocate memory from a specific arena:
class Arena {
public:
    virtual void* alloc(size_t) =0;
    virtual void free(void*) =0;
    // ...
};

void* operator new(size_t sz, Arena* a)
{
    return a->alloc(sz);
}
Now objects of arbitrary types can be allocated from different Arenas as needed. For example:
extern Arena* Persistent;
extern Arena* Shared;

void g(int i)
{
    X* p = new(Persistent) X(i);   // X in persistent storage
    X* q = new(Shared) X(i);       // X in shared memory
    // ...
}
Placing an object in an area that is not (directly) controlled by the standard free-store manager
implies that some care is required when destroying the object. The basic mechanism for that is an
explicit call of a destructor:
void destroy(X* p, Arena* a)
{
    p->~X();      // call destructor
    a->free(p);   // free memory
}
Note that explicit calls of destructors, like the use of special-purpose global allocators, should be
avoided wherever possible. Occasionally, they are essential. For example, it would be hard to
implement an efficient general container along the lines of the standard library vector (§3.7.1,
§16.3.8) without using explicit destructor calls. However, a novice should think thrice before
calling a destructor explicitly and also should ask a more experienced colleague before doing so.
See §14.4.7 for an explanation of how placement new interacts with exception handling.
There is no special syntax for placement of arrays. Nor need there be, since arbitrary types can
be allocated by placement new. However, a special operator delete() can be defined for arrays
(§19.4.5).
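The arena technique can be tried out with a concrete arena. Counting_arena below is a hypothetical arena that simply forwards to malloc() and free() and counts outstanding allocations; the members X::live and X::value are likewise assumptions added so that the sketch can check itself. The matching operator delete() is included because it is the function called if a constructor throws during such a placement new.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

class Arena {
public:
    virtual void* alloc(std::size_t) =0;
    virtual void free(void*) =0;
    virtual ~Arena() { }
};

class Counting_arena : public Arena {   // hypothetical arena: malloc plus a counter
public:
    int outstanding;                     // allocations not yet freed
    Counting_arena() : outstanding(0) { }
    void* alloc(std::size_t sz)
    {
        void* p = std::malloc(sz);
        if (p == 0) throw std::bad_alloc();
        ++outstanding;
        return p;
    }
    void free(void* p) { std::free(p); --outstanding; }
};

void* operator new(std::size_t sz, Arena* a) { return a->alloc(sz); }
void operator delete(void* p, Arena* a) { a->free(p); }   // used only if a constructor throws

class X {
public:
    static int live;   // hypothetical count of X objects currently alive
    int value;
    X(int v) : value(v) { ++live; }
    ~X() { --live; }
};
int X::live = 0;

void destroy(X* p, Arena* a)
{
    p->~X();      // call destructor
    a->free(p);   // free memory
}

int place_and_destroy(int i)
{
    Counting_arena arena;
    X* p = new(&arena) X(i);   // invokes operator new(sizeof(X),&arena), then X::X(i)
    int v = p->value;
    assert(X::live==1 && arena.outstanding==1);
    destroy(p, &arena);
    assert(X::live==0 && arena.outstanding==0);
    return v;
}
```

Using plain delete on p would be wrong here, since the memory did not come from the standard free-store manager.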
10.4.12 Unions [class.union]
A named union is defined as a struct, where every member has the same address (see §C.8.2). A
union can have member functions but not static members.
In general, a compiler cannot know what member of a union is used; that is, the type of the
object stored in a union is unknown. Consequently, a union may not have members with constructors or destructors. It wouldn’t be possible to protect that object against corruption or to guarantee
that the right destructor is called when the union goes out of scope.
Unions are best used in low-level code, or as part of the implementation of classes that keep
track of what is stored in the union (see §10.6[20]).
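One way of keeping track of what is stored in a union is to pair it with a type field inside a class. The class Value below is a hypothetical sketch of that idea, restricted to members without constructors or destructors.

```cpp
#include <cassert>

class Value {                 // hypothetical class that keeps track of its union
    enum Kind { is_int, is_double };
    Kind kind;                // records which member was stored last
    union {                   // anonymous union: i and d share storage
        int i;
        double d;
    };
public:
    Value(int ii) : kind(is_int) { i = ii; }
    Value(double dd) : kind(is_double) { d = dd; }

    bool holds_int() const { return kind == is_int; }
    int as_int() const { assert(kind == is_int); return i; }
    double as_double() const { assert(kind == is_double); return d; }
};

void check_value()
{
    Value v(42);
    assert(v.holds_int() && v.as_int() == 42);
    Value w(2.5);
    assert(!w.holds_int() && w.as_double() == 2.5);
}
```

Because the union is private, every read goes through a function that checks the type field first.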
10.5 Advice [class.advice]
[1] Represent concepts as classes; §10.1.
[2] Use public data (structs) only when it really is just data and no invariant is meaningful for the
data members; §10.2.8.
[3] A concrete type is the simplest kind of class. Where applicable, prefer a concrete type over
more complicated classes and over plain data structures; §10.3.
[4] Make a function a member only if it needs direct access to the representation of a class;
§10.3.2.
[5] Use a namespace to make the association between a class and its helper functions explicit;
§10.3.2.
[6] Make a member function that doesn’t modify the value of its object a const member function;
§10.2.6.
[7] Make a function that needs access to the representation of a class but needn’t be called for a
specific object a static member function; §10.2.4.
[8] Use a constructor to establish an invariant for a class; §10.3.1.
[9] If a constructor acquires a resource, its class needs a destructor to release the resource;
§10.4.1.
[10] If a class has a pointer member, it needs copy operations (copy constructor and copy assignment); §10.4.4.1.
[11] If a class has a reference member, it probably needs copy operations (copy constructor and
copy assignment); §10.4.6.3.
[12] If a class needs a copy operation or a destructor, it probably needs a constructor, a destructor, a
copy assignment, and a copy constructor; §10.4.4.1.
[13] Check for self-assignment in copy assignments; §10.4.4.1.
[14] When writing a copy constructor, be careful to copy every element that needs to be copied
(beware of default initializers); §10.4.6.3.
[15] When adding a new member to a class, always check to see if there are user-defined constructors that need to be updated to initialize the member; §10.4.6.3.
[16] Use enumerators when you need to define integer constants in class declarations; §10.4.6.2.
[17] Avoid order dependencies when constructing global and namespace objects; §10.4.9.
[18] Use first-time switches to minimize order dependencies; §10.4.9.
[19] Remember that temporary objects are destroyed at the end of the full expression in which they
are created; §10.4.10.
10.6 Exercises [class.exercises]
1. (∗1) Find the error in Date::add_year() in §10.2.2. Then find two additional errors in the
version in §10.2.7.
2. (∗2.5) Complete and test Date. Reimplement it with ‘‘number of days after 1/1/1970’’ representation.
3. (∗2) Find a Date class that is in commercial use. Critique the facilities it offers. If possible,
then discuss that Date with a real user.
4. (∗1) How do you access set_default from class Date from namespace Chrono (§10.3.2)? Give
at least three different ways.
5. (∗2) Define a class Histogram that keeps count of numbers in some intervals specified as arguments to Histogram’s constructor. Provide functions to print out the histogram. Handle out-of-range values.
6. (∗2) Define some classes for providing random numbers of certain distributions (for example,
uniform and exponential). Each class has a constructor specifying parameters for the distribution and a function draw that returns the next value.
7. (∗2.5) Complete class Table to hold (name,value) pairs. Then modify the desk calculator program from §6.1 to use class Table instead of map. Compare and contrast the two versions.
8. (∗2) Rewrite Tnode from §7.10[7] as a class with constructors, destructors, etc. Define a tree of
Tnodes as a class with constructors, destructors, etc.
9. (∗3) Define, implement, and test a set of integers, class Intset. Provide union, intersection, and
symmetric difference operations.
10. (∗1.5) Modify class Intset into a set of nodes, where Node is a structure you define.
11. (∗3) Define a class for analyzing, storing, evaluating, and printing simple arithmetic expressions
consisting of integer constants and the operators +, -, *, and /. The public interface should
look like this:
class Expr {
    // ...
public:
    Expr(char*);
    int eval();
    void print();
};
The string argument for the constructor Expr::Expr() is the expression. The function
Expr::eval() returns the value of the expression, and Expr::print() prints a representation
of the expression on cout. A program might look like this:
Expr x("123/4+123*4-3");
cout << "x = " << x.eval() << "\n";
x.print();
Define class Expr twice: once using a linked list of nodes as the representation and once using a
character string as the representation. Experiment with different ways of printing the expression: fully parenthesized, postfix notation, assembly code, etc.
12. (∗2) Define a class Char_queue so that the public interface does not depend on the representation. Implement Char_queue (a) as a linked list and (b) as a vector. Do not worry about concurrency.
13. (∗3) Design a symbol table class and a symbol table entry class for some language. Have a look
at a compiler for that language to see what the symbol table really looks like.
14. (∗2) Modify the expression class from §10.6[11] to handle variables and the assignment operator =. Use the symbol table class from §10.6[13].
15. (∗1) Given this program:
#include <iostream>
int main()
{
    std::cout << "Hello, world!\n";
}
modify it to produce this output:
Initialize
Hello, world!
Clean up
Do not change main() in any way.
16. (∗2) Define a Calculator class for which the calculator functions from §6.1 provide most of the
implementation. Create Calculators and invoke them for input from cin, from command-line
arguments, and for strings in the program. Allow output to be delivered to a variety of targets
similar to the way input can be obtained from a variety of sources.
17. (∗2) Define two classes, each with a static member, so that the construction of each static
member involves a reference to the other. Where might such constructs appear in real code?
How can these classes be modified to eliminate the order dependence in the constructors?
18. (∗2.5) Compare class Date (§10.3) with your solution to §5.9[13] and §7.10[19]. Discuss errors
found and likely differences in maintenance of the two solutions.
19. (∗3) Write a function that, given an istream and a vector<string>, produces a
map<string,vector<int> > holding each string and the numbers of the lines on which the string
appears. Run the program on a text-file with no fewer than 1,000 lines looking for no fewer
than 10 words.
20. (∗2) Take class Entry from §C.8.2 and modify it so that each union member is always used
according to its type.
11
Operator Overloading
When I use a word it means just what
I choose it to mean – neither more nor less.
– Humpty Dumpty
Notation — operator functions — binary and unary operators — predefined meanings
for operators — user-defined meanings for operators — operators and namespaces — a
complex type — member and nonmember operators — mixed-mode arithmetic —
initialization — copying — conversions — literals — helper functions — conversion
operators — ambiguity resolution — friends — members and friends — large objects —
assignment and initialization — subscripting — function call — dereferencing — increment and decrement — a string class — advice — exercises.
11.1 Introduction [over.intro]
Every technical field – and most nontechnical fields – has developed conventional shorthand
notation to make the presentation and discussion of frequently used concepts convenient.
For example, because of long acquaintance
x+y*z
is clearer to us than
multiply y by z and add the result to x
It is hard to overestimate the importance of concise notation for common operations.
Like most languages, C++ supports a set of operators for its built-in types. However, most concepts for which operators are conventionally used are not built-in types in C++, so they must be represented as user-defined types. For example, if you need complex arithmetic, matrix algebra, logic
signals, or character strings in C++, you use classes to represent these notions. Defining operators
for such classes sometimes allows a programmer to provide a more conventional and convenient
notation for manipulating objects than could be achieved using only the basic functional notation.
For example,
class complex {         // very simplified complex
    double re, im;
public:
    complex(double r, double i) : re(r), im(i) { }
    complex operator+(complex);
    complex operator*(complex);
};
defines a simple implementation of the concept of complex numbers. A complex is represented by
a pair of double-precision floating-point numbers manipulated by the operators + and *. The programmer defines complex::operator+() and complex::operator*() to provide meanings for +
and *, respectively. For example, if b and c are of type complex, b+c means b.operator+(c).
We can now approximate the conventional interpretation of complex expressions:
void f()
{
    complex a = complex(1, 3.1);
    complex b = complex(1.2, 2);
    complex c = b;
    a = b+c;
    b = b+c*a;
    c = a*b+complex(1,2);
}
The usual precedence rules hold, so the second statement means b=b+(c*a), not b=(b+c)*a.
Many of the most obvious uses of operator overloading are for concrete types (§10.3). However, the usefulness of user-defined operators is not restricted to concrete types. For example, the
design of general and abstract interfaces often leads to the use of operators such as ->, [], and ().
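The simplified complex class in this section only declares its two operators. The following is a minimal sketch of how they might be defined, not a definitive implementation. The accessors real() and imag() and the function demo() are assumptions added here so that a result can be inspected; they are not part of the class as presented.

```cpp
class complex {         // very simplified complex
    double re, im;
public:
    complex(double r, double i) : re(r), im(i) { }
    complex operator+(complex);
    complex operator*(complex);
    double real() const { return re; }  // assumption: accessor added for inspection
    double imag() const { return im; }  // assumption: accessor added for inspection
};

complex complex::operator+(complex a)
{
    return complex(re + a.re, im + a.im);
}

complex complex::operator*(complex a)
{
    // (re + i*im) * (a.re + i*a.im)
    return complex(re*a.re - im*a.im, re*a.im + im*a.re);
}

complex demo()          // hypothetical helper
{
    complex a = complex(1, 3.1);
    complex b = complex(1.2, 2);
    complex c = b;
    return b + c*a;     // parsed as b + (c*a): the usual precedence holds
}
```

Because both operators are members, the left-hand operand must already be a complex; §11.3 relaxes that restriction.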
11.2 Operator Functions [over.oper]
Functions defining meanings for the following operators (§6.2) can be declared:
+       -       *       /       %       ^       &
|       ~       !       =       <       >       +=
-=      *=      /=      %=      ^=      &=      |=
<<      >>      >>=     <<=     ==      !=      <=
>=      &&      ||      ++      --      ->*     ,
->      []      ()      new     new[]   delete  delete[]
The following operators cannot be defined by a user:
:: (scope resolution; §4.9.4, §10.2.4),
. (member selection; §5.7), and
.* (member selection through pointer to member; §15.5).
They take a name, rather than a value, as their second operand and provide the primary means of
referring to members. Allowing them to be overloaded would lead to subtleties [Stroustrup,1994].
It is not possible to define new operator tokens, but you can use the function-call notation when
this set of operators is not adequate. For example, use pow(), not **. These restrictions may
seem Draconian, but more flexible rules can easily lead to ambiguities. For example, defining an
operator ** to mean exponentiation may seem an obvious and easy task at first glance, but think
again. Should ** bind to the left (as in Fortran) or to the right (as in Algol)? Should the expression a**p be interpreted as a*(*p) or as (a)**(p)?
The name of an operator function is the keyword operator followed by the operator itself; for
example, operator<<. An operator function is declared and can be called like any other function.
A use of the operator is only a shorthand for an explicit call of the operator function. For example:
void f(complex a, complex b)
{
    complex c = a + b;              // shorthand
    complex d = a.operator+(b);     // explicit call
}
Given the previous definition of complex, the two initializers are synonymous.
11.2.1 Binary and Unary Operators [over.binary]
A binary operator can be defined by either a nonstatic member function taking one argument or a
nonmember function taking two arguments. For any binary operator @, aa@bb can be interpreted as
either aa.operator@(bb) or operator@(aa,bb). If both are defined, overload resolution (§7.4)
determines which, if any, interpretation is used. For example:
class X {
public:
    void operator+(int);
    X(int);
};
void operator+(X,X);
void operator+(X,double);
void f(X a)
{
    a+1;        // a.operator+(1)
    1+a;        // ::operator+(X(1),a)
    a+1.0;      // ::operator+(a,1.0)
}
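To watch overload resolution pick among the member and nonmember candidates, one can give each operator+ a distinct return value. This is only a sketch: the int return values and the which_... helper functions are assumptions introduced for illustration, whereas the declarations in the text return void.

```cpp
class X {
public:
    X(int) { }
    int operator+(int) { return 1; }        // member: X + int
};

int operator+(X, X)      { return 2; }      // nonmember: X + X
int operator+(X, double) { return 3; }      // nonmember: X + double

// Hypothetical helpers reporting which operator+ was selected.
int which_a_plus_1(X a)     { return a + 1; }       // a.operator+(1)
int which_1_plus_a(X a)     { return 1 + a; }       // ::operator+(X(1),a)
int which_a_plus_1_0(X a)   { return a + 1.0; }     // ::operator+(a,1.0)
```

The member is not even a candidate for 1+a, because the left-hand operand is an int; only the conversion X(1) makes the nonmember operator+(X,X) viable.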
A unary operator, whether prefix or postfix, can be defined by either a nonstatic member function
taking no arguments or a nonmember function taking one argument. For any prefix unary operator
@, @aa can be interpreted as either aa.operator@() or operator@(aa). If both are defined, overload resolution (§7.4) determines which, if any, interpretation is used. For any postfix unary operator @, aa@ can be interpreted as either aa.operator@(int) or operator@(aa,int). This is
explained further in §11.11. If both are defined, overload resolution (§7.4) determines which, if
any, interpretation is used. An operator can be declared only for the syntax defined for it in the
grammar (§A.5). For example, a user cannot define a unary % or a ternary +. Consider:
class X {
    // members (with implicit ‘this’ pointer):
    X* operator&();         // prefix unary & (address of)
    X operator&(X);         // binary & (and)
    X operator++(int);      // postfix increment (see §11.11)
    X operator&(X,X);       // error: ternary
    X operator/();          // error: unary /
};
// nonmember functions :
X operator-(X);             // prefix unary minus
X operator-(X,X);           // binary minus
X operator--(X&,int);       // postfix decrement
X operator-();              // error: no operand
X operator-(X,X,X);         // error: ternary
X operator%(X);             // error: unary %
Operator [] is described in §11.8, operator () in §11.9, operator -> in §11.10, operators ++ and
-- in §11.11, and the allocation and deallocation operators in §6.2.6.2, §10.4.11, and §15.6.
11.2.2 Predefined Meanings for Operators [over.predefined]
Only a few assumptions are made about the meaning of a user-defined operator. In particular,
operator=, operator[], operator(), and operator-> must be nonstatic member functions; this
ensures that their first operands will be lvalues (§4.9.6).
The meanings of some built-in operators are defined to be equivalent to some combination of
other operators on the same arguments. For example, if a is an int, ++a means a+=1, which in turn
means a=a+1. Such relations do not hold for user-defined operators unless the user happens to
define them that way. For example, a compiler will not generate a definition of Z::operator+=()
from the definitions of Z::operator+() and Z::operator=().
Because of historical accident, the operators = (assignment), & (address-of), and , (sequencing;
§6.2.2) have predefined meanings when applied to class objects. These predefined meanings can
be made inaccessible to general users by making them private:
class X {
private:
    void operator=(const X&);
    void operator&();
    void operator,(const X&);
    // ...
};
void f(X a, X b)
{
    a = b;      // error: operator= private
    &a;         // error: operator& private
    a,b;        // error: operator, private
}
Alternatively, they can be given new meanings by suitable definitions.
11.2.3 Operators and User-Defined Types [over.user]
An operator function must either be a member or take at least one argument of a user-defined type
(functions redefining the new and delete operators need not). This rule ensures that a user cannot
change the meaning of an expression unless the expression contains an object of a user-defined
type. In particular, it is not possible to define an operator function that operates exclusively on
pointers. This ensures that C++ is extensible but not mutable (with the exception of operators =, &,
and , for class objects).
An operator function intended to accept a basic type as its first operand cannot be a member
function. For example, consider adding a complex variable aa to the integer 2: aa+2 can, with a
suitably declared member function, be interpreted as aa.operator+(2), but 2+aa cannot because
there is no class int for which to define + to mean 2.operator+(aa). Even if there were, two different member functions would be needed to cope with 2+aa and aa+2. Because the compiler does
not know the meaning of a user-defined +, it cannot assume that it is commutative and so interpret
2+aa as aa+2. This example is trivially handled using nonmember functions (§11.3.2, §11.5).
Enumerations are user-defined types, so we can define operators for them. For example:
enum Day { sun, mon, tue, wed, thu, fri, sat };
Day& operator++(Day& d)
{
    return d = (sat==d) ? sun : Day(d+1);
}
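A short usage sketch of the increment operator for Day follows. The helper day_after() is a hypothetical name introduced only to exercise the operator; it takes its argument by value so that the caller's Day is left unchanged.

```cpp
enum Day { sun, mon, tue, wed, thu, fri, sat };

Day& operator++(Day& d)
{
    // wrap around from sat to sun; otherwise step to the next enumerator
    return d = (sat==d) ? sun : Day(d+1);
}

Day day_after(Day d)    // hypothetical helper: increments a local copy
{
    return ++d;         // calls operator++(Day&)
}
```

Without the user-defined operator, ++d would not compile for a Day, because the result of the built-in integer addition cannot be implicitly converted back to the enumeration.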
Every expression is checked for ambiguities. Where a user-defined operator provides a possible
interpretation, the expression is checked according to the rules in §7.4.
11.2.4 Operators in Namespaces [over.namespace]
An operator is either a member of a class or defined in some namespace (possibly the global namespace). Consider this simplified version of string I/O from the standard library:
namespace std {         // simplified std
    class ostream {
        // ...
        ostream& operator<<(const char*);
    };
    extern ostream cout;
    class string {
        // ...
    };
    ostream& operator<<(ostream&, const string&);
}
int main()
{
    const char* p = "Hello";
    std::string s = "world";
    std::cout << p << ", " << s << "!\n";
}
Naturally, this writes out Hello, world! But why? Note that I didn’t make everything from std
accessible by writing:
using namespace std;
Instead, I used the std:: prefix for string and cout. In other words, I was on my best behavior and
didn’t pollute the global namespace or in other ways introduce unnecessary dependencies.
The output operator for C-style strings (char*) is a member of std::ostream, so by definition
std::cout << p
means
std::cout.operator<<(p)
However, std::ostream doesn’t have a member function to output a std::string, so
std::cout << s
means
operator<<(std::cout,s)
Operators defined in namespaces can be found based on their operand types just like functions can
be found based on their argument types (§8.2.6). In particular, cout is in namespace std, so std is
considered when looking for a suitable definition of <<. In that way, the compiler finds and uses:
std::operator<<(std::ostream&, const std::string&)
For a binary operator @, x@y where x is of type X and y is of type Y is resolved like this:
[1] If X is a class, determine whether class X or a base of X defines operator@ as a member; if
so, that is the @ to try to use.
[2] Otherwise,
– look for declarations of @ in the context surrounding x@y; and
– if X is defined in namespace N, look for declarations of @ in N; and
– if Y is defined in namespace M, look for declarations of @ in M.
If declarations of operator@ are found in the surrounding context, in N, or in M, we try to use
those operators.
In either case, declarations for several operator@s may be found and overload resolution rules
(§7.4) are used to find the best match, if any. This lookup mechanism is applied only if the operator has at least one operand of a user-defined type. Therefore, user-defined conversions (§11.3.2,
§11.4) will be considered. Note that a typedef name is just a synonym and not a user-defined type
(§4.9.7).
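The lookup rules above can be tried out with an operator defined in a user's own namespace. In this sketch the namespace Chrono_demo, the struct Date, and the function show() are hypothetical names chosen for illustration; the point is only that the call site names neither the namespace nor the operator explicitly.

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace Chrono_demo {             // hypothetical namespace
    struct Date { int d, m, y; };   // hypothetical type

    std::ostream& operator<<(std::ostream& os, const Date& dt)
    {
        return os << dt.d << '/' << dt.m << '/' << dt.y;
    }
}

std::string show()                  // hypothetical helper
{
    Chrono_demo::Date today = { 16, 12, 1997 };
    std::ostringstream out;
    out << today;   // Chrono_demo::operator<< is found because the right-hand
                    // operand's type is defined in Chrono_demo
    return out.str();
}
```

Here std::ostream has no member operator<< taking a Date, so step [2] applies and the namespace of the right-hand operand supplies the declaration.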
11.3 A Complex Number Type [over.complex]
The implementation of complex numbers presented in the introduction is too restrictive to please
anyone. For example, from looking at a math textbook we would expect this to work:
void f()
{
    complex a = complex(1,2);
    complex b = 3;
    complex c = a+2.3;
    complex d = 2+b;
    complex e = -b-c;
    b = c*2*c;
}
In addition, we would expect to be provided with a few additional operators, such as == for comparison and << for output, and a suitable set of mathematical functions, such as sin() and sqrt().
Class complex is a concrete type, so its design follows the guidelines from §10.3. In addition,
users of complex arithmetic rely so heavily on operators that the definition of complex brings into
play most of the basic rules for operator overloading.
11.3.1 Member and Nonmember Operators [over.member]
I prefer to minimize the number of functions that directly manipulate the representation of an
object. This can be achieved by defining only operators that inherently modify the value of their
first argument, such as +=, in the class itself. Operators that simply produce a new value based on
the values of their arguments, such as +, are then defined outside the class and use the essential operators in their implementation:
class complex {
    double re, im;
public:
    complex& operator+=(complex a);     // needs access to representation
    // ...
};
complex operator+(complex a, complex b)
{
    complex r = a;
    return r += b;      // access representation through +=
}
Given these declarations, we can write:
void f(complex x, complex y, complex z)
{
    complex r1 = x+y+z;     // r1 = operator+(operator+(x,y),z)
    complex r2 = x;         // r2 = x
    r2 += y;                // r2.operator+=(y)
    r2 += z;                // r2.operator+=(z)
}
Except for possible efficiency differences, the computations of r1 and r2 are equivalent.
Composite assignment operators such as += and *= tend to be simpler to define than their
‘‘simple’’ counterparts + and *. This surprises most people at first, but it follows from the fact that
three objects are involved in a + operation (the two operands and the result), whereas only two
objects are involved in a += operation. In the latter case, run-time efficiency is improved by eliminating the need for temporary variables. For example:
inline complex& complex::operator+=(complex a)
{
    re += a.re;
    im += a.im;
    return *this;
}
does not require a temporary variable to hold the result of the addition and is simple for a compiler
to inline perfectly.
A good optimizer will generate close to optimal code for uses of the plain + operator also.
However, we don’t always have a good optimizer and not all types are as simple as complex, so
§11.5 discusses ways of defining operators with direct access to the representation of classes.
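The pieces of this section can be assembled into one compilable sketch. The two-argument constructor is taken from §11.1; the accessors real() and imag() and the function same_result() are assumptions added here only so that the equivalence of the two computations can be checked.

```cpp
class complex {
    double re, im;
public:
    complex(double r, double i) : re(r), im(i) { }
    complex& operator+=(complex a);         // needs access to representation
    double real() const { return re; }      // assumption: accessor for inspection
    double imag() const { return im; }      // assumption: accessor for inspection
};

inline complex& complex::operator+=(complex a)
{
    re += a.re;
    im += a.im;
    return *this;
}

complex operator+(complex a, complex b)
{
    complex r = a;
    return r += b;      // access representation through +=
}

bool same_result(complex x, complex y, complex z)   // hypothetical helper
{
    complex r1 = x+y+z;     // operator+(operator+(x,y),z)
    complex r2 = x;
    r2 += y;
    r2 += z;
    return r1.real()==r2.real() && r1.imag()==r2.imag();
}
```

Only operator+= touches re and im directly; operator+ is written entirely in terms of it, which is the division of labor the section recommends.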
11.3.2 Mixed-Mode Arithmetic [over.mixed]
To cope with
complex d = 2+b;
we need to define operator + to accept operands of different types. In Fortran terminology, we
need mixed-mode arithmetic. We can achieve that simply by adding appropriate versions of the
operators:
class complex {
    double re, im;
public:
    complex& operator+=(complex a) {
        re += a.re;
        im += a.im;
        return *this;
    }
    complex& operator+=(double a) {
        re += a;
        return *this;
    }
    // ...
};
complex operator+(complex a, complex b)
{
    complex r = a;
    return r += b;      // calls complex::operator+=(complex)
}
complex operator+(complex a, double b)
{
    complex r = a;
    return r += b;      // calls complex::operator+=(double)
}
complex operator+(double a, complex b)
{
    complex r = b;
    return r += a;      // calls complex::operator+=(double)
}
Adding a double to a complex number is a simpler operation than adding a complex. This is
reflected in these definitions. The operations taking double operands do not touch the imaginary
part of a complex number and thus will be more efficient.
Given these declarations, we can write:
void f(complex x, complex y)
{
    complex r1 = x+y;       // calls operator+(complex,complex)
    complex r2 = x+2;       // calls operator+(complex,double)
    complex r3 = 2+x;       // calls operator+(double,complex)
}
11.3.3 Initialization [over.ctor]
To cope with assignments and initialization of complex variables with scalars, we need a conversion of a scalar (integer or floating-point number) to a complex. For example:
complex b = 3;  // should mean b.re=3, b.im=0
A constructor taking a single argument specifies a conversion from its argument type to the
constructor’s type. For example:
class complex {
    double re, im;
public:
    complex(double r) :re(r), im(0) { }
    // ...
};
The constructor specifies the traditional embedding of the real line in the complex plane.
A constructor is a prescription for creating a value of a given type. The constructor is used
when a value of a type is expected and when such a value can be created by a constructor from the
value supplied as an initializer or assigned value. Thus, a constructor requiring a single argument
need not be called explicitly. For example,
complex b = 3;
means
complex b = complex(3);
A user-defined conversion is implicitly applied only if it is unique (§7.4). See §11.7.1 for a way of
specifying constructors that can only be explicitly invoked.
Naturally, we still need the constructor that takes two doubles, and a default constructor initializing a complex to (0,0) is also useful:
class complex {
    double re, im;
public:
    complex() : re(0), im(0) { }
    complex(double r) : re(r), im(0) { }
    complex(double r, double i) : re(r), im(i) { }
    // ...
};
Using default arguments, we can abbreviate:
class complex {
    double re, im;
public:
    complex(double r =0, double i =0) : re(r), im(i) { }
    // ...
};
When a constructor is explicitly declared for a type, it is not possible to use an initializer list (§5.7,
§4.9.5) as the initializer. For example:
complex z1 = { 3 };         // error: complex has a constructor
complex z2 = { 3, 4 };      // error: complex has a constructor
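The single constructor with default arguments covers all three ways of creating a complex discussed in this section. The following sketch checks that; the accessors and the three small factory functions are hypothetical names added for illustration.

```cpp
class complex {
    double re, im;
public:
    complex(double r =0, double i =0) : re(r), im(i) { }
    double real() const { return re; }      // assumption: accessor for inspection
    double imag() const { return im; }      // assumption: accessor for inspection
};

// Hypothetical helpers, one per way of calling the constructor.
complex make_default()  { return complex(); }       // both defaults used: (0,0)
complex from_scalar()   { complex b = 3; return b; } // means complex b = complex(3)
complex from_pair()     { return complex(1, 2); }   // no defaults used
```

The initialization complex b = 3 relies on the implicit conversion from the scalar; §11.7.1 describes how to suppress such conversions when they are not wanted.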
11.3.4 Copying [over.copy]
In addition to the explicitly declared constructors, complex by default gets a copy constructor
defined (§10.2.5). A default copy constructor simply copies all members. To be explicit, we could
equivalently have written:
class complex {
    double re, im;
public:
    complex(const complex& c) : re(c.re), im(c.im) { }
    // ...
};
However, for types where the default copy constructor has the right semantics, I prefer to rely on
that default. It is less verbose than anything I can write, and people should understand the default.
Also, compilers know about the default and its possible optimization opportunities. Furthermore,
writing out the memberwise copy by hand is tedious and error-prone for classes with many data
members (§10.4.6.3).
I use a reference argument for the copy constructor because I must. The copy constructor
defines what copying means – including what copying an argument means – so writing
complex::complex(complex c) : re(c.re), im(c.im) { }    // error
is an error because any call would have involved an infinite recursion.
For other functions taking complex arguments, I use value arguments rather than reference
arguments. Here, the designer has a choice. From a user’s point of view, there is little difference
between a function that takes a complex argument and one that takes a const complex& argument.
This issue is discussed further in §11.6.
In principle, copy constructors are used in simple initializations such as
complex x = 2;              // create complex(2); then initialize x with it
complex y = complex(2,0);   // create complex(2,0); then initialize y with it
However, the calls to the copy constructor are trivially optimized away. We could equivalently
have written:
complex x(2);       // initialize x by 2
complex y(2,0);     // initialize y by (2,0)
For arithmetic types, such as complex, I like the look of the version using = better. It is possible to
restrict the set of values accepted by the = style of initialization compared to the () style by making
the copy constructor private (§11.2.2) or by declaring a constructor explicit (§11.7.1).
Similar to initialization, assignment of two objects of the same class is by default defined as
memberwise assignment (§10.2.5). We could explicitly define complex::operator= to do that.
However, for a simple type like complex there is no reason to do so. The default is just right.
The copy constructor – whether user-defined or compiler-generated – is used not only for the
initialization of variables, but also for argument passing, value return, and exception handling (see
§11.7). The semantics of these operations is defined to be the semantics of initialization (§7.1,
§7.3, §14.2.1).
11.3.5 Constructors and Conversions [over.conv]
We defined three versions of each of the four standard arithmetic operators:
complex operator+(complex,complex);
complex operator+(complex,double);
complex operator+(double,complex);
// ...
This can get tedious, and what is tedious easily becomes error-prone. What if we had three alternatives for the type of each argument for each function? We would need three versions of each
single-argument function, nine versions of each two-argument function, twenty-seven versions of
each three-argument function, etc. Often these variants are very similar. In fact, almost all variants
involve a simple conversion of arguments to a common type followed by a standard algorithm.
The alternative to providing different versions of a function for each combination of arguments
is to rely on conversions. For example, our ccoom
mpplleexx class provides a constructor that converts a
ddoouubbllee to a ccoom
mpplleexx. Consequently, we could simply declare only one version of the equality
operator for ccoom
mpplleexx:
    bool operator==(complex,complex);

    void f(complex x, complex y)
    {
        x==y;   // means operator==(x,y)
        x==3;   // means operator==(x,complex(3))
        3==y;   // means operator==(complex(3),y)
    }
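The three calls above can be tried with a minimal sketch. It assumes a cut-down complex holding only the converting constructor and the accessors real() and imag() (introduced in §11.3.7); the function mixed_compare_demo() is a hypothetical name used only for illustration.

```cpp
#include <cassert>

// Cut-down complex: only what this example needs.
class complex {
    double re, im;
public:
    complex(double r = 0, double i = 0) : re(r), im(i) { }   // also converts a double to a complex
    double real() const { return re; }
    double imag() const { return im; }
};

// The only equality operator declared for complex.
bool operator==(complex a, complex b)
{
    return a.real()==b.real() && a.imag()==b.imag();
}

// Hypothetical helper exercising the three call forms.
void mixed_compare_demo()
{
    complex x(3), y(3);
    assert(x==y);   // operator==(x,y)
    assert(x==3);   // operator==(x,complex(3))
    assert(3==y);   // operator==(complex(3),y)
}
```

Each mixed-mode call costs one constructor invocation; for a type this small that is usually an acceptable price for writing the operator once.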
There can be reasons for preferring to define separate functions. For example, in some cases the
conversion can impose overheads, and in other cases, a simpler algorithm can be used for specific
argument types. Where such issues are not significant, relying on conversions and providing only
the most general variant of a function – plus possibly a few critical variants – contains the combinatorial explosion of variants that can arise from mixed-mode arithmetic.
Where several variants of a function or an operator exist, the compiler must pick ‘‘the right’’
variant based on the argument types and the available (standard and user-defined) conversions.
Unless a best match exists, an expression is ambiguous and is an error (see §7.4).
An object constructed by explicit or implicit use of a constructor is automatic and will be
destroyed at the first opportunity (see §10.4.10).
No implicit user-defined conversions are applied to the left-hand side of a . (or a ->). This is
the case even when the . is implicit. For example:
    void g(complex z)
    {
        3+z;               // ok: complex(3)+z
        3.operator+=(z);   // error: 3 is not a class object
        3+=z;              // error: 3 is not a class object
    }
Thus, you can express the notion that an operator requires an lvalue as its left-hand operand by
making that operator a member.
11.3.6 Literals [over.literals]
It is not possible to define literals of a class type in the sense that 1.2 and 12e3 are literals of type
double. However, literals of the basic types can often be used instead if class member functions are
used to provide an interpretation for them. Constructors taking a single argument provide a general
mechanism for this. When constructors are simple and inline, it is quite reasonable to think of
constructor invocations with literal arguments as literals. For example, I think of complex(3) as a
literal of type complex, even though technically it isn’t.
11.3.7 Additional Member Functions [over.additional]
So far, we have provided class complex with constructors and arithmetic operators only. That is
not quite sufficient for real use. In particular, we often need to be able to examine the value of the
real and imaginary parts:
    class complex {
        double re, im;
    public:
        double real() const { return re; }
        double imag() const { return im; }
        // ...
    };
Unlike the other members of complex, real() and imag() do not modify the value of a complex,
so they can be declared const.
Given real() and imag(), we can define all kinds of useful operations without granting them
direct access to the representation of complex. For example:
    inline bool operator==(complex a, complex b)
    {
        return a.real()==b.real() && a.imag()==b.imag();
    }
Note that we need only to be able to read the real and imaginary parts; writing them is less often
needed. If we must do a ‘‘partial update,’’ we can:
    void f(complex& z, double d)
    {
        // ...
        z = complex(z.real(),d);   // assign d to z.im
    }
A good optimizer generates a single assignment for that statement.
11.3.8 Helper Functions [over.helpers]
If we put all the bits and pieces together, the complex class becomes:
    class complex {
        double re, im;
    public:
        complex(double r =0, double i =0) : re(r), im(i) { }

        double real() const { return re; }
        double imag() const { return im; }

        complex& operator+=(complex);
        complex& operator+=(double);
        // -=, *=, and /=
    };
In addition, we must provide a number of helper functions:
    complex operator+(complex,complex);
    complex operator+(complex,double);
    complex operator+(double,complex);
    // -, *, and /

    complex operator-(complex);   // unary minus
    complex operator+(complex);   // unary plus

    bool operator==(complex,complex);
    bool operator!=(complex,complex);

    istream& operator>>(istream&,complex&);   // input
    ostream& operator<<(ostream&,complex);    // output
Note that the members real() and imag() are essential for defining the comparisons. The definition
of most of the following helper functions similarly relies on real() and imag().
We might provide functions to allow users to think in terms of polar coordinates:
    complex polar(double rho, double theta);
    complex conj(complex);

    double abs(complex);
    double arg(complex);
    double norm(complex);

    double real(complex);   // for notational convenience
    double imag(complex);   // for notational convenience
Finally, we must provide an appropriate set of standard mathematical functions:
    complex acos(complex);
    complex asin(complex);
    complex atan(complex);
    // ...
From a user’s point of view, the complex type presented here is almost identical to the
complex<double> found in <complex> in the standard library (§22.5).
11.4 Conversion Operators [over.conversion]
Using a constructor to specify type conversion is convenient but has implications that can be undesirable. A constructor cannot specify
[1] an implicit conversion from a user-defined type to a basic type (because the basic types are
not classes), or
[2] a conversion from a new class to a previously defined class (without modifying the declaration for the old class).
These problems can be handled by defining a conversion operator for the source type. A member
function X::operator T(), where T is a type name, defines a conversion from X to T. For example,
one could define a 6-bit non-negative integer, Tiny, that can mix freely with integers in arithmetic
operations:
    class Tiny {
        char v;
        void assign(int i) { if (i&~077) throw Bad_range(); v=i; }
    public:
        class Bad_range { };

        Tiny(int i) { assign(i); }
        Tiny& operator=(int i) { assign(i); return *this; }

        operator int() const { return v; }   // conversion to int function
    };
The range is checked whenever a Tiny is initialized by an int and whenever an int is assigned to
one. No range check is needed when we copy a Tiny, so the default copy constructor and assignment
are just right.
To enable the usual integer operations on Tiny variables, we define the implicit conversion from
Tiny to int, Tiny::operator int(). Note that the type being converted to is part of the name of the
operator and cannot be repeated as the return value of the conversion function:
    Tiny::operator int() const { return v; }       // right
    int Tiny::operator int() const { return v; }   // error
In this respect also, a conversion operator resembles a constructor.
Whenever a Tiny appears where an int is needed, the appropriate int is used. For example:
    int main()
    {
        Tiny c1 = 2;
        Tiny c2 = 62;
        Tiny c3 = c2-c1;   // c3 = 60
        Tiny c4 = c3;      // no range check (not necessary)
        int i = c1+c2;     // i = 64

        c1 = c1+c2;        // range error: c1 can't be 64
        i = c3-64;         // i = -4
        c2 = c3-64;        // range error: c2 can't be -4
        c3 = c4;           // no range check (not necessary)
    }
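To observe the range errors without terminating the program, the checks can be wrapped in a try-block. The following is a sketch only: Tiny is repeated from the text, while out_of_range() and tiny_demo() are hypothetical helpers added for illustration.

```cpp
#include <cassert>

class Tiny {
    char v;
    void assign(int i) { if (i&~077) throw Bad_range(); v=i; }
public:
    class Bad_range { };

    Tiny(int i) { assign(i); }
    Tiny& operator=(int i) { assign(i); return *this; }

    operator int() const { return v; }
};

// Hypothetical helper: true if initializing a Tiny with i throws Bad_range.
bool out_of_range(int i)
{
    try {
        Tiny t = i;
        (void)t;   // silence unused-variable warnings
        return false;
    }
    catch (const Tiny::Bad_range&) {
        return true;
    }
}

void tiny_demo()
{
    Tiny c1 = 2;
    Tiny c2 = 62;
    Tiny c3 = c2-c1;               // Tiny -> int for the subtraction, int -> Tiny with a range check
    assert(int(c3)==60);

    int i = c1+c2;                 // plain int arithmetic: no range check
    assert(i==64);

    assert(out_of_range(c1+c2));   // 64 needs a seventh bit
    assert(out_of_range(c3-64));   // -4 is negative
    assert(!out_of_range(63));     // largest value that fits in 6 bits
}
```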
Conversion functions appear to be particularly useful for handling data structures when reading
(implemented by a conversion operator) is trivial, while assignment and initialization are distinctly
less trivial.
The istream and ostream types rely on a conversion function to enable statements such as
    while (cin>>x) cout<<x;
The input operation cin>>x returns an istream&. That value is implicitly converted to a value
indicating the state of cin. This value can then be tested by the while (see §21.3.3). However, it is
typically not a good idea to define an implicit conversion from one type to another in such a way that
information is lost in the conversion.
In general, it is wise to be sparing in the introduction of conversion operators. When used in
excess, they lead to ambiguities. Such ambiguities are caught by the compiler, but they can be a
nuisance to resolve. Probably the best idea is initially to do conversions by named functions, such
as X::make_int(). If such a function becomes popular enough to make explicit use inelegant, it
can be replaced by a conversion operator X::operator int().
If both user-defined conversions and user-defined operators are defined, it is possible to get
ambiguities between the user-defined operators and the built-in operators. For example:
    int operator+(Tiny,Tiny);

    void f(Tiny t, int i)
    {
        t+i;   // error, ambiguous: operator+(t,Tiny(i)) or int(t)+i ?
    }
It is therefore often best to rely on user-defined conversions or user-defined operators for a given
type, but not both.
11.4.1 Ambiguities [over.ambig]
An assignment of a value of type V to an object of class X is legal if there is an assignment operator
X::operator=(Z) so that V is Z or there is a unique conversion of V to Z. Initialization is treated
equivalently.
In some cases, a value of the desired type can be constructed by repeated use of constructors or
conversion operators. This must be handled by explicit conversions; only one level of user-defined
implicit conversion is legal. In some cases, a value of the desired type can be constructed in more
than one way; such cases are illegal. For example:
    class X { /* ... */ X(int); X(char*); };
    class Y { /* ... */ Y(int); };
    class Z { /* ... */ Z(X); };

    X f(X);
    Y f(Y);

    Z g(Z);
    void k1()
    {
        f(1);           // error: ambiguous f(X(1)) or f(Y(1))?
        f(X(1));        // ok
        f(Y(1));        // ok

        g("Mack");      // error: two user-defined conversions needed; g(Z(X("Mack"))) not tried
        g(X("Doc"));    // ok: g(Z(X("Doc")))
        g(Z("Suzy"));   // ok: g(Z(X("Suzy")))
    }
User-defined conversions are considered only if they are necessary to resolve a call. For example:
    class XX { /* ... */ XX(int); };

    void h(double);
    void h(XX);

    void k2()
    {
        h(1);   // h(double(1)) or h(XX(1))? h(double(1))!
    }
The call h(1) means h(double(1)) because that alternative uses only a standard conversion
rather than a user-defined conversion (§7.4).
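The preference for the standard conversion can be checked directly. In this sketch the two versions of h() return distinct int values, an assumption made only so that the chosen overload can be observed; k2_demo() is a hypothetical name.

```cpp
#include <cassert>

class XX {
public:
    XX(int) { }
};

// The return values are an addition for illustration; the text declares both as void.
int h(double) { return 1; }   // reached from an int by a standard conversion
int h(XX)     { return 2; }   // reached from an int by a user-defined conversion

void k2_demo()
{
    assert(h(1)==1);       // the standard conversion int -> double is preferred
    assert(h(XX(1))==2);   // an explicit conversion selects the other overload
}
```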
The rules for conversion are neither the simplest to implement, the simplest to document, nor
the most general that could be devised. They are, however, considerably safer, and the resulting
resolutions are less surprising. It is far easier to manually resolve an ambiguity than to find an error
caused by an unsuspected conversion.
The insistence on strict bottom-up analysis implies that the return type is not used in overloading resolution. For example:
    class Quad {
    public:
        Quad(double);
        // ...
    };

    Quad operator+(Quad,Quad);

    void f(double a1, double a2)
    {
        Quad r1 = a1+a2;         // double-precision add
        Quad r2 = Quad(a1)+a2;   // force quad arithmetic
    }
The reason for this design choice is partly that strict bottom-up analysis is more comprehensible
and partly that it is not considered the compiler’s job to decide which precision the programmer
might want for the addition.
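A small experiment makes the bottom-up rule visible. This sketch assumes a stand-in Quad that stores a double plus a flag recording whether the value was produced by Quad addition; the flag, the two-argument constructor, and quad_demo() are hypothetical additions.

```cpp
#include <cassert>

class Quad {
    double v;           // stand-in representation; a real Quad would hold more precision
    bool by_quad_add;   // hypothetical flag: true if the value came from operator+(Quad,Quad)
public:
    Quad(double d) : v(d), by_quad_add(false) { }
    Quad(double d, bool q) : v(d), by_quad_add(q) { }
    double value() const { return v; }
    bool from_quad_add() const { return by_quad_add; }
};

Quad operator+(Quad a, Quad b)
{
    return Quad(a.value()+b.value(), true);
}

void quad_demo()
{
    double a1 = 1.5, a2 = 2.25;

    Quad r1 = a1+a2;         // double addition; only the result is converted to Quad
    Quad r2 = Quad(a1)+a2;   // Quad addition; a2 is converted to Quad first

    assert(!r1.from_quad_add());
    assert(r2.from_quad_add());
    assert(r1.value()==3.75 && r2.value()==3.75);
}
```

The type of r1 plays no part in choosing the addition; only the operand types of + are considered.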
Once the types of both sides of an initialization or assignment have been determined, both types
are used to resolve the initialization or assignment. For example:
    class Real {
    public:
        operator double();
        operator int();
        // ...
    };

    void g(Real a)
    {
        double d = a;   // d = a.double();
        int i = a;      // i = a.int();

        d = a;          // d = a.double();
        i = a;          // i = a.int();
    }
In these cases, the type analysis is still bottom-up, with only a single operator and its argument
types considered at any one time.
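To see that the target type selects the conversion, the two operators can be given observably different results. In this sketch, operator int() rounds to nearest (valid for non-negative values) rather than truncating; that behavior, the const qualifiers, the constructor, and real_demo() are assumptions made for illustration.

```cpp
#include <cassert>

class Real {
    double v;
public:
    Real(double d) : v(d) { }
    operator double() const { return v; }
    operator int() const { return int(v+0.5); }   // rounds (for v >= 0), unlike double -> int truncation
};

void real_demo()
{
    Real a(2.75);

    double d = a;   // uses Real::operator double()
    int i = a;      // uses Real::operator int()
    assert(d==2.75);
    assert(i==3);   // going through double would have truncated to 2

    d = a;          // assignment is resolved the same way
    i = a;
    assert(d==2.75 && i==3);
}
```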
11.5 Friends [over.friends]
An ordinary member function declaration specifies three logically distinct things:
[1] The function can access the private part of the class declaration, and
[2] the function is in the scope of the class, and
[3] the function must be invoked on an object (has a this pointer).
By declaring a member function static (§10.2.4), we can give it the first two properties only. By
declaring a function a friend, we can give it the first property only.
For example, we could define an operator that multiplies a Matrix by a Vector. Naturally,
Vector and Matrix each hide their representation and provide a complete set of operations for
manipulating objects of their type. However, our multiplication routine cannot be a member of
both. Also, we don’t really want to provide low-level access functions to allow every user to both
read and write the complete representation of both Matrix and Vector. To avoid this, we declare
the operator* a friend of both:
    class Matrix;

    class Vector {
        float v[4];
        // ...
        friend Vector operator*(const Matrix&, const Vector&);
    };

    class Matrix {
        Vector v[4];
        // ...
        friend Vector operator*(const Matrix&, const Vector&);
    };
    Vector operator*(const Matrix& m, const Vector& v)
    {
        Vector r;
        for (int i = 0; i<4; i++) {   // r[i] = m[i] * v;
            r.v[i] = 0;
            for (int j = 0; j<4; j++) r.v[i] += m.v[i].v[j] * v.v[j];
        }
        return r;
    }
A friend declaration can be placed in either the private or the public part of a class declaration; it
does not matter where. Like a member function, a friend function is explicitly declared in the
declaration of the class of which it is a friend. It is therefore as much a part of that interface as is a
member function.
A member function of one class can be the friend of another. For example:
    class List_iterator {
        // ...
        int* next();
    };

    class List {
        friend int* List_iterator::next();
        // ...
    };
It is not unusual for all functions of one class to be friends of another. There is a shorthand for this:
    class List {
        friend class List_iterator;
        // ...
    };
This friend declaration makes all of List_iterator’s member functions friends of List.
Clearly, friend classes should be used only to express closely connected concepts. Often, there
is a choice between making a class a member (a nested class) or a friend (§24.4).
11.5.1 Finding Friends [over.lookup]
Like a member declaration, a friend declaration does not introduce a name into an enclosing scope.
For example:
    class Matrix {
        friend class Xform;
        friend Matrix invert(const Matrix&);
        // ...
    };

    Xform x;                                // error: no Xform in scope
    Matrix (*p)(const Matrix&) = &invert;   // error: no invert() in scope
For large programs and large classes, it is nice that a class doesn’t ‘‘quietly’’ add names to its
enclosing scope. For a template class that can be instantiated in many different contexts (Chapter
13), this is very important.
A friend class must be previously declared in an enclosing scope or defined in the non-class
scope immediately enclosing the class that is declaring it a friend. For example:
    class X { /* ... */ };          // Y's friend

    namespace N {
        class Y {
            friend class X;
            friend class Z;
            friend class AE;
        };
        class Z { /* ... */ };      // Y's friend
    }

    class AE { /* ... */ };         // not a friend of Y
A friend function can be explicitly declared just like friend classes, or it can be found through its
argument types (§8.2.6) as if it was declared in the non-class scope immediately enclosing its class.
For example:
    void f(Matrix& m)
    {
        invert(m);   // Matrix's friend invert()
    }
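As a sketch of lookup through the argument type, the following Matrix holds a single double as a stand-in for a real representation and defines invert() inside the class for brevity. The representation, value(), and invert_demo() are hypothetical; the point is only that invert() is never declared outside the class, yet the call is resolved.

```cpp
#include <cassert>

class Matrix {
    double d;   // stand-in for a real matrix representation
public:
    explicit Matrix(double x) : d(x) { }
    double value() const { return d; }

    // Declared (and here also defined) only inside the class.
    friend Matrix invert(const Matrix& m) { return Matrix(1/m.d); }
};

void invert_demo()
{
    Matrix m(4);
    Matrix r = invert(m);   // found because the argument is a Matrix
    assert(r.value()==0.25);
}
```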
It follows that a friend function should either be explicitly declared in an enclosing scope or take an
argument of its class. If not, the friend cannot be called. For example:
    // no f() here

    void g();                      // X's friend

    class X {
        friend void f();           // useless
        friend void g();
        friend void h(const X&);   // can be found through its argument
    };

    void f() { /* ... */ }         // not a friend of X
11.5.2 Friends and Members [over.friends.members]
When should we use a friend function, and when is a member function the better choice for specifying an operation? First, we try to minimize the number of functions that access the representation
of a class and try to make the set of access functions as appropriate as possible. Therefore, the first
question is not, ‘‘Should it be a member, a static member, or a friend?’’ but rather, ‘‘Does it really
need access?’’ Typically, the set of functions that need access is smaller than we are willing to
believe at first.
Some operations must be members – for example, constructors, destructors, and virtual
functions (§12.2.6) – but typically there is a choice. Because member names are local to the class,
a function should be a member unless there is a specific reason for it to be a nonmember.
Consider a class X offering alternative ways of presenting an operation:
    class X {
        // ...
        X(int);

        int m1();
        int m2() const;

        friend int f1(X&);
        friend int f2(const X&);
        friend int f3(X);
    };
Member functions can be invoked for objects of their class only; no user-defined conversions are
applied. For example:
    void g()
    {
        99.m1();   // error: X(99).m1() not tried
        99.m2();   // error: X(99).m2() not tried
    }
The conversion X(int) is not applied to make an X out of 99.
The global function f1() has a similar property because implicit conversions are not used for
non-const reference arguments (§5.5, §11.3.5). However, conversions may be applied to the
arguments of f2() and f3():
    void h()
    {
        f1(99);   // error: f1(X(99)) not tried
        f2(99);   // ok: f2(X(99));
        f3(99);   // ok: f3(X(99));
    }
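The legal calls can be exercised with a cut-down X. In this sketch the member v, the function bodies, and h_demo() are assumptions added so that the effect of each call can be checked; the erroneous call is left as a comment because it does not compile.

```cpp
#include <cassert>

class X {
    int v;
public:
    X(int i) : v(i) { }
    int m2() const { return v; }

    friend int f1(X&);
    friend int f2(const X&);
    friend int f3(X);
};

int f1(X& x)       { return ++x.v; }   // modifies its argument: needs an lvalue of type X
int f2(const X& x) { return x.v; }
int f3(X x)        { return x.v; }

void h_demo()
{
    assert(f2(99)==99);   // f2(X(99)): a temporary is bound to the const reference
    assert(f3(99)==99);   // f3(X(99)): the parameter is initialized from 99

    X x = 1;
    assert(f1(x)==2);     // ok: x is an lvalue
    assert(x.m2()==2);    // the modification is visible to the caller
    // f1(99);            // error: X(99) is not tried for a non-const reference
}
```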
An operation modifying the state of a class object should therefore be a member or a global function
taking a non-const reference argument (or a non-const pointer argument). Operators that
require lvalue operands for the fundamental types (=, *=, ++, etc.) are most naturally defined as
members for user-defined types.
Conversely, if implicit type conversion is desired for all operands of an operation, the function
implementing it must be a nonmember function taking a const reference argument or a nonreference
argument. This is often the case for the functions implementing operators that do not
require lvalue operands when applied to fundamental types (+, -, ||, etc.). Such operators often
need access to the representations of their operand class. Consequently, binary operators are the
most common source of friend functions.
If no type conversions are defined, there appears to be no compelling reason to choose a member over a friend taking a reference argument, or vice versa. In some cases, the programmer may
have a preference for one call syntax over another. For example, most people seem to prefer the
notation inv(m) for inverting a Matrix m to the alternative m.inv(). Naturally, if inv() really
does invert m itself, rather than return a new Matrix that is the inverse of m, it should be a member.
All other things considered equal, choose a member. It is not possible to know if someone
someday will define a conversion operator. It is not always possible to predict if a future change
may require changes to the state of the object involved. The member function call syntax makes it
clear to the user that the object may be modified; a reference argument is far less obvious. Furthermore, expressions in the body of a member can be noticeably shorter than the equivalent expressions in a global function; a nonmember function must use an explicit argument, whereas the member can use this implicitly. Also, because member names are local to the class they tend to be
shorter than the names of nonmember functions.
11.6 Large Objects [over.large]
We defined the complex operators to take arguments of type complex. This implies that for each
use of a complex operator, each operand is copied. The overhead of copying two doubles can be
noticeable but is often less than what a pair of pointers imposes. Unfortunately, not all classes have a
conveniently small representation. To avoid excessive copying, one can declare functions to take
reference arguments. For example:
    class Matrix {
        double m[4][4];
    public:
        Matrix();
        friend Matrix operator+(const Matrix&, const Matrix&);
        friend Matrix operator*(const Matrix&, const Matrix&);
    };
References allow the use of expressions involving the usual arithmetic operators for large objects
without excessive copying. Pointers cannot be used because it is not possible to redefine the meaning of an operator applied to a pointer. Addition could be defined like this:
    Matrix operator+(const Matrix& arg1, const Matrix& arg2)
    {
        Matrix sum;
        for (int i=0; i<4; i++)
            for (int j=0; j<4; j++)
                sum.m[i][j] = arg1.m[i][j] + arg2.m[i][j];
        return sum;
    }
This operator+() accesses the operands of + through references but returns an object value.
Returning a reference would appear to be more efficient:
    class Matrix {
        // ...
        friend Matrix& operator+(const Matrix&, const Matrix&);
        friend Matrix& operator*(const Matrix&, const Matrix&);
    };
This is legal, but it causes a memory allocation problem. Because a reference to the result will be
passed out of the function as a reference to the return value, the return value cannot be an automatic
variable (§7.3). Since an operator is often used more than once in an expression, the result cannot
be a static local variable. The result would typically be allocated on the free store. Copying the
return value is often cheaper (in execution time, code space, and data space) than allocating and
(eventually) deallocating an object on the free store. It is also much simpler to program.
There are techniques you can use to avoid copying the result. The simplest is to use a buffer of
static objects. For example:
    const int max_matrix_temp = 7;

    Matrix& get_matrix_temp()
    {
        static int nbuf = 0;
        static Matrix buf[max_matrix_temp];

        if (nbuf == max_matrix_temp) nbuf = 0;
        return buf[nbuf++];
    }

    Matrix& operator+(const Matrix& arg1, const Matrix& arg2)
    {
        Matrix& res = get_matrix_temp();
        // ...
        return res;
    }
Now a Matrix is copied only when the result of an expression is assigned. However, heaven help
you if you write an expression that involves more than max_matrix_temp temporaries!
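The hazard is easy to reproduce. This sketch replaces the 4-by-4 array with a single double (an assumption made to keep the example short) and keeps a reference to one result while producing max_matrix_temp further results; temp_buffer_demo() is a hypothetical name.

```cpp
#include <cassert>

const int max_matrix_temp = 7;

struct Matrix {
    double v;   // one element stands in for the 4-by-4 array
    Matrix(double x = 0) : v(x) { }
};

Matrix& get_matrix_temp()
{
    static int nbuf = 0;
    static Matrix buf[max_matrix_temp];

    if (nbuf == max_matrix_temp) nbuf = 0;   // wrap around: old results get reused
    return buf[nbuf++];
}

Matrix& operator+(const Matrix& arg1, const Matrix& arg2)
{
    Matrix& res = get_matrix_temp();
    res.v = arg1.v + arg2.v;
    return res;
}

void temp_buffer_demo()
{
    Matrix a(1), b(2);

    Matrix& first = a+b;   // refers to a slot in the static buffer
    assert(first.v==3);

    for (int i=0; i<max_matrix_temp; i++) a+a;   // seven more results: the buffer wraps around

    assert(first.v==2);    // the slot has been silently reused for a+a
}
```

Nothing signals the overwrite; the value referred to by first simply changes.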
A less error-prone technique involves defining the matrix type as a handle (§25.7) to a representation type that really holds the data. In that way, the matrix handles can manage the representation
objects in such a way that allocation and copying are minimized (see §11.12 and §11.14[18]).
However, that strategy relies on operators returning objects rather than references or pointers.
Another technique is to define ternary operations and have them automatically invoked for expressions such as a=b+c and a+b*i (§21.4.6.3, §22.4.7).
11.7 Essential Operators [over.essential]
In general, for a type X, the copy constructor X(const X&) takes care of initialization by an object
of the same type X. It cannot be overemphasized that assignment and initialization are different
operations (§10.4.4.1). This is especially important when a destructor is declared. If a class X has
a destructor that performs a nontrivial task, such as free-store deallocation, the class is likely to
need the full complement of functions that control construction, destruction, and copying:
    class X {
        // ...
        X(Sometype);              // constructor: create objects
        X(const X&);              // copy constructor
        X& operator=(const X&);   // copy assignment: cleanup and copy
        ~X();                     // destructor: cleanup
    };
There are three more cases in which an object is copied: as a function argument, as a function
return value, and as an exception. When an argument is passed, a hitherto uninitialized variable –
the formal parameter – is initialized. The semantics are identical to those of other initializations.
The same is the case for function return values and exceptions, although that is less obvious. In
such cases, the copy constructor will be applied. For example:
    string g(string arg)
    {
        return arg;
    }

    int main ()
    {
        string s = "Newton";
        s = g(s);
    }
Clearly, the value of s ought to be "Newton" after the call of g(). Getting a copy of the value of s
into the argument arg is not difficult; a call of string’s copy constructor does that. Getting a copy
of that value out of g() takes another call of string(const string&); this time, the variable
initialized is a temporary one, which is then assigned to s. Often one, but not both, of these copy
operations can be optimized away. Such temporary variables are, of course, destroyed properly using
string::~string() (see §10.4.10).
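The copies can be counted with a small instrumented class. This is a sketch under stated assumptions: Counted is a hypothetical stand-in for string that wraps std::string and increments a static counter in its copy constructor, and copy_demo() is a hypothetical name. Because a compiler may or may not eliminate one of the copies, the check accepts either one or two.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for string that counts copy-constructor calls.
class Counted {
    std::string s;
public:
    static int copies;

    Counted(const char* p) : s(p) { }
    Counted(const Counted& c) : s(c.s) { ++copies; }
    Counted& operator=(const Counted& c) { s = c.s; return *this; }

    const std::string& str() const { return s; }
};

int Counted::copies = 0;

Counted g(Counted arg)
{
    return arg;
}

void copy_demo()
{
    Counted s = "Newton";
    Counted::copies = 0;   // count only the copies caused by the call below

    s = g(s);              // copy into arg, copy out into a temporary, then assignment

    assert(s.str()=="Newton");
    assert(Counted::copies>=1 && Counted::copies<=2);
}
```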
For a class X for which the assignment operator X::operator=(const X&) and the copy constructor
X::X(const X&) are not explicitly declared by the programmer, the missing operation or
operations will be generated by the compiler (§10.2.5).
11.7.1 Explicit Constructors [over.explicit]
By default, a single argument constructor also defines an implicit conversion. For some types, that
is ideal. For example:
    complex z = 2;   // initialize z with complex(2)
In other cases, the implicit conversion is undesirable and error-prone. For example:
    string s = 'a';   // make s a string with int('a') elements
It is quite unlikely that this was what the person defining s meant.
Implicit conversion can be suppressed by declaring a constructor explicit. That is, an explicit
constructor will be invoked only explicitly. In particular, where a copy constructor is in principle
needed (§11.3.4), an explicit constructor will not be implicitly invoked. For example:
    class String {
        // ...
        explicit String(int n);   // preallocate n bytes
        String(const char* p);    // initial value is the C-style string p
    };

    String s1 = 'a';          // error: no implicit char->String conversion
    String s2(10);            // ok: String with space for 10 characters
    String s3 = String(10);   // ok: String with space for 10 characters
    String s4 = "Brian";      // ok: s4 = String("Brian")
    String s5("Fawlty");

    void f(String);

    String g()
    {
        f(10);                // error: no implicit int->String conversion
        f(String(10));
        f("Arthur");          // ok: f(String("Arthur"))
        f(s1);

        String* p1 = new String("Eric");
        String* p2 = new String(10);

        return 10;            // error: no implicit int->String conversion
    }
The distinction between
    String s1 = 'a';   // error: no implicit char->String conversion
and
    String s2(10);     // ok: string with space for 10 characters
may seem subtle, but it is less so in real code than in contrived examples.
In Date, we used a plain int to represent a year (§10.3). Had Date been critical in our design,
we might have introduced a Year type to allow stronger compile-time checking. For example:
    class Year {
        int y;
    public:
        explicit Year(int i) : y(i) { }      // construct Year from int
        operator int() const { return y; }   // conversion: Year to int
    };

    class Date {
    public:
        Date(int d, Month m, Year y);
        // ...
    };

    Date d3(1978,feb,21);         // error: 21 is not a Year
    Date d4(21,feb,Year(1978));   // ok
The Year class is a simple "wrapper" around an int. Thanks to the operator int(), a Year is
implicitly converted into an int wherever needed. By declaring the constructor explicit, we make
sure that the int to Year conversion happens only when we ask for it and that "accidental"
assignments are caught at compile time. Because Year's member functions are easily inlined, no
run-time or space costs are added.
A similar technique can be used to define range types (§25.6.1).
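The effect of explicit on a wrapper type can be tried out in a few lines. The following is a minimal sketch, not the Date class of §10.3: the helper days_in_year and the leap-year rule used to exercise it are assumptions introduced only for illustration.

```cpp
#include <cassert>

// A minimal sketch of the Year wrapper described above.
class Year {
    int y;
public:
    explicit Year(int i) : y(i) { }        // int -> Year only on request
    operator int() const { return y; }     // Year -> int wherever needed
};

// Hypothetical helper, not from the text: accepts only a Year.
int days_in_year(Year y)
{
    int n = y;    // implicit Year -> int through operator int()
    bool leap = (n%4 == 0 && n%100 != 0) || n%400 == 0;
    return leap ? 366 : 365;
}

// days_in_year(1978);    // would not compile: no implicit int -> Year
```

A call must name the conversion, as in days_in_year(Year(1978)); a bare integer argument is rejected at compile time, which is the point of the technique.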
11.8 Subscripting [over.subscript]
An operator[] function can be used to give subscripts a meaning for class objects. The second
argument (the subscript) of an operator[] function may be of any type. This makes it possible to
define vectors, associative arrays, etc.
As an example, let us recode the example from §5.5 in which an associative array is used to
write a small program for counting the number of occurrences of words in a file. There, a function
is used. Here, an associative array type is defined:
    class Assoc {
        struct Pair {
            string name;
            double val;
            Pair(string n ="", double v =0) :name(n), val(v) { }
        };
        vector<Pair> vec;

        Assoc(const Assoc&);               // private to prevent copying
        Assoc& operator=(const Assoc&);    // private to prevent copying
    public:
        Assoc() {}
        double& operator[](const string&);
        void print_all() const;
    };
An Assoc keeps a vector of Pairs. The implementation uses the same trivial and inefficient search
method as in §5.5:
    double& Assoc::operator[](const string& s)
        // search for s; return its value if found; otherwise, make a new Pair and return the default value 0
    {
        for (vector<Pair>::iterator p = vec.begin(); p!=vec.end(); ++p)
            if (s == p->name) return p->val;

        vec.push_back(Pair(s,0));    // initial value: 0

        return vec.back().val;       // return last element (§16.3.3)
    }
Because the representation of an Assoc is hidden, we need a way of printing it:
    void Assoc::print_all() const
    {
        for (vector<Pair>::const_iterator p = vec.begin(); p!=vec.end(); ++p)
            cout << p->name << ": " << p->val << '\n';
    }
Finally, we can write the trivial main program:
    int main()    // count the occurrences of each word on input
    {
        string buf;
        Assoc vec;
        while (cin>>buf) vec[buf]++;
        vec.print_all();
    }
A further development of the idea of an associative array can be found in §17.4.1.
An operator[]() must be a member function.
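For comparison, the standard library map offers the same style of subscripting: its operator[] returns a reference to the value stored for a key and inserts a value-initialized entry when the key is absent. The sketch below is an illustration of that behavior, not a replacement for the Assoc class above; the function name count_words and the use of a string stream in place of cin are assumptions.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: count the occurrences of each word in 'text'.
std::map<std::string,int> count_words(const std::string& text)
{
    std::map<std::string,int> counts;
    std::istringstream in(text);
    std::string word;
    while (in >> word) counts[word]++;    // operator[] inserts 0 for a new key
    return counts;
}
```

Because the subscript operator may insert, it cannot be applied to a const map; that is the same trade-off Assoc::operator[] makes.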
11.9 Function Call [over.call]
Function call, that is, the notation expression(expression-list), can be interpreted as a binary operation with the expression as the left-hand operand and the expression-list as the right-hand operand.
The call operator () can be overloaded in the same way as other operators can. An argument list
for an operator()() is evaluated and checked according to the usual argument-passing rules.
Overloading function call seems to be useful primarily for defining types that have only a single
operation and for types for which one operation is predominant.
The most obvious, and probably also the most important, use of the () operator is to provide
the usual function call syntax for objects that in some way behave like functions. An object that
acts like a function is often called a function-like object or simply a function object (§18.4). Such
function objects are important because they allow us to write code that takes nontrivial operations
as parameters. For example, the standard library provides many algorithms that invoke a function
for each element of a container. Consider:
    void negate(complex& c) { c = -c; }

    void f(vector<complex>& aa, list<complex>& ll)
    {
        for_each(aa.begin(),aa.end(),negate);    // negate all vector elements
        for_each(ll.begin(),ll.end(),negate);    // negate all list elements
    }
This negates every element in the vector and the list.
What if we wanted to add complex(2,3) to every element? That is easily done like this:
    void add23(complex& c)
    {
        c += complex(2,3);
    }

    void g(vector<complex>& aa, list<complex>& ll)
    {
        for_each(aa.begin(),aa.end(),add23);
        for_each(ll.begin(),ll.end(),add23);
    }
How would we write a function to repeatedly add an arbitrary complex value? We need something
to which we can pass that arbitrary value and which can then use that value each time it is called.
That does not come naturally for functions. Typically, we end up ‘‘passing’’ the arbitrary value by
leaving it in the function’s surrounding context. That’s messy. However, we can write a class that
behaves in the desired way:
    class Add {
        complex val;
    public:
        Add(complex c) { val = c; }                        // save value
        Add(double r, double i) { val = complex(r,i); }
        void operator()(complex& c) const { c += val; }    // add value to argument
    };
An object of class Add is initialized with a complex number, and when invoked using (), it adds
that number to its argument. For example:
    void h(vector<complex>& aa, list<complex>& ll, complex z)
    {
        for_each(aa.begin(),aa.end(),Add(2,3));
        for_each(ll.begin(),ll.end(),Add(z));
    }
This will add complex(2,3) to every element of the vector and z to every element on the list. Note
that Add(z) constructs an object that is used repeatedly by for_each(). It is not simply a function
that is called once or even called repeatedly. The function that is called repeatedly is Add(z)'s
operator()().
This all works because for_each is a template that applies () to its third argument without caring
exactly what that third argument really is:
    template<class Iter, class Fct> Fct for_each(Iter b, Iter e, Fct f)
    {
        while (b != e) f(*b++);
        return f;
    }
At first glance, this technique may look esoteric, but it is simple, efficient, and extremely useful
(see §3.8.5, §18.4).
Other popular uses of operator()() are as a substring operator and as a subscripting operator
for multidimensional arrays (§22.4.5).
An operator()() must be a member function.
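The function-object technique of this section can be exercised with the standard library as it is. The sketch below assumes that the complex type of the text can be modeled by std::complex<double>; the typedef Cplx and the helper add_to_all are names introduced for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <list>
#include <vector>

typedef std::complex<double> Cplx;    // assumption: stands in for the text's complex

class Add {
    Cplx val;    // the value carried from construction to each call
public:
    Add(Cplx c) : val(c) { }
    Add(double r, double i) : val(r,i) { }
    void operator()(Cplx& c) const { c += val; }    // add value to argument
};

// Hypothetical helper: add z to every element of any sequence container.
template<class Seq> void add_to_all(Seq& s, Cplx z)
{
    std::for_each(s.begin(), s.end(), Add(z));
}
```

The same Add object type serves a vector and a list alike because for_each only requires that its third argument can be called with an element.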
11.10 Dereferencing [over.deref]
The dereferencing operator -> can be defined as a unary postfix operator. That is, given a class
    class Ptr {
        // ...
        X* operator->();
    };
objects of class Ptr can be used to access members of class X in a very similar manner to the way
pointers are used. For example:
    void f(Ptr p)
    {
        p->m = 7;    // (p.operator->())->m = 7
    }
The transformation of the object p into the pointer p.operator->() does not depend on the member m
pointed to. That is the sense in which operator->() is a unary postfix operator. However,
there is no new syntax introduced, so a member name is still required after the ->. For example:
    void g(Ptr p)
    {
        X* q1 = p->;               // syntax error
        X* q2 = p.operator->();    // ok
    }
Overloading -> is primarily useful for creating "smart pointers," that is, objects that act like
pointers and in addition perform some action whenever an object is accessed through them. For
example, one could define a class Rec_ptr for accessing objects of class Rec stored on disk. Rec_ptr's
constructor takes a name that can be used to find the object on disk, Rec_ptr::operator->()
brings the object into main memory when accessed through its Rec_ptr, and Rec_ptr's destructor
eventually writes the updated object back out to disk:
    class Rec_ptr {
        Rec* in_core_address;
        const char* identifier;
        // ...
    public:
        Rec_ptr(const char* p) : identifier(p), in_core_address(0) { }
        ~Rec_ptr() { write_to_disk(in_core_address,identifier); }
        Rec* operator->();
    };
    Rec* Rec_ptr::operator->()
    {
        if (in_core_address == 0) in_core_address = read_from_disk(identifier);
        return in_core_address;
    }
Rec_ptr might be used like this:
    struct Rec {    // the Rec that a Rec_ptr points to
        string name;
        // ...
    };

    void update(const char* s)
    {
        Rec_ptr p(s);          // get Rec_ptr for s

        p->name = "Roscoe";    // update s; if necessary, first retrieve from disk
        // ...
    }
Naturally, a real Rec_ptr would be a template so that the Rec type is a parameter. Also, a realistic
program would contain error-handling code and use a less naive way of interacting with the disk.
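One way such a template might look is sketched below. Everything beyond the idea of loading on first use through operator->() is an assumption made for illustration: the names Lazy_ptr and Fake_disk, the Loader policy parameter with its static read and write functions, and the decision to write back only if the object was actually loaded. Fake_disk merely counts calls; it does not touch a file system.

```cpp
#include <cassert>
#include <string>

// Sketch of a Rec_ptr-like template. Loader is a hypothetical policy type
// providing static T* read(const std::string&) and
// static void write(T*, const std::string&).
template<class T, class Loader>
class Lazy_ptr {
    T* in_core;
    std::string id;
    Lazy_ptr(const Lazy_ptr&);               // not copyable in this sketch
    Lazy_ptr& operator=(const Lazy_ptr&);
public:
    explicit Lazy_ptr(const std::string& s) : in_core(0), id(s) { }
    ~Lazy_ptr() { if (in_core) Loader::write(in_core, id); }
    T* operator->()
    {
        if (in_core == 0) in_core = Loader::read(id);    // load on first access
        return in_core;
    }
};

struct Rec { std::string name; };

// Stand-in for disk I/O: counts calls instead of reading or writing files.
struct Fake_disk {
    static int reads, writes;
    static Rec* read(const std::string&) { ++reads; return new Rec(); }
    static void write(Rec* p, const std::string&) { ++writes; delete p; }
};
int Fake_disk::reads = 0;
int Fake_disk::writes = 0;
```

Repeated uses of -> on the same Lazy_ptr trigger a single read, and the destructor performs a single write.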
For ordinary pointers, use of -> is synonymous with some uses of unary * and []. Given
    Y* p;
it holds that
    p->m == (*p).m == p[0].m
As usual, no such guarantee is provided for user-defined operators. The equivalence can be provided where desired:
    class Ptr_to_Y {
        Y* p;
    public:
        Y* operator->() { return p; }
        Y& operator*() { return *p; }
        Y& operator[](int i) { return p[i]; }
    };
If you provide more than one of these operators, it might be wise to provide the equivalence, just as
it is wise to ensure that ++x and x+=1 have the same effect as x=x+1 for a simple variable x of
some class if ++, +=, =, and + are provided.
The overloading of -> is important to a class of interesting programs and not just a minor
curiosity. The reason is that indirection is a key concept and that overloading -> provides a clean,
direct, and efficient way of representing indirection in a program. Iterators (Chapter 19) provide an
important example of this. Another way of looking at operator -> is to consider it as a way of providing C++ with a limited, but useful, form of delegation (§24.2.4).
Operator -> must be a member function. If used, its return type must be a pointer or an object
of a class to which you can apply ->. When declared for a template class, operator->() is
frequently unused, so it makes sense to postpone checking the constraint on the return type until
actual use.
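The rule that operator->() may return an object of a class to which -> can be applied means that the compiler keeps applying -> until it reaches a built-in pointer. A small sketch, with the hypothetical class names Inner and Outer and a use counter added only to make the effect observable:

```cpp
#include <cassert>

struct X { int m; };

class Inner {
    X* p;
public:
    explicit Inner(X* q) : p(q) { }
    X* operator->() const { return p; }    // yields a built-in pointer
};

class Outer {
    Inner in;
    int uses;
public:
    explicit Outer(X* q) : in(q), uses(0) { }
    Inner operator->() { ++uses; return in; }    // yields a class object, not a pointer
    int count() const { return uses; }
};
```

Given an Outer o, the expression o->m is resolved as o.operator->().operator->()->m.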
11.11 Increment and Decrement [over.incr]
Once people invent ‘‘smart pointers,’’ they often decide to provide the increment operator ++ and
the decrement operator -- to mirror these operators’ use for built-in types. This is especially obvious and necessary where the aim is to replace an ordinary pointer type with a ‘‘smart pointer’’ type
that has the same semantics, except that it adds a bit of run-time error checking. For example, consider a troublesome traditional program:
    void f1(T a)    // traditional use
    {
        T v[200];
        T* p = &v[0];
        p--;
        *p = a;    // Oops: 'p' out of range, uncaught
        ++p;
        *p = a;    // ok
    }
We might want to replace the pointer p with an object of a class Ptr_to_T that can be dereferenced
only provided it actually points to an object. We would also like to ensure that p can be
incremented and decremented, only provided it points to an object within an array and the increment and
decrement operations yield an object within the array. That is, we would like something like this:
    class Ptr_to_T {
        // ...
    };

    void f2(T a)    // checked
    {
        T v[200];
        Ptr_to_T p(&v[0],v,200);
        p--;
        *p = a;    // run-time error: 'p' out of range
        ++p;
        *p = a;    // ok
    }
The increment and decrement operators are unique among C++ operators in that they can be used as
both prefix and postfix operators. Consequently, we must define prefix and postfix increment and
decrement for Ptr_to_T. For example:
    class Ptr_to_T {
        T* p;
        T* array;
        int size;
    public:
        Ptr_to_T(T* p, T* v, int s);    // bind to array v of size s, initial value p
        Ptr_to_T(T* p);                 // bind to single object, initial value p

        Ptr_to_T& operator++();         // prefix
        Ptr_to_T operator++(int);       // postfix

        Ptr_to_T& operator--();         // prefix
        Ptr_to_T operator--(int);       // postfix

        T& operator*();                 // prefix
    };
The int argument is used to indicate that the function is to be invoked for postfix application of ++.
This int is never used; the argument is simply a dummy used to distinguish between prefix and
postfix application. The way to remember which version of an operator++ is prefix is to note that
the version without the dummy argument is prefix, exactly like all the other unary arithmetic and
logical operators. The dummy argument is used only for the "odd" postfix ++ and --.
Using Ptr_to_T, the example is equivalent to:
    void f3(T a)    // checked
    {
        T v[200];
        Ptr_to_T p(&v[0],v,200);
        p.operator--(0);
        p.operator*() = a;    // run-time error: 'p' out of range
        p.operator++();
        p.operator*() = a;    // ok
    }
Completing class Ptr_to_T is left as an exercise (§11.14[19]). Its elaboration into a template using
exceptions to report the run-time errors is another exercise (§14.12[2]). An example of operators
++ and -- for iteration can be found in §19.3. A pointer template that behaves correctly with
respect to inheritance is presented in §13.6.3.
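As a rough idea of where those exercises lead, here is one possible sketch of a checked pointer, not a model solution. The name Checked_ptr, the choice of std::out_of_range as the exception, and the decision to store an index rather than a raw pointer (so that stepping outside the array never forms an invalid pointer) are all assumptions of this sketch.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical checked pointer: range errors are reported on dereference only.
template<class T>
class Checked_ptr {
    T* array;
    int size;
    int pos;    // index of the current element; may be out of range
public:
    Checked_ptr(T* p, T* v, int s) : array(v), size(s), pos(int(p-v)) { }

    Checked_ptr& operator++() { ++pos; return *this; }                         // prefix
    Checked_ptr operator++(int) { Checked_ptr t = *this; ++pos; return t; }    // postfix
    Checked_ptr& operator--() { --pos; return *this; }                         // prefix
    Checked_ptr operator--(int) { Checked_ptr t = *this; --pos; return t; }    // postfix

    T& operator*() const
    {
        if (pos < 0 || size <= pos) throw std::out_of_range("Checked_ptr: out of range");
        return array[pos];
    }
};
```

The prefix forms return a reference to the updated object, whereas the postfix forms return a copy of the old value; that difference is why the two must be declared separately.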
11.12 A String Class [over.string]
Here is a more realistic version of class String. I designed it as the minimal string that served my
needs. This string provides value semantics, character read and write operations, checked and
unchecked access, stream I/O, literal strings as literals, and equality and concatenation operators. It
represents strings as C-style, zero-terminated arrays of characters and uses reference counts to
minimize copying. Writing a better string class and/or one that provides more facilities is a good
exercise (§11.14[7-12]). That done, we can throw away our exercises and use the standard library
string (Chapter 20).
My almost-real String employs three auxiliary classes: Srep, to allow an actual representation
to be shared between several Strings with the same value; Range, to be thrown in case of range
errors; and Cref, to help implement a subscript operator that distinguishes between reading and
writing:
    class String {
        struct Srep;        // representation
        Srep *rep;
    public:
        class Cref;         // reference to char
        class Range { };    // for exceptions
        // ...
    };
Like other members, a member class (often called a nested class) can be declared in the class itself
and defined later:
    struct String::Srep {
        char* s;    // pointer to elements
        int sz;     // number of characters
        int n;      // reference count

        Srep(int nsz, const char* p)
        {
            n = 1;
            sz = nsz;
            s = new char[sz+1];    // add space for terminator
            strcpy(s,p);
        }

        ~Srep() { delete[] s; }

        Srep* get_own_copy()    // clone if necessary
        {
            if (n==1) return this;
            n--;
            return new Srep(sz,s);
        }

        void assign(int nsz, const char* p)
        {
            if (sz != nsz) {
                delete[] s;
                sz = nsz;
                s = new char[sz+1];
            }
            strcpy(s,p);
        }
    private:    // prevent copying:
        Srep(const Srep&);
        Srep& operator=(const Srep&);
    };
Class String provides the usual set of constructors, destructor, and assignment operations:
    class String {
        // ...
        String();                 // x = ""
        String(const char*);      // x = "abc"
        String(const String&);    // x = other_string
        String& operator=(const char *);
        String& operator=(const String&);
        ~String();
        // ...
    };
This String has value semantics. That is, after an assignment s1=s2, the two strings s1 and s2 are
fully distinct and subsequent changes to the one have no effect on the other. The alternative would
be to give String pointer semantics. That would be to let changes to s2 after s1=s2 also affect the
value of s1. For types with conventional arithmetic operations, such as complex, vector, matrix,
and string, I prefer value semantics. However, for the value semantics to be affordable, a String is
implemented as a handle to its representation and the representation is copied only when necessary:
    String::String()    // the empty string is the default value
    {
        rep = new Srep(0,"");
    }

    String::String(const String& x)    // copy constructor
    {
        x.rep->n++;
        rep = x.rep;    // share representation
    }

    String::~String()
    {
        if (--rep->n == 0) delete rep;
    }

    String& String::operator=(const String& x)    // copy assignment
    {
        x.rep->n++;    // protects against "st = st"
        if (--rep->n == 0) delete rep;
        rep = x.rep;    // share representation
        return *this;
    }
Pseudo-copy operations taking const char* arguments are provided to allow string literals:
    String::String(const char* s)
    {
        rep = new Srep(strlen(s),s);
    }

    String& String::operator=(const char* s)
    {
        if (rep->n == 1)    // recycle Srep
            rep->assign(strlen(s),s);
        else {              // use new Srep
            rep->n--;
            rep = new Srep(strlen(s),s);
        }
        return *this;
    }
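The sharing scheme can be observed in isolation with a much smaller class. The sketch below swaps in std::shared_ptr (C++11) for the hand-written reference count of Srep, so it illustrates the idea rather than the implementation given here; the class name Cow_text and the member shares_with are hypothetical, and the sketch assumes single-threaded use.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical handle class: copies share a representation until one writes.
class Cow_text {
    std::shared_ptr<std::string> rep;
public:
    explicit Cow_text(const char* s) : rep(std::make_shared<std::string>(s)) { }
    // The compiler-generated copy operations share rep and adjust the count.

    char read(int i) const { return (*rep)[i]; }
    void write(int i, char c)
    {
        if (rep.use_count() > 1)                         // shared: clone before writing
            rep = std::make_shared<std::string>(*rep);
        (*rep)[i] = c;
    }
    bool shares_with(const Cow_text& x) const { return rep == x.rep; }
};
```

After b = a the two handles share one representation; the first write through either of them gives that handle its own copy and leaves the other unchanged, which is what value semantics requires.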
The design of access operators for a string is a difficult topic because ideally access is by conventional notation (that is, using []), maximally efficient, and range checked. Unfortunately, you cannot have all of these properties simultaneously. My choice here has been to provide efficient
unchecked operations with a slightly inconvenient notation plus slightly less efficient checked operators with the conventional notation:
    class String {
        // ...
        void check(int i) const { if (i<0 || rep->sz<=i) throw Range(); }

        char read(int i) const { return rep->s[i]; }
        void write(int i, char c) { rep=rep->get_own_copy(); rep->s[i]=c; }

        Cref operator[](int i) { check(i); return Cref(*this,i); }
        char operator[](int i) const { check(i); return rep->s[i]; }

        int size() const { return rep->sz; }
        // ...
    };
The idea is to use [] to get checked access for ordinary use, but to allow the user to optimize by
checking the range once for a set of accesses. For example:
    int hash(const String& s)
    {
        int h = s.read(0);
        const int max = s.size();
        for (int i = 1; i<max; i++) h ^= s.read(i)>>1;    // unchecked access to s
        return h;
    }
Defining an operator, such as [], to be used for both reading and writing is difficult where it is not
acceptable simply to return a reference and let the user decide what to do with it. Here, that is not a
reasonable alternative because I have defined SSttrriinngg so that the representation is shared between
Strings that have been assigned, passed as value arguments, etc., until someone actually writes to a
String. Then, and only then, is the representation copied. This technique is usually called
copy-on-write. The actual copy is done by String::Srep::get_own_copy().
To get these access functions inlined, their definitions must be placed so that the definition of
Srep is in scope. This implies that either Srep is defined within String or the access functions are
defined inline outside String and after String::Srep (§11.14[2]).
To distinguish between a read and a write, String::operator[]() returns a Cref when called
for a non-const object. A Cref behaves like a char&, except that it calls
String::Srep::get_own_copy() when written to:
    class String::Cref {    // reference to s[i]
    friend class String;
        String& s;
        int i;
        Cref(String& ss, int ii) : s(ss), i(ii) { }
    public:
        operator char() { return s.read(i); }       // yield value
        void operator=(char c) { s.write(i,c); }    // change value
    };
For example:
    void f(String s, const String& r)
    {
        int c1 = s[1];    // c1 = s.operator[](1).operator char()
        s[1] = 'c';       // s.operator[](1).operator=('c')

        int c2 = r[1];    // c2 = r.operator[](1)
        r[1] = 'd';       // error: assignment to char, r.operator[](1) = 'd'
    }
Note that for a non-const object s.operator[](1) is Cref(s,1).
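The way a proxy object separates reads from writes can be made visible by counting them. The following sketch is independent of String: the class Tracked_text, its nested Ref, and the two counters are hypothetical names introduced only to show which operator each use of [] ends up calling.

```cpp
#include <cassert>
#include <string>

// Hypothetical class whose operator[] returns a proxy instead of a char&.
class Tracked_text {
    std::string s;
    int reads, writes;
public:
    class Ref {    // stands in for a reference to s[i]
        friend class Tracked_text;
        Tracked_text& t;
        int i;
        Ref(Tracked_text& tt, int ii) : t(tt), i(ii) { }
    public:
        operator char() const { ++t.reads; return t.s[i]; }    // use as a value: a read
        void operator=(char c) { ++t.writes; t.s[i] = c; }     // assignment: a write
    };

    explicit Tracked_text(const std::string& x) : s(x), reads(0), writes(0) { }
    Ref operator[](int i) { return Ref(*this, i); }
    int read_count() const { return reads; }
    int write_count() const { return writes; }
};
```

Using t[1] where a char is expected invokes the conversion operator, whereas t[1] = 'x' invokes Ref::operator=; a class such as String can hook its copy-on-write step into the latter only.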
To complete class String, I provide a set of useful functions:
    class String {
        // ...
        String& operator+=(const String&);
        String& operator+=(const char*);

        friend ostream& operator<<(ostream&, const String&);
        friend istream& operator>>(istream&, String&);

        friend bool operator==(const String& x, const char* s)
            { return strcmp(x.rep->s, s) == 0; }

        friend bool operator==(const String& x, const String& y)
            { return strcmp(x.rep->s, y.rep->s) == 0; }

        friend bool operator!=(const String& x, const char* s)
            { return strcmp(x.rep->s, s) != 0; }

        friend bool operator!=(const String& x, const String& y)
            { return strcmp(x.rep->s, y.rep->s) != 0; }
    };

    String operator+(const String&, const String&);
    String operator+(const String&, const char*);
To save space, I have left the I/O and concatenation operations as exercises.
The main program simply exercises the String operators a bit:
    String f(String a, String b)
    {
        a[2] = 'x';
        char c = b[3];
        cout << "in f: " << a << ' ' << b << ' ' << c << '\n';
        return b;
    }

    int main()
    {
        String x, y;
        cout << "Please enter two strings\n";
        cin >> x >> y;
        cout << "input: " << x << ' ' << y << '\n';
        String z = x;
        y = f(x,y);
        if (x != z) cout << "x corrupted!\n";
        x[0] = '!';
        if (x == z) cout << "write failed!\n";
        cout << "exit: " << x << ' ' << y << ' ' << z << '\n';
    }
This String lacks many features that you might consider important or even essential. For example,
it offers no operation of producing a C-string representation of its value (§11.14[10], Chapter 20).
11.13 Advice [class.advice]
[1] Define operators primarily to mimic conventional usage; §11.1.
[2] For large operands, use const reference argument types; §11.6.
[3] For large results, consider optimizing the return; §11.6.
[4] Prefer the default copy operations if appropriate for a class; §11.3.4.
[5] Redefine or prohibit copying if the default is not appropriate for a type; §11.2.2.
[6] Prefer member functions over nonmembers for operations that need access to the representation; §11.5.2.
[7] Prefer nonmember functions over members for operations that do not need access to the representation; §11.5.2.
[8] Use namespaces to associate helper functions with "their" class; §11.2.4.
[9] Use nonmember functions for symmetric operators; §11.3.2.
[10] Use () for subscripting multidimensional arrays; §11.9.
[11] Make constructors that take a single "size argument" explicit; §11.7.1.
[12] For non-specialized uses, prefer the standard string (Chapter 20) to the result of your own
exercises; §11.12.
[13] Be cautious about introducing implicit conversions; §11.4.
[14] Use member functions to express operators that require an lvalue as their left-hand operand;
§11.3.5.
11.14 Exercises [over.exercises]
1. (∗2) In the following program, which conversions are used in each expression?
    struct X {
        int i;
        X(int);
        operator+(int);
    };

    struct Y {
        int i;
        Y(X);
        operator+(X);
        operator int();
    };

    extern X operator*(X, Y);
    extern int f(X);

    X x = 1;
    Y y = x;
    int i = 2;

    int main()
    {
        i + 10;       y + 10;       y + 10 * y;
        x + y + i;    x * x + i;    f(7);
        f(y);         y + y;        106 + y;
    }
Modify the program so that it will run and print the values of each legal expression.
2. (∗2) Complete and test class String from §11.12.
3. (∗2) Define a class INT that behaves exactly like an int. Hint: Define INT::operator int().
4. (∗1) Define a class RINT that behaves like an int except that the only operations allowed are +
(unary and binary), - (unary and binary), *, /, and %. Hint: Do not define RINT::operator int().
5. (∗3) Define a class LINT that behaves like a RINT, except that it has at least 64 bits of precision.
6. (∗4) Define a class implementing arbitrary precision arithmetic. Test it by calculating the
factorial of 1000. Hint: You will need to manage storage in a way similar to what was done for class
String.
7. (∗2) Define an external iterator for class String:
    class String_iter {
        // refer to string and string element
    public:
        String_iter(String& s);    // iterator for s
        char& next();              // reference to next element
        // more operations of your choice
    };
Compare this in utility, programming style, and efficiency to having an internal iterator for
String (that is, a notion of a current element for the String and operations relating to that element).
8. (∗1.5) Provide a substring operator for a string class by overloading (). What other operations
would you like to be able to do on a string?
9. (∗3) Design class String so that the substring operator can be used on the left-hand side of an
assignment. First, write a version in which a string can be assigned to a substring of the same
length. Then, write a version in which the lengths may differ.
10. (∗2) Define an operation for String that produces a C-string representation of its value. Discuss
the pros and cons of having that operation as a conversion operator. Discuss alternatives for
allocating the memory for that C-string representation.
11. (∗2.5) Define and implement a simple regular expression pattern match facility for class String.
12. (∗1.5) Modify the pattern match facility from §11.14[11] to work on the standard library string.
Note that you cannot modify the definition of string.
13. (∗2) Write a program that has been rendered unreadable through use of operator overloading
and macros. An idea: Define + to mean - and vice versa for INTs. Then, use a macro to define
int to mean INT. Redefine popular functions using reference type arguments. Writing a few
misleading comments can also create great confusion.
14. (∗3) Swap the result of §11.14[13] with a friend. Without running it, figure out what your
friend's program does. When you have completed this exercise, you'll know what to avoid.
15. (∗2) Define a type Vec4 as a vector of four floats. Define operator[] for Vec4. Define
operators +, -, *, /, =, +=, -=, *=, and /= for combinations of vectors and floating-point numbers.
16. (∗3) Define a class Mat4 as a vector of four Vec4s. Define operator[] to return a Vec4 for
Mat4. Define the usual matrix operations for this type. Define a function doing Gaussian
elimination for a Mat4.
17. (∗2) Define a class Vector similar to Vec4 but with the size given as an argument to the
constructor Vector::Vector(int).
18. (∗3) Define a class Matrix similar to Mat4 but with the dimensions given as arguments to the
constructor Matrix::Matrix(int,int).
19. (∗2) Complete class Ptr_to_T from §11.11 and test it. To be complete, Ptr_to_T must have at
least the operators *, ->, =, ++, and -- defined. Do not cause a run-time error until a wild
pointer is actually dereferenced.
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
20. (∗1) Given two structures:
    struct S { int x, y; };
    struct T { char* p; char* q; };
write a class C that allows the use of x and p from some S and T, much as if x and p had been
members of C.
21. (∗1.5) Define a class Index to hold the index for an exponentiation function
mypow(double,Index). Find a way to have 2**I call mypow(2,I).
22. (∗2) Define a class Imaginary to represent imaginary numbers. Define class Complex based on
that. Implement the fundamental arithmetic operators.
12
Derived Classes
Do not multiply objects without necessity.
– W. Occam
Concepts and classes — derived classes — member functions — construction and
destruction — class hierarchies — type fields — virtual functions — abstract classes —
traditional class hierarchies — abstract classes as interfaces — localizing object creation
— abstract classes and class hierarchies — advice — exercises.
12.1 Introduction [derived.intro]
From Simula, C++ borrowed the concept of a class as a user-defined type and the concept of class
hierarchies. In addition, it borrowed the idea for system design that classes should be used to
model concepts in the programmer’s and the application’s world. C++ provides language constructs that directly support these design notions. Conversely, using the language features in support of design concepts distinguishes effective use of C++. Using language constructs only as notational props for more traditional types of programming is to miss key strengths of C++.
A concept does not exist in isolation. It coexists with related concepts and derives much of its
power from relationships with related concepts. For example, try to explain what a car is. Soon
you’ll have introduced the notions of wheels, engines, drivers, pedestrians, trucks, ambulances,
roads, oil, speeding tickets, motels, etc. Since we use classes to represent concepts, the issue
becomes how to represent relationships between concepts. However, we can’t express arbitrary
relationships directly in a programming language. Even if we could, we wouldn’t want to. Our
classes should be more narrowly defined than our everyday concepts – and more precise. The
notion of a derived class and its associated language mechanisms are provided to express hierarchical relationships, that is, to express commonality between classes. For example, the concepts of a
circle and a triangle are related in that they are both shapes; that is, they have the concept of a shape
in common. Thus, we must explicitly define class Circle and class Triangle to have class Shape in
common. Representing a circle and a triangle in a program without involving the notion of a shape
would be to lose something essential. This chapter is an exploration of the implications of this simple idea, which is the basis for what is commonly called object-oriented programming.
The presentation of language features and techniques progresses from the simple and concrete to
the more sophisticated and abstract. For many programmers, this will also be a progression from
the familiar towards the less well known. This is not a simple journey from ‘‘bad old techniques’’
towards ‘‘the one right way.’’ When I point out limitations of one technique as a motivation for
another, I do so in the context of specific problems; for different problems or in other contexts, the
first technique may indeed be the better choice. Useful software has been constructed using all of
the techniques presented here. The aim is to help you attain sufficient understanding of the techniques to be able to make intelligent and balanced choices among them for real problems.
In this chapter, I first introduce the basic language features supporting object-oriented programming. Next, the use of those features to develop well-structured programs is discussed in the context of a larger example. Further facilities supporting object-oriented programming, such as multiple inheritance and run-time type identification, are discussed in Chapter 15.
12.2 Derived Classes [derived.derived]
Consider building a program dealing with people employed by a firm. Such a program might have
a data structure like this:
    struct Employee {
        string first_name, family_name;
        char middle_initial;
        Date hiring_date;
        short department;
        // ...
    };
Next, we might try to define a manager:
    struct Manager {
        Employee emp;            // manager’s employee record
        set<Employee*> group;    // people managed
        short level;
        // ...
    };
A manager is also an employee; the Employee data is stored in the emp member of a Manager
object. This may be obvious to a human reader – especially a careful reader – but there is nothing
that tells the compiler and other tools that Manager is also an Employee. A Manager* is not an
Employee*, so one cannot simply use one where the other is required. In particular, one cannot put
a Manager onto a list of Employees without writing special code. We could either use explicit
type conversion on a Manager* or put the address of the emp member onto a list of employees.
However, both solutions are inelegant and can be quite obscure. The correct approach is to explicitly
state that a Manager is an Employee, with a few pieces of information added:
    struct Manager : public Employee {
        set<Employee*> group;
        short level;
        // ...
    };
The Manager is derived from Employee, and conversely, Employee is a base class for Manager.
The class Manager has the members of class Employee (first_name, department, etc.) in addition
to its own members (group, level, etc.).
Derivation is often represented graphically by a pointer from the derived class to its base class
indicating that the derived class refers to its base (rather than the other way around):
    Employee
       ^
       |
    Manager
A derived class is often said to inherit properties from its base, so the relationship is also called
inheritance. A base class is sometimes called a superclass and a derived class a subclass. This terminology, however, is confusing to people who observe that the data in a derived class object is a
superset of the data of an object of its base class. A derived class is larger than its base class in the
sense that it holds more data and provides more functions.
A popular and efficient implementation of the notion of derived classes has an object of the
derived class represented as an object of the base class, with the information belonging specifically
to the derived class added at the end. For example:
    Employee:            Manager:
      first_name           first_name
      family_name          family_name
      ...                  ...
                           group
                           level
                           ...
Deriving Manager from Employee in this way makes Manager a subtype of Employee so that a
Manager can be used wherever an Employee is acceptable. For example, we can now create a list
of Employees, some of whom are Managers:
    void f(Manager m1, Employee e1)
    {
        list<Employee*> elist;
        elist.push_front(&m1);
        elist.push_front(&e1);
        // ...
    }
A Manager is (also) an Employee, so a Manager* can be used as an Employee*. However, an
Employee is not necessarily a Manager, so an Employee* cannot be used as a Manager*. In general,
if a class Derived has a public base class (§15.3) Base, then a Derived* can be assigned to a
variable of type Base* without the use of explicit type conversion. The opposite conversion, from
Base* to Derived*, must be explicit. For example:
    void g(Manager mm, Employee ee)
    {
        Employee* pe = &mm;               // ok: every Manager is an Employee
        Manager* pm = &ee;                // error: not every Employee is a Manager
        pm->level = 2;                    // disaster: ee doesn’t have a ‘level’
        pm = static_cast<Manager*>(pe);   // brute force: works because pe points
                                          // to the Manager mm
        pm->level = 2;                    // fine: pm points to the Manager mm that has a ‘level’
    }
In other words, an object of a derived class can be treated as an object of its base class when manipulated through pointers and references. The opposite is not true. The use of static_cast and
dynamic_cast is discussed in §15.4.2.
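The conversion rules can be tried out in a small compilable sketch. The trimmed-down Employee and Manager and the helper names name_of(), level_via_base(), and demo_conversions() are assumptions made for illustration; they are not part of the text's example.

```cpp
#include <string>

struct Employee {
    std::string family_name;
    short department = 0;
};

struct Manager : public Employee {
    short level = 0;
};

// Accepts any Employee, including the Employee part of a Manager.
std::string name_of(const Employee* pe) { return pe->family_name; }

// Reads the level through an Employee* that the caller promises points to a Manager.
// The static_cast is unchecked: it is correct only if that promise holds.
short level_via_base(Employee* pe)
{
    Manager* pm = static_cast<Manager*>(pe);
    return pm->level;
}

short demo_conversions()
{
    Manager mm;
    mm.family_name = "Smith";
    mm.level = 2;
    Employee* pe = &mm;     // implicit: every Manager is an Employee
    // Manager* bad = pe;   // would not compile: Base* to Derived* must be explicit
    return level_via_base(pe);
}
```

Here demo_conversions() hands level_via_base() a pointer that really does refer to a Manager; passing the address of a plain Employee would compile just as well and fail in the way the comments in g() describe.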
Using a class as a base is equivalent to declaring an (unnamed) object of that class. Consequently, a class must be defined in order to be used as a base (§5.7):
    class Employee;                      // declaration only, no definition

    class Manager : public Employee {    // error: Employee not defined
        // ...
    };
12.2.1 Member Functions [derived.member]
Simple data structures, such as Employee and Manager, are really not that interesting and often not
particularly useful. We need to give the information as a proper type that provides a suitable set of
operations that present the concept, and we need to do this without tying us to the details of a particular representation. For example:
    class Employee {
        string first_name, family_name;
        char middle_initial;
        // ...
    public:
        void print() const;
        string full_name() const
            { return first_name + ' ' + middle_initial + ' ' + family_name; }
        // ...
    };

    class Manager : public Employee {
        // ...
    public:
        void print() const;
        // ...
    };
A member of a derived class can use the public – and protected (see §15.3) – members of its base
class as if they were declared in the derived class itself. For example:
    void Manager::print() const
    {
        cout << "name is " << full_name() << '\n';
        // ...
    }
However, a derived class cannot use a base class’ private names:
    void Manager::print() const
    {
        cout << " name is " << family_name << '\n';   // error!
        // ...
    }
This second version of Manager::print() will not compile. A member of a derived class has no
special permission to access private members of its base class, so family_name is not accessible to
Manager::print().
This comes as a surprise to some, but consider the alternative: that a member function of a
derived class could access the private members of its base class. The concept of a private member
would be rendered meaningless by allowing a programmer to gain access to the private part of a
class simply by deriving a new class from it. Furthermore, one could no longer find all uses of a
private name by looking at the functions declared as members and friends of that class. One would
have to examine every source file of the complete program for derived classes, then examine every
function of those classes, then find every class derived from those classes, etc. This is, at best,
tedious and often impractical. Where it is acceptable, protected – rather than private – members
can be used. A protected member is like a public member to a member of a derived class, yet it is
like a private member to other functions (see §15.3).
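The three access levels can be seen side by side in a short sketch. It assumes a variant of Employee in which department has been made protected purely for illustration; name() and describe() are hypothetical helpers, not members from the text.

```cpp
#include <string>

class Employee {
    std::string family_name;     // private: not accessible in derived classes
protected:
    short department;            // protected: accessible in derived classes only
public:
    Employee(const std::string& n, int d) : family_name(n), department(d) { }
    std::string name() const { return family_name; }   // public access path
};

class Manager : public Employee {
    short level;
public:
    Manager(const std::string& n, int d, int lvl) : Employee(n, d), level(lvl) { }

    std::string describe() const
    {
        // return family_name;   // error: family_name is private in Employee
        return name() + " dept " + std::to_string(department)   // ok: public and protected
             + " level " + std::to_string(level);
    }
};
```

A function outside both classes can call name() and describe() but cannot mention department, which is the sense in which a protected member behaves like a private one to other functions.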
Typically, the cleanest solution is for the derived class to use only the public members of its
base class. For example:
    void Manager::print() const
    {
        Employee::print();   // print Employee information
        cout << level;       // print Manager-specific information
        // ...
    }
Note that :: must be used because print() has been redefined in Manager. Such reuse of names
is typical. The unwary might write this:
    void Manager::print() const
    {
        print();   // oops!
        // print Manager-specific information
    }
and find the program involved in an unexpected sequence of recursive calls.
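A testable variant of the delegating print() is sketched below. Passing the output stream as a parameter and the helper printed() are assumptions introduced so that the result can be inspected; the text's version writes to cout.

```cpp
#include <ostream>
#include <sstream>
#include <string>

class Employee {
    std::string family_name;
    short department;
public:
    Employee(const std::string& n, int d) : family_name(n), department(d) { }
    void print(std::ostream& os) const { os << family_name << '\t' << department << '\n'; }
};

class Manager : public Employee {
    short level;
public:
    Manager(const std::string& n, int d, int lvl) : Employee(n, d), level(lvl) { }

    void print(std::ostream& os) const
    {
        Employee::print(os);   // qualified: runs the base class version
        // print(os);          // unqualified: would call Manager::print again, without end
        os << "\tlevel " << level << '\n';
    }
};

// Collects what Manager::print writes so that it can be compared with an expected string.
std::string printed(const Manager& m)
{
    std::ostringstream os;
    m.print(os);
    return os.str();
}
```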
12.2.2 Constructors and Destructors [derived.ctor]
Some derived classes need constructors. If a base class has constructors, then a constructor must be
invoked. Default constructors can be invoked implicitly. However, if all constructors for a base
require arguments, then a constructor for that base must be explicitly called. Consider:
    class Employee {
        string first_name, family_name;
        short department;
        // ...
    public:
        Employee(const string& n, int d);
        // ...
    };

    class Manager : public Employee {
        set<Employee*> group;   // people managed
        short level;
        // ...
    public:
        Manager(const string& n, int d, int lvl);
        // ...
    };
Arguments for the base class’ constructor are specified in the definition of a derived class’ constructor. In this respect, the base class acts exactly like a member of the derived class (§10.4.6).
For example:
    Employee::Employee(const string& n, int d)
        : family_name(n), department(d)   // initialize members
    {
        // ...
    }

    Manager::Manager(const string& n, int d, int lvl)
        : Employee(n,d),   // initialize base
          level(lvl)       // initialize members
    {
        // ...
    }
A derived class constructor can specify initializers for its own members and immediate bases only;
it cannot directly initialize members of a base. For example:
    Manager::Manager(const string& n, int d, int lvl)
        : family_name(n),   // error: family_name not declared in manager
          department(d),    // error: department not declared in manager
          level(lvl)
    {
        // ...
    }
This definition contains three errors: it fails to invoke Employee’s constructor, and twice it
attempts to initialize members of Employee directly.
Class objects are constructed from the bottom up: first the base, then the members, and then the
derived class itself. They are destroyed in the opposite order: first the derived class itself, then the
members, and then the base. Members and bases are constructed in order of declaration in the class
and destroyed in the reverse order. See also §10.4.6 and §15.2.4.1.
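The order of construction and destruction can be observed by having each constructor and destructor append to a log. The classes Base, Member, and Derived, the global order_log, and the function construction_trace() are hypothetical names used only for this sketch.

```cpp
#include <string>

std::string order_log;   // records the order in which the functions below run

struct Base {
    Base()  { order_log += "Base "; }
    ~Base() { order_log += "~Base "; }
};

struct Member {
    Member()  { order_log += "Member "; }
    ~Member() { order_log += "~Member "; }
};

struct Derived : public Base {
    Member m;
    Derived()  { order_log += "Derived "; }
    ~Derived() { order_log += "~Derived "; }
};

// Creates and destroys one Derived object and returns the recorded order.
std::string construction_trace()
{
    order_log.clear();
    { Derived d; }   // d is destroyed at the closing brace
    return order_log;
}
```

The returned string is "Base Member Derived ~Derived ~Member ~Base ": base first, then member, then the derived class body, with destruction running in exactly the reverse order.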
12.2.3 Copying [derived.copy]
Copying of class objects is defined by the copy constructor and assignments (§10.4.4.1). Consider:
    class Employee {
        // ...
        Employee& operator=(const Employee&);
        Employee(const Employee&);
    };

    void f(const Manager& m)
    {
        Employee e = m;   // construct e from Employee part of m
        e = m;            // assign Employee part of m to e
    }
Because the Employee copy functions do not know anything about Managers, only the Employee
part of a Manager is copied. This is commonly referred to as slicing and can be a source of surprises and errors. One reason to pass pointers and references to objects of classes in a hierarchy is
to avoid slicing. Other reasons are to preserve polymorphic behavior (§2.5.4, §12.2.6) and to gain
efficiency.
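The effect of slicing is easiest to see when the classes have a virtual function (§12.2.6). In the following sketch the member kind() and the functions kind_by_value() and kind_by_ref() are assumptions introduced for illustration; the override specifier is a later addition to the language (C++11) and is used only to let the compiler check the intent.

```cpp
#include <string>

struct Employee {
    std::string family_name;
    virtual ~Employee() { }
    virtual std::string kind() const { return "Employee"; }
};

struct Manager : public Employee {
    short level = 0;
    std::string kind() const override { return "Manager"; }
};

// The parameter is a copy made by Employee's copy constructor: the Manager part is sliced off.
std::string kind_by_value(Employee e) { return e.kind(); }

// The parameter refers to the original object: nothing is copied and nothing is lost.
std::string kind_by_ref(const Employee& e) { return e.kind(); }
```

Given a Manager m, kind_by_value(m) yields "Employee" and kind_by_ref(m) yields "Manager".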
12.2.4 Class Hierarchies [derived.hierarchy]
A derived class can itself be a base class. For example:
    class Employee { /* ... */ };
    class Manager : public Employee { /* ... */ };
    class Director : public Manager { /* ... */ };
Such a set of related classes is traditionally called a class hierarchy. Such a hierarchy is most often
a tree, but it can also be a more general graph structure. For example:
    class Temporary { /* ... */ };
    class Secretary : public Employee { /* ... */ };

    class Tsec : public Temporary, public Secretary { /* ... */ };
    class Consultant : public Temporary, public Manager { /* ... */ };
Or graphically:
    [Figure: class graph; each arrow points from a derived class to its base]
    Secretary  -> Employee
    Manager    -> Employee
    Tsec       -> Temporary, Secretary
    Consultant -> Temporary, Manager
    Director   -> Manager
Thus, as is explained in detail in §15.2, C++ can express a directed acyclic graph of classes.
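A class with two bases converts implicitly to either of them. The sketch below reduces the classes to a data member each; the members weeks, department, and level and the functions weeks_of(), dept_of(), and demo_consultant() are assumptions made for illustration.

```cpp
struct Employee  { int department = 0; };
struct Temporary { int weeks = 0; };

struct Manager : public Employee { int level = 0; };
struct Consultant : public Temporary, public Manager { };

int weeks_of(const Temporary& t) { return t.weeks; }
int dept_of(const Employee& e)   { return e.department; }

int demo_consultant()
{
    Consultant c;
    c.weeks = 6;         // member inherited from Temporary
    c.department = 12;   // member inherited from Employee through Manager
    c.level = 1;         // member inherited from Manager
    return weeks_of(c) + dept_of(c);   // c is accepted as a Temporary and as an Employee
}
```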
12.2.5 Type Fields [derived.typefield]
To use derived classes as more than a convenient shorthand in declarations, we must solve the following problem: Given a pointer of type base*, to which derived type does the object pointed to
really belong? There are four fundamental solutions to the problem:
[1] Ensure that only objects of a single type are pointed to (§2.7, Chapter 13).
[2] Place a type field in the base class for the functions to inspect.
[3] Use dynamic_cast (§15.4.2, §15.4.5).
[4] Use virtual functions (§2.5.5, §12.2.6).
Pointers to base classes are commonly used in the design of container classes such as set, vector,
and list. In this case, solution 1 yields homogeneous lists, that is, lists of objects of the same type.
Solutions 2, 3, and 4 can be used to build heterogeneous lists, that is, lists of (pointers to) objects of
several different types. Solution 3 is a language-supported variant of solution 2. Solution 4 is a
special type-safe variation of solution 2. Combinations of solutions 1 and 4 are particularly interesting and powerful; in almost all situations, they yield cleaner code than do solutions 2 and 3.
Let us first examine the simple type-field solution to see why it is most often best avoided. The
manager/employee example could be redefined like this:
    struct Employee {
        enum Empl_type { M, E };
        Empl_type type;

        Employee() : type(E) { }

        string first_name, family_name;
        char middle_initial;
        Date hiring_date;
        short department;
        // ...
    };
    struct Manager : public Employee {
        Manager() { type = M; }

        set<Employee*> group;   // people managed
        short level;
        // ...
    };
Given this, we can now write a function that prints information about each Employee:
    void print_employee(const Employee* e)
    {
        switch (e->type) {
        case Employee::E:
            cout << e->family_name << '\t' << e->department << '\n';
            // ...
            break;
        case Employee::M:
        {
            cout << e->family_name << '\t' << e->department << '\n';
            // ...
            const Manager* p = static_cast<const Manager*>(e);
            cout << " level " << p->level << '\n';
            // ...
            break;
        }
        }
    }
and use it to print a list of Employees, like this:
    void print_list(const list<Employee*>& elist)
    {
        for (list<Employee*>::const_iterator p = elist.begin(); p!=elist.end(); ++p)
            print_employee(*p);
    }
This works fine, especially in a small program maintained by a single person. However, it has the
fundamental weakness that it depends on the programmer manipulating types in a way that cannot be checked by the compiler. This problem is usually made worse because functions such as
print_employee() are organized to take advantage of the commonality of the classes involved.
For example:
    void print_employee(const Employee* e)
    {
        cout << e->family_name << '\t' << e->department << '\n';
        // ...
        if (e->type == Employee::M) {
            const Manager* p = static_cast<const Manager*>(e);
            cout << " level " << p->level << '\n';
            // ...
        }
    }
Finding all such tests on the type field buried in a large function that handles many derived classes
can be difficult. Even when they have been found, understanding what is going on can be difficult.
Furthermore, any addition of a new kind of Employee involves a change to all the key functions in
the system – the ones containing the tests on the type field. The programmer must consider every
function that could conceivably need a test on the type field after a change. This implies the need
to access critical source code and the resulting necessary overhead of testing the affected code. The
use of an explicit type conversion is a strong hint that improvement is possible.
In other words, use of a type field is an error-prone technique that leads to maintenance problems. The problems increase in severity as the size of the program increases because the use of a
type field causes a violation of the ideals of modularity and data hiding. Each function using a type
field must know about the representation and other details of the implementation of every class
derived from the one containing the type field.
It also seems that the existence of any common data accessible from every derived class, such
as a type field, tempts people to add more such data. The common base thus becomes the repository of all kinds of ‘‘useful information.’’ This, in turn, gets the implementation of the base and
derived classes intertwined in ways that are most undesirable. For clean design and simpler maintenance, we want to keep separate issues separate and avoid mutual dependencies.
12.2.6 Virtual Functions [derived.virtual]
Virtual functions overcome the problems with the type-field solution by allowing the programmer
to declare functions in a base class that can be redefined in each derived class. The compiler and
loader will guarantee the correct correspondence between objects and the functions applied to them.
For example:
    class Employee {
        string first_name, family_name;
        short department;
        // ...
    public:
        Employee(const string& name, int dept);
        virtual void print() const;
        // ...
    };
The keyword virtual indicates that print() can act as an interface to the print() function defined
in this class and the print() functions defined in classes derived from it. Where such print()
functions are defined in derived classes, the compiler ensures that the right print() for the given
Employee object is invoked in each case.
To allow a virtual function declaration to act as an interface to functions defined in derived
classes, the argument types specified for a function in a derived class cannot differ from the argument types declared in the base, and only very slight changes are allowed for the return type
(§15.6.2). A virtual member function is sometimes called a method.
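The requirement that argument types match can be checked with a small experiment. In the sketch below, Base, Same, Different, and call_through_base() are hypothetical names; the override specifier is a later addition to the language (C++11) that asks the compiler to verify that a function really overrides.

```cpp
#include <string>

struct Base {
    virtual ~Base() { }
    virtual std::string f(int) const { return "Base::f(int)"; }
};

struct Same : public Base {
    // Same name and argument types: overrides Base::f.
    std::string f(int) const override { return "Same::f(int)"; }
};

struct Different : public Base {
    // Different argument type: a new function that hides Base::f instead of overriding it.
    std::string f(double) const { return "Different::f(double)"; }
};

// Calls f through the base class interface.
std::string call_through_base(const Base& b) { return b.f(1); }
```

Called with a Same object, call_through_base() reaches Same::f; called with a Different object, it still reaches Base::f, because Different::f(double) does not act as an implementation of the interface declared in Base.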
A virtual function must be defined for the class in which it is first declared (unless it is declared
to be a pure virtual function; see §12.3). For example:
    void Employee::print() const
    {
        cout << family_name << '\t' << department << '\n';
        // ...
    }
A virtual function can be used even if no class is derived from its class, and a derived class that
does not need its own version of a virtual function need not provide one. When deriving a class,
simply provide an appropriate function, if it is needed. For example:
    class Manager : public Employee {
        set<Employee*> group;
        short level;
        // ...
    public:
        Manager(const string& name, int dept, int lvl);
        void print() const;
        // ...
    };

    void Manager::print() const
    {
        Employee::print();
        cout << "\tlevel " << level << '\n';
        // ...
    }
A function from a derived class with the same name and the same set of argument types as a virtual
function in a base is said to override the base class version of the virtual function. Except where
we explicitly say which version of a virtual function is called (as in the call Employee::print()),
the overriding function is chosen as the most appropriate for the object for which it is called.
The global function print_employee() (§12.2.5) is now unnecessary because the print()
member functions have taken its place. A list of Employees can be printed like this:
    void print_list(list<Employee*>& s)
    {
        for (list<Employee*>::const_iterator p = s.begin(); p!=s.end(); ++p)   // see §2.7.2
            (*p)->print();
    }
or even
    void print_list(list<Employee*>& s)
    {
        for_each(s.begin(),s.end(),mem_fun(&Employee::print));   // see §3.8.5
    }
Each Employee will be written out according to its type. For example:
    int main()
    {
        Employee e("Brown",1234);
        Manager m("Smith",1234,2);
        list<Employee*> empl;
        empl.push_front(&e);   // see §2.5.4
        empl.push_front(&m);
        print_list(empl);
    }
produced:
    Smith   1234
            level 2
    Brown   1234
Note that this will work even if print_list() was written and compiled before the specific
derived class Manager was even conceived of! This is a key aspect of classes. When used
properly, it becomes the cornerstone of object-oriented designs and provides a degree of stability to
an evolving program.
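The stability claim can be illustrated by writing the printing function first and adding a derived class afterwards. In this sketch, print_all(), the class Intern, and the stream parameter are assumptions introduced so that the output can be checked; std::vector and a range-for loop from later C++ stand in for the containers used in the text.

```cpp
#include <sstream>
#include <string>
#include <vector>

class Employee {
    std::string family_name;
public:
    explicit Employee(const std::string& n) : family_name(n) { }
    virtual ~Employee() { }
    virtual void print(std::ostream& os) const { os << family_name << '\n'; }
};

// Written knowing only about Employee.
std::string print_all(const std::vector<const Employee*>& v)
{
    std::ostringstream os;
    for (const Employee* p : v)
        p->print(os);   // virtual call: the object's own print() is chosen
    return os.str();
}

// Added later: print_all() needs no change and no recompilation of its source.
class Intern : public Employee {
public:
    explicit Intern(const std::string& n) : Employee(n) { }
    void print(std::ostream& os) const override
    {
        Employee::print(os);
        os << "\t(intern)\n";
    }
};
```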
Getting ‘‘the right’’ behavior from Employee’s functions independently of exactly what kind of
Employee is actually used is called polymorphism. A type with virtual functions is called a
polymorphic type. To get polymorphic behavior in C++, the member functions called must be virtual and objects must be manipulated through pointers or references. When manipulating an object
directly (rather than through a pointer or reference), its exact type is known by the compilation so
that run-time polymorphism is not needed.
Clearly, to implement polymorphism, the compiler must store some kind of type information in
each object of class Employee and use it to call the right version of the virtual function print(). In
a typical implementation, the space taken is just enough to hold a pointer (§2.5.5). This space is
taken only in objects of a class with virtual functions – not in every object, or even in every object
of a derived class. You pay this overhead only for classes for which you declare virtual functions.
Had you chosen to use the alternative type-field solution, a comparable amount of space would
have been needed for the type field.
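The space cost can be measured on a given implementation with sizeof. The classes Plain and Virtual and the function virtual_overhead() are hypothetical; the number obtained depends on the compiler and its padding rules, so the sketch states no particular value.

```cpp
#include <cstddef>

struct Plain {
    short department;
    void print() const { }            // ordinary member function: no per-object cost
};

struct Virtual {
    short department;
    virtual void print() const { }    // virtual: objects typically carry one hidden pointer
    virtual ~Virtual() { }
};

// Extra bytes occupied by a Virtual object compared with a Plain one on this implementation.
std::size_t virtual_overhead() { return sizeof(Virtual) - sizeof(Plain); }
```

On typical implementations the difference is the size of one pointer plus whatever padding alignment requires, however many virtual functions the class declares.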
Calling a function using the scope resolution operator :: as is done in Manager::print()
ensures that the virtual mechanism is not used. Otherwise, Manager::print() would suffer an
infinite recursion. The use of a qualified name has another desirable effect. That is, if a virtual
function is also inline (as is not uncommon), then inline substitution can be used for calls specified
using ::. This provides the programmer with an efficient way to handle some important special
cases in which one virtual function calls another for the same object. The Manager::print()
function is an example of this. Because the type of the object is determined in the call of
Manager::print(), it need not be dynamically determined again for the resulting call of
Employee::print().
It is worth remembering that the traditional and obvious implementation of a virtual function
call is simply an indirect function call (§2.5.5), so efficiency concerns should not deter anyone from
using a virtual function where an ordinary function call would be acceptably efficient.
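The difference between a virtual call and a qualified call can be seen by applying both to the same object. The member kind() and the functions virtual_call() and qualified_call() are assumptions made for this sketch; override is a later addition to the language (C++11).

```cpp
#include <string>

struct Employee {
    virtual ~Employee() { }
    virtual std::string kind() const { return "Employee"; }
};

struct Manager : public Employee {
    std::string kind() const override { return "Manager"; }
};

// Ordinary call through a reference: the virtual mechanism selects the function.
std::string virtual_call(const Employee& e) { return e.kind(); }

// Qualified call: Employee::kind is named explicitly, so the virtual mechanism is bypassed.
std::string qualified_call(const Employee& e) { return e.Employee::kind(); }
```

For a Manager m, virtual_call(m) yields "Manager" while qualified_call(m) yields "Employee".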
12.3 Abstract Classes [derived.abstract]
Many classes resemble class Employee in that they are useful both as themselves and also as bases
for derived classes. For such classes, the techniques described in the previous section suffice.
However, not all classes follow that pattern. Some classes, such as class Shape, represent abstract
concepts for which objects cannot exist. A Shape makes sense only as the base of some class
derived from it. This can be seen from the fact that it is not possible to provide sensible definitions
for its virtual functions:
ccllaassss SShhaappee {
ppuubblliicc:
vviirrttuuaall vvooiidd rroottaattee(iinntt) { eerrrroorr("SShhaappee::rroottaattee"); } // inelegant
vviirrttuuaall vvooiidd ddrraaw
w() { eerrrroorr("SShhaappee::ddrraaw
w"); }
// ...
};
Trying to make a shape of this unspecified kind is silly but legal:
SShhaappee ss; // silly: ‘‘shapeless shape’’
It is silly because every operation on s will result in an error.
A better alternative is to declare the virtual functions of class Shape to be pure virtual functions.
A virtual function is ‘‘made pure’’ by the initializer = 0:
    class Shape {   // abstract class
    public:
        virtual void rotate(int) = 0;    // pure virtual function
        virtual void draw() = 0;         // pure virtual function
        virtual bool is_closed() = 0;    // pure virtual function
        // ...
    };
A class with one or more pure virtual functions is an abstract class, and no objects of that abstract
class can be created:
    Shape s;   // error: variable of abstract class Shape
An abstract class can be used only as an interface and as a base for other classes. For example:
    class Point { /* ... */ };

    class Circle : public Shape {
    public:
        void rotate(int) { }                  // override Shape::rotate
        void draw();                          // override Shape::draw
        bool is_closed() { return true; }     // override Shape::is_closed

        Circle(Point p, int r);
    private:
        Point center;
        int radius;
    };
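As a minimal, compilable sketch of the same idea, the version below fills in the parts left open above. The Point layout, the string-returning draw(), the const qualifiers, and the helper render() are assumptions chosen so that the behavior can be observed; they are not part of the declarations in the text. A C++11 or later compiler is assumed.

```cpp
#include <string>

struct Point { int x, y; };          // assumed layout, for illustration only

class Shape {                        // abstract class
public:
    virtual void rotate(int) = 0;
    virtual std::string draw() const = 0;   // returns a description instead of drawing (assumption)
    virtual bool is_closed() const = 0;
    virtual ~Shape() { }
};

class Circle : public Shape {
public:
    Circle(Point p, int r) : center(p), radius(r) { }
    void rotate(int) override { }    // rotating a circle about its center changes nothing
    std::string draw() const override
    {
        return "circle at (" + std::to_string(center.x) + "," +
               std::to_string(center.y) + ") r=" + std::to_string(radius);
    }
    bool is_closed() const override { return true; }
private:
    Point center;
    int radius;
};

// Written against the interface only; it never needs to name Circle.
std::string render(const Shape& s) { return s.draw(); }

// Shape s;   // would not compile: Shape is an abstract class
```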
A pure virtual function that is not defined in a derived class remains a pure virtual function, so the
derived class is also an abstract class. This allows us to build implementations in stages:
    class Polygon : public Shape {   // abstract class
    public:
        bool is_closed() { return true; }   // override Shape::is_closed
        // ... draw and rotate not overridden ...
    };

    Polygon b;   // error: declaration of object of abstract class Polygon

    class Irregular_polygon : public Polygon {
        list<Point> lp;
    public:
        void draw();         // override Shape::draw
        void rotate(int);    // override Shape::rotate
        // ...
    };

    Irregular_polygon poly(some_points);   // fine (assume suitable constructor)
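The staged construction can be checked at compile time. The sketch below assumes a C++11 or later compiler and uses the standard std::is_abstract trait; the class Triangle and its orientation() member are hypothetical and serve only to complete the hierarchy.

```cpp
#include <type_traits>

class Shape {
public:
    virtual void rotate(int) = 0;
    virtual void draw() = 0;
    virtual bool is_closed() = 0;
    virtual ~Shape() { }
};

class Polygon : public Shape {       // still abstract: draw() and rotate() remain pure
public:
    bool is_closed() override { return true; }
};

class Triangle : public Polygon {    // hypothetical class that completes the implementation
public:
    Triangle() : angle(0) { }
    void draw() override { }
    void rotate(int deg) override { angle = (angle + deg) % 360; }
    int orientation() const { return angle; }
private:
    int angle;
};

static_assert(std::is_abstract<Shape>::value, "Shape has pure virtual functions");
static_assert(std::is_abstract<Polygon>::value, "Polygon leaves draw() and rotate() pure");
static_assert(!std::is_abstract<Triangle>::value, "Triangle overrides every pure virtual function");
```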
An important use of abstract classes is to provide an interface without exposing any implementation
details. For example, an operating system might hide the details of its device drivers behind an
abstract class:
    class Character_device {
    public:
        virtual int open(int opt) = 0;
        virtual int close(int opt) = 0;
        virtual int read(char* p, int n) = 0;
        virtual int write(const char* p, int n) = 0;
        virtual int ioctl(int ...) = 0;
        virtual ~Character_device() { }   // virtual destructor
    };
We can then specify drivers as classes derived from Character_device, and manipulate a variety of
drivers through that interface. The importance of virtual destructors is explained in §12.4.2.
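To illustrate how a driver might sit behind such an interface, here is a small sketch. It deliberately uses a reduced version of Character_device (only read and write), and both Memory_device and copy_through() are hypothetical names invented for this example. It assumes the counts passed in are non-negative.

```cpp
#include <string>

// Reduced variant of the Character_device interface (an assumption for brevity).
class Character_device {
public:
    virtual int write(const char* p, int n) = 0;
    virtual int read(char* p, int n) = 0;
    virtual ~Character_device() { }
};

// Hypothetical driver that keeps written characters in memory.
class Memory_device : public Character_device {
public:
    Memory_device() : pos(0) { }
    int write(const char* p, int n) override { buf.append(p, n); return n; }
    int read(char* p, int n) override
    {
        int avail = static_cast<int>(buf.size() - pos);
        int count = n < avail ? n : avail;    // never read past what was written
        buf.copy(p, count, pos);
        pos += count;
        return count;
    }
private:
    std::string buf;
    std::string::size_type pos;               // position of the next unread character
};

// Written purely against the interface; it knows nothing about Memory_device.
int copy_through(Character_device& dev, const std::string& msg, char* out, int n)
{
    dev.write(msg.data(), static_cast<int>(msg.size()));
    return dev.read(out, n);
}
```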
With the introduction of abstract classes, we have the basic facilities for writing a complete program in a modular fashion using classes as building blocks.
12.4 Design of Class Hierarchies [derived.design]
Consider a simple design problem: provide a way for a program to get an integer value from a user
interface. This can be done in a bewildering number of ways. To insulate our program from this
variety, and also to get a chance to explore the possible design choices, let us start by defining our
program’s model of this simple input operation. We will leave until later the details of implementing it using a real user-interface system.
The idea is to have a class Ival_box that knows what range of input values it will accept. A
program can ask an Ival_box for its value and ask it to prompt the user if necessary. In addition, a
program can ask an Ival_box if a user changed the value since the program last looked at it.
Because there are many ways of implementing this basic idea, we must assume that there will
be many different kinds of Ival_boxes, such as sliders, plain boxes in which a user can type a number, dials, and voice interaction.
The general approach is to build a ‘‘virtual user-interface system’’ for the application to use.
This system provides some of the services provided by existing user-interface systems. It can be
implemented on a wide variety of systems to ensure the portability of application code. Naturally,
there are other ways of insulating an application from a user-interface system. I chose this
approach because it is general, because it allows me to demonstrate a variety of techniques and
design tradeoffs, because those techniques are also the ones used to build ‘‘real’’ user-interface systems, and – most important – because these techniques are applicable to problems far beyond the
narrow domain of interface systems.
12.4.1 A Traditional Class Hierarchy [derived.traditional]
Our first solution is a traditional class hierarchy as is commonly found in Simula, Smalltalk, and
older C++ programs.
Class Ival_box defines the basic interface to all Ival_boxes and specifies a default implementation that more specific kinds of Ival_boxes can override with their own versions. In addition, we
declare the data needed to implement the basic notion:
    class Ival_box {
    protected:
        int val;
        int low, high;
        bool changed;
    public:
        Ival_box(int ll, int hh) { changed = false; val = low = ll; high = hh; }

        virtual int get_value() { changed = false; return val; }
        virtual void set_value(int i) { changed = true; val = i; }       // for user
        virtual void reset_value(int i) { changed = false; val = i; }    // for application
        virtual void prompt() { }
        virtual bool was_changed() const { return changed; }
    };
The default implementation of the functions is pretty sloppy and is provided here primarily to illustrate the intended semantics. A realistic class would, for example, provide some range checking.
A programmer might use these ‘‘ival classes’’ like this:
    void interact(Ival_box* pb)
    {
        pb->prompt();   // alert user
        // ...
        int i = pb->get_value();
        if (pb->was_changed()) {
            // new value; do something
        }
        else {
            // old value was fine; do something else
        }
        // ...
    }

    void some_fct()
    {
        Ival_box* p1 = new Ival_slider(0,5);   // Ival_slider derived from Ival_box
        interact(p1);

        Ival_box* p2 = new Ival_dial(1,12);
        interact(p2);
    }
Most application code is written in terms of (pointers to) plain Ival_boxes the way interact() is.
That way, the application doesn’t have to know about the potentially large number of variants of
the Ival_box concept. The knowledge of such specialized classes is isolated in the relatively few
functions that create such objects. This isolates users from changes in the implementations of the
derived classes. Most code can be oblivious to the fact that there are different kinds of Ival_boxes.
To simplify the discussion, I do not address issues of how a program waits for input. Maybe the
program really does wait for the user in get_value(), maybe the program associates the Ival_box
with an event and prepares to respond to a callback, or maybe the program spawns a thread for the
Ival_box and later inquires about the state of that thread. Such decisions are crucial in the design
of user-interface systems. However, discussing them here in any realistic detail would simply distract from the presentation of programming techniques and language facilities. The design techniques described here and the language facilities that support them are not specific to user interfaces. They apply to a far greater range of problems.
The different kinds of Ival_boxes are defined as classes derived from Ival_box. For example:
    class Ival_slider : public Ival_box {
        // graphics stuff to define what the slider looks like, etc.
    public:
        Ival_slider(int, int);

        int get_value();
        void prompt();
    };
The data members of Ival_box were declared protected to allow access from derived classes.
Thus, Ival_slider::get_value() can deposit a value in Ival_box::val. A protected member is
accessible from a class’ own members and from members of derived classes, but not to general
users (see §15.3).
In addition to Ival_slider, we would define other variants of the Ival_box concept. These could
include Ival_dial, which lets you select a value by turning a knob; Flashing_ival_slider, which
flashes when you ask it to prompt(); and Popup_ival_slider, which responds to prompt() by
appearing in some prominent place, thus making it hard for the user to ignore.
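A compact sketch of this traditional design follows. The base class mirrors the Ival_box shown above, while the slider's move_to() (standing in for the user dragging the knob), was_prompted(), and the int-returning interact() are hypothetical additions so that the behavior can be observed. In this sketch was_changed() is consulted before get_value(), because the default get_value() resets the flag.

```cpp
class Ival_box {
protected:
    int val;
    int low, high;
    bool changed;
public:
    Ival_box(int ll, int hh) : val(ll), low(ll), high(hh), changed(false) { }
    virtual ~Ival_box() { }
    virtual int get_value() { changed = false; return val; }
    virtual void set_value(int i) { changed = true; val = i; }
    virtual void prompt() { }
    virtual bool was_changed() const { return changed; }
};

// Hypothetical slider; move_to() stands in for the user moving the knob.
class Ival_slider : public Ival_box {
public:
    Ival_slider(int ll, int hh) : Ival_box(ll, hh), prompted(false) { }
    void prompt() override { prompted = true; }
    void move_to(int pos)
    {
        if (pos < low) pos = low;      // protected data of Ival_box is accessible here
        if (high < pos) pos = high;
        set_value(pos);
    }
    bool was_prompted() const { return prompted; }
private:
    bool prompted;
};

// Application code written in terms of plain Ival_box.
// Returns the new value, or -1 to signal "no new value" (a convention of this sketch).
int interact(Ival_box* pb)
{
    pb->prompt();
    bool fresh = pb->was_changed();    // checked first: get_value() clears the flag
    int i = pb->get_value();
    return fresh ? i : -1;
}
```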
From where would we get the graphics stuff? Most user-interface systems provide a class
defining the basic properties of being an entity on the screen. So, if we use the system from ‘‘Big
Bucks Inc.,’’ we would have to make each of our Ival_slider, Ival_dial, etc., classes a kind of
BBwindow. This would most simply be achieved by rewriting our Ival_box so that it derives from
BBwindow. In that way, all our classes inherit all the properties of a BBwindow. For example,
every Ival_box can be placed on the screen, obey the graphical style rules, be resized, be dragged
around, etc., according to the standard set by the BBwindow system. Our class hierarchy would
look like this:
    class Ival_box : public BBwindow { /* ... */ };   // rewritten to use BBwindow
    class Ival_slider : public Ival_box { /* ... */ };
    class Ival_dial : public Ival_box { /* ... */ };
    class Flashing_ival_slider : public Ival_slider { /* ... */ };
    class Popup_ival_slider : public Ival_slider { /* ... */ };
or graphically:
                        BBwindow
                           ^
                           |
                        Ival_box
                         ^    ^
                        /      \
              Ival_slider      Ival_dial
               ^       ^
              /         \
    Popup_ival_slider   Flashing_ival_slider
12.4.1.1 Critique [derived.critique]
This design works well in many ways, and for many problems this kind of hierarchy is a good solution. However, there are some awkward details that could lead us to look for alternative designs.
We retrofitted BBwindow as the base of Ival_box. This is not quite right. The use of BBwindow
isn’t part of our basic notion of an Ival_box; it is an implementation detail. Deriving Ival_box
from BBwindow elevated an implementation detail to a first-level design decision. That can be
right. For example, using the environment defined by ‘‘Big Bucks Inc.’’ may be a key decision of
how our organization conducts its business. However, what if we also wanted to have implementations of our Ival_boxes for systems from ‘‘Imperial Bananas,’’ ‘‘Liberated Software,’’ and ‘‘Compiler Whizzes?’’ We would have to maintain four distinct versions of our program:
    class Ival_box : public BBwindow { /* ... */ };   // BB version
    class Ival_box : public CWwindow { /* ... */ };   // CW version
    class Ival_box : public IBwindow { /* ... */ };   // IB version
    class Ival_box : public LSwindow { /* ... */ };   // LS version
Having many versions could result in a version-control nightmare.
Another problem is that every derived class shares the basic data declared in Ival_box. That
data is, of course, an implementation detail that also crept into our Ival_box interface. From a
practical point of view, it is also the wrong data in many cases. For example, an Ival_slider
doesn’t need the value stored specifically. It can easily be calculated from the position of the slider
when someone executes get_value(). In general, keeping two related, but different, sets of data is
asking for trouble. Sooner or later someone will get them out of sync. Also, experience shows that
novice programmers tend to mess with protected data in ways that are unnecessary and that cause
maintenance problems. Data is better kept private so that writers of derived classes cannot mess
with it. Better still, data should be in the derived classes, where it can be defined to match
requirements exactly and cannot complicate the life of unrelated derived classes. In almost all
cases, a protected interface should contain only functions, types, and constants.
Deriving from BBwindow gives the benefit of making the facilities provided by BBwindow
available to users of Ival_box. Unfortunately, it also means that changes to class BBwindow may
force users to recompile or even rewrite their code to recover from such changes. In particular, the
way most C++ implementations work implies that a change in the size of a base class requires a
recompilation of all derived classes.
Finally, our program may have to run in a mixed environment in which windows of different
user-interface systems coexist. This could happen either because two systems somehow share a
screen or because our program needs to communicate with users on different systems. Having our
user-interface systems ‘‘wired in’’ as the one and only base of our one and only Ival_box interface
just isn’t flexible enough to handle those situations.
12.4.2 Abstract Classes [derived.interface]
So, let’s start again and build a new class hierarchy that solves the problems presented in the critique of the traditional hierarchy:
[1] The user-interface system should be an implementation detail that is hidden from users who
don’t want to know about it.
[2] The Ival_box class should contain no data.
[3] No recompilation of code using the Ival_box family of classes should be required after a
change of the user-interface system.
[4] Ival_boxes for different interface systems should be able to coexist in our program.
Several alternative approaches can be taken to achieve this. Here, I present one that maps cleanly
into the C++ language.
First, I specify class Ival_box as a pure interface:
    class Ival_box {
    public:
        virtual int get_value() = 0;
        virtual void set_value(int i) = 0;
        virtual void reset_value(int i) = 0;
        virtual void prompt() = 0;
        virtual bool was_changed() const = 0;
        virtual ~Ival_box() { }
    };
This is much cleaner than the original declaration of Ival_box. The data is gone and so are the simplistic implementations of the member functions. Gone, too, is the constructor, since there is no
data for it to initialize. Instead, I added a virtual destructor to ensure proper cleanup of the data that
will be defined in the derived classes.
The definition of Ival_slider might look like this:
    class Ival_slider : public Ival_box, protected BBwindow {
    public:
        Ival_slider(int,int);
        ~Ival_slider();

        int get_value();
        void set_value(int i);
        // ...
    protected:
        // functions overriding BBwindow virtual functions
        // e.g. BBwindow::draw(), BBwindow::mouse1hit()
    private:
        // data needed for slider
    };
The derived class Ival_slider inherits from an abstract class (Ival_box) that requires it to implement the base class’ pure virtual functions. It also inherits from BBwindow, which provides it with
the means of doing so. Since Ival_box provides the interface for the derived class, it is derived
using public. Since BBwindow is only an implementation aid, it is derived using protected
(§15.3.2). This implies that a programmer using Ival_slider cannot directly use facilities defined
by BBwindow. The interface provided by Ival_slider is the one inherited from Ival_box, plus what
Ival_slider explicitly declares. I used protected derivation instead of the more restrictive (and usually safer) private derivation to make BBwindow available to classes derived from Ival_slider.
Deriving directly from more than one class is usually called multiple inheritance (§15.2). Note
that Ival_slider must override functions from both Ival_box and BBwindow. Therefore, it must be
derived directly or indirectly from both. As shown in §12.4.1.1, deriving Ival_slider indirectly
from BBwindow by making BBwindow a base of Ival_box is possible, but doing so has undesirable
side effects. Similarly, making the ‘‘implementation class’’ BBwindow a member of Ival_box is
not a solution because a class cannot override virtual functions of its members (§24.3.4). Representing the window by a BBwindow* member in Ival_box leads to a completely different design
with a separate set of tradeoffs (§12.7[14], §25.7).
Interestingly, this declaration of Ival_slider allows application code to be written exactly as
before. All we have done is to restructure the implementation details in a more logical way.
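The combination of a public interface base and a protected implementation base can be sketched as follows. The BBwindow here is a hypothetical stand-in for a toolkit class (its paint() and draw() members are invented for the example), and show() and bump() are likewise assumptions. A C++11 or later compiler is assumed.

```cpp
#include <string>

class Ival_box {                     // pure interface, no data
public:
    virtual int get_value() = 0;
    virtual void set_value(int i) = 0;
    virtual ~Ival_box() { }
};

// Hypothetical stand-in for a toolkit's window class.
class BBwindow {
public:
    virtual ~BBwindow() { }
    std::string paint() { return "[" + draw() + "]"; }   // the toolkit calls back into draw()
protected:
    virtual std::string draw() = 0;
};

class Ival_slider : public Ival_box, protected BBwindow {
public:
    Ival_slider(int ll, int hh) : low(ll), high(hh), pos(ll) { }
    int get_value() override { return pos; }
    void set_value(int i) override { pos = i < low ? low : (high < i ? high : i); }
    std::string show() { return paint(); }   // BBwindow facilities are usable inside the class
protected:
    std::string draw() override { return "slider@" + std::to_string(pos); }
private:
    int low, high, pos;
};

// Application code sees only Ival_box.
int bump(Ival_box& b) { b.set_value(b.get_value() + 1); return b.get_value(); }

// Ival_slider s(0, 2);
// s.paint();   // would not compile: BBwindow is a protected base of Ival_slider
```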
Many classes require some form of cleanup for an object before it goes away. Since the abstract
class Ival_box cannot know if a derived class requires such cleanup, it must assume that it does
require some. We ensure proper cleanup by defining a virtual destructor Ival_box::~Ival_box()
in the base and overriding it suitably in derived classes. For example:
    void f(Ival_box* p)
    {
        // ...
        delete p;
    }
The delete operator explicitly destroys the object pointed to by p. We have no way of knowing
exactly to which class the object pointed to by p belongs, but thanks to Ival_box’s virtual
destructor, proper cleanup as (optionally) defined by that class’ destructor will be performed.
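The effect of the virtual destructor can be observed with a small sketch. Logging_slider and demo() are hypothetical names; the derived class records its own destruction in a log so that the call made through an Ival_box* can be checked.

```cpp
#include <string>
#include <vector>

class Ival_box {
public:
    virtual int get_value() = 0;
    virtual ~Ival_box() { }          // virtual: delete through Ival_box* runs the derived destructor
};

// Hypothetical derived class that records its own cleanup.
class Logging_slider : public Ival_box {
public:
    explicit Logging_slider(std::vector<std::string>& l) : log(l) { }
    ~Logging_slider() { log.push_back("~Logging_slider"); }
    int get_value() override { return 0; }
private:
    std::vector<std::string>& log;
};

void f(Ival_box* p)
{
    // ...
    delete p;                        // static type is Ival_box*; the dynamic type decides what runs
}

// Creates a Logging_slider, destroys it through f(), and returns what was logged.
std::vector<std::string> demo()
{
    std::vector<std::string> log;
    f(new Logging_slider(log));
    return log;
}
```

Had ~Ival_box() not been declared virtual, deleting the object through an Ival_box* would have undefined behavior.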
The Ival_box hierarchy can now be defined like this:
    class Ival_box { /* ... */ };
    class Ival_slider : public Ival_box, protected BBwindow { /* ... */ };
    class Ival_dial : public Ival_box, protected BBwindow { /* ... */ };
    class Flashing_ival_slider : public Ival_slider { /* ... */ };
    class Popup_ival_slider : public Ival_slider { /* ... */ };
or graphically using obvious abbreviations:
    BBwindow      Ival_box      BBwindow
        ^         ^      ^         ^
         :       /        \       :
        Ival_slider      Ival_dial
         ^       ^
        /         \
    Popup_slider   Flashing_slider
I used a dashed line to represent protected inheritance. As far as general users are concerned, doing
that is simply an implementation detail.
12.4.3 Alternative Implementations [derived.alt]
This design is cleaner and more easily maintainable than the traditional one – and no less efficient.
However, it still fails to solve the version control problem:
    class Ival_box { /* ... */ };   // common
    class Ival_slider : public Ival_box, protected BBwindow { /* ... */ };   // for BB
    class Ival_slider : public Ival_box, protected CWwindow { /* ... */ };   // for CW
    // ...
In addition, there is no way of having an Ival_slider for BBwindows coexist with an Ival_slider
for CWwindows, even if the two user-interface systems could themselves coexist.
The obvious solution is to define several different Ival_slider classes with separate names:
    class Ival_box { /* ... */ };
    class BB_ival_slider : public Ival_box, protected BBwindow { /* ... */ };
    class CW_ival_slider : public Ival_box, protected CWwindow { /* ... */ };
    // ...
or graphically:
    BBwindow       Ival_box       CWwindow
        ^          ^      ^          ^
         :        /        \        :
       BB_ival_slider    CW_ival_slider
To further insulate our application-oriented Ival_box classes from implementation details, we can
derive an abstract Ival_slider class from Ival_box and then derive the system-specific Ival_sliders
from that:
    class Ival_box { /* ... */ };
    class Ival_slider : public Ival_box { /* ... */ };
    class BB_ival_slider : public Ival_slider, protected BBwindow { /* ... */ };
    class CW_ival_slider : public Ival_slider, protected CWwindow { /* ... */ };
    // ...
or graphically:
                   Ival_box
                      ^
                      |
    BBwindow      Ival_slider      CWwindow
        ^          ^       ^          ^
         :        /         \        :
       BB_ival_slider     CW_ival_slider
Usually, we can do better yet by utilizing more-specific classes in the implementation hierarchy.
For example, if the ‘‘Big Bucks Inc.’’ system has a slider class, we can derive our Ival_slider
directly from the BBslider:
    class BB_ival_slider : public Ival_slider, protected BBslider { /* ... */ };
    class CW_ival_slider : public Ival_slider, protected CWslider { /* ... */ };
or graphically:
    BBwindow       Ival_box       CWwindow
        ^             ^              ^
        |             |              |
    BBslider      Ival_slider     CWslider
        ^          ^       ^         ^
         :        /         \       :
       BB_ival_slider     CW_ival_slider
This improvement becomes significant where – as is not uncommon – our abstractions are not too
different from the ones provided by the system used for implementation. In that case, programming is reduced to mapping between similar concepts. Derivation from general base classes, such
as BBwindow, is then done only rarely.
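A sketch of two system-specific sliders coexisting behind one abstract Ival_slider is given below. BBslider and CWslider are hypothetical stand-ins for toolkit classes (their bb_pos() and cw_position() members and the fixed values they return are invented), and toolkit() and total() exist only to make the result observable. A C++11 or later compiler is assumed.

```cpp
#include <string>

class Ival_box {
public:
    virtual int get_value() = 0;
    virtual ~Ival_box() { }
};

class Ival_slider : public Ival_box {           // abstract, application-oriented
public:
    virtual std::string toolkit() const = 0;    // hypothetical query, for demonstration only
};

// Hypothetical toolkit classes; note that their interfaces differ.
class BBslider { protected: int bb_pos() const { return 3; } };
class CWslider { protected: int cw_position() const { return 7; } };

class BB_ival_slider : public Ival_slider, protected BBslider {
public:
    int get_value() override { return bb_pos(); }          // maps onto the BB concept
    std::string toolkit() const override { return "BB"; }
};

class CW_ival_slider : public Ival_slider, protected CWslider {
public:
    int get_value() override { return cw_position(); }     // maps onto the CW concept
    std::string toolkit() const override { return "CW"; }
};

// Both implementations can be used together through the same interface.
int total(Ival_slider& a, Ival_slider& b) { return a.get_value() + b.get_value(); }
```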
The complete hierarchy will consist of our original application-oriented conceptual hierarchy of
interfaces expressed as derived classes:
    class Ival_box { /* ... */ };
    class Ival_slider : public Ival_box { /* ... */ };
    class Ival_dial : public Ival_box { /* ... */ };
    class Flashing_ival_slider : public Ival_slider { /* ... */ };
    class Popup_ival_slider : public Ival_slider { /* ... */ };
followed by the implementations of this hierarchy for various graphical user-interface systems,
expressed as derived classes:
    class BB_ival_slider : public Ival_slider, protected BBslider { /* ... */ };
    class BB_flashing_ival_slider : public Flashing_ival_slider,
                                    protected BBwindow_with_bells_and_whistles { /* ... */ };
    class BB_popup_ival_slider : public Popup_ival_slider, protected BBslider { /* ... */ };
    class CW_ival_slider : public Ival_slider, protected CWslider { /* ... */ };
    // ...
Using obvious abbreviations, this hierarchy can be represented graphically like this:
    [Figure: the Ival_box hierarchy in the center, surrounded by its implementation classes]
    Ival_slider, Ival_dial   derived from   Ival_box
    ipopup, iflash           derived from   Ival_slider
    BBislider                derived from   Ival_slider and BBslider
    BBipop                   derived from   ipopup and BBslider
    BBifl                    derived from   iflash and BBb&w
    CWislider                derived from   Ival_slider and CWsl
    CWipop                   derived from   ipopup and CWsl
    CWifl                    derived from   iflash and CWsl
The original Ival_box class hierarchy appears unchanged surrounded by implementation classes.
12.4.3.1 Critique [derived.critique2]
The abstract class design is flexible and almost as simple to deal with as the equivalent design that
relies on a common base defining the user-interface system. In the latter design, the windows class
is the root of a tree. In the former, the original application class hierarchy appears unchanged as the
root of classes that supply its implementations. From the application’s point of view, these designs
are equivalent in the strong sense that almost all code works unchanged and in the same way in the
two cases. In either case, you can look at the Ival_box family of classes without bothering with the
window-related implementation details most of the time. For example, we would not need to
rewrite interact() from §12.4.1 if we switched from the one class hierarchy to the other.
In either case, the implementation of each Ival_box class must be rewritten when the public
interface of the user-interface system changes. However, in the abstract class design, almost all
user code is protected against changes to the implementation hierarchy and requires no recompilation after such a change. This is especially important when the supplier of the implementation hierarchy issues a new ‘‘almost compatible’’ release. In addition, users of the abstract class hierarchy
are in less danger of being locked into a proprietary implementation than are users of a classical
hierarchy. Users of the Ival_box abstract class application hierarchy cannot accidentally use facilities from the implementation because only facilities explicitly specified in the Ival_box hierarchy
are accessible; nothing is implicitly inherited from an implementation-specific base class.
12.4.4 Localizing Object Creation [derived.local]
Most of an application can be written using the Ival_box interface. Further, should the derived
interfaces evolve to provide more facilities than plain Ival_box, then most of an application can be
written using the Ival_box, Ival_slider, etc., interfaces. However, the creation of objects must be
done using implementation-specific names such as CW_ival_dial and BB_flashing_ival_slider.
We would like to minimize the number of places where such specific names occur, and object creation is hard to localize unless it is done systematically.
As usual, the solution is to introduce an indirection. This can be done in many ways. A simple
one is to introduce an abstract class to represent the set of creation operations:
    class Ival_maker {
    public:
        virtual Ival_dial* dial(int, int) = 0;                    // make dial
        virtual Popup_ival_slider* popup_slider(int, int) = 0;    // make popup slider
        // ...
    };
For each interface from the Ival_box family of classes that a user should know about, class
Ival_maker provides a function that makes an object. Such a class is sometimes called a factory,
and its functions are (somewhat misleadingly) sometimes called virtual constructors (§15.6.2).
We now represent each user-interface system by a class derived from Ival_maker:
    class BB_maker : public Ival_maker {   // make BB versions
    public:
        Ival_dial* dial(int, int);
        Popup_ival_slider* popup_slider(int, int);
        // ...
    };

    class LS_maker : public Ival_maker {   // make LS versions
    public:
        Ival_dial* dial(int, int);
        Popup_ival_slider* popup_slider(int, int);
        // ...
    };
Each function creates an object of the desired interface and implementation type. For example:
    Ival_dial* BB_maker::dial(int a, int b)
    {
        return new BB_ival_dial(a,b);
    }

    Ival_dial* LS_maker::dial(int a, int b)
    {
        return new LS_ival_dial(a,b);
    }
Given a pointer to an Ival_maker, a user can now create objects without having to know exactly
which user-interface system is used. For example:
vvooiidd uusseerr(IIvvaall__m
maakkeerr* ppiim
m)
{
IIvvaall__bbooxx* ppbb = ppiim
m->ddiiaall(00,9999);
// ...
}
// create appropriate dial
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
324
Derived Classes
Chapter 12
    BB_maker BB_impl;    // for BB users
    LS_maker LS_impl;    // for LS users

    void driver()
    {
        user(&BB_impl);    // use BB
        user(&LS_impl);    // use LS
    }
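The factory idea can be condensed into a self-contained sketch. The classes below are simplified stand-ins rather than the Ival_box classes of §12.4: the name() member, the empty constructor bodies, and the string-returning user() are assumptions made purely so that the effect of the virtual calls can be observed.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the Ival_dial interface (name() exists only for this sketch).
class Ival_dial {
public:
    virtual std::string name() const = 0;
    virtual ~Ival_dial() {}
};

class BB_ival_dial : public Ival_dial {
public:
    BB_ival_dial(int, int) {}
    std::string name() const { return "BB dial"; }
};

class LS_ival_dial : public Ival_dial {
public:
    LS_ival_dial(int, int) {}
    std::string name() const { return "LS dial"; }
};

// The factory interface: one "virtual constructor" per interface class.
class Ival_maker {
public:
    virtual Ival_dial* dial(int, int) = 0;    // make dial
    virtual ~Ival_maker() {}
};

class BB_maker : public Ival_maker {          // make BB versions
public:
    Ival_dial* dial(int a, int b) { return new BB_ival_dial(a, b); }
};

class LS_maker : public Ival_maker {          // make LS versions
public:
    Ival_dial* dial(int a, int b) { return new LS_ival_dial(a, b); }
};

// User code sees only the abstract interfaces.
std::string user(Ival_maker* pim)
{
    Ival_dial* pd = pim->dial(0, 99);    // create appropriate dial
    std::string s = pd->name();
    delete pd;                           // safe: Ival_dial has a virtual destructor
    return s;
}

void driver()
{
    BB_maker BB_impl;
    LS_maker LS_impl;
    assert(user(&BB_impl) == "BB dial");    // use BB
    assert(user(&LS_impl) == "LS dial");    // use LS
}
```

Note that user() never mentions BB or LS; supporting a further user-interface system would mean adding one more maker class and leaving user() untouched.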
12.5 Class Hierarchies and Abstract Classes [derived.hier]
An abstract class is an interface. A class hierarchy is a means of building classes incrementally.
Naturally, every class provides an interface to users and some abstract classes provide significant
functionality to build from, but ‘‘interface’’ and ‘‘building block’’ are the primary roles of abstract
classes and class hierarchies.
A classical hierarchy is a hierarchy in which the individual classes both provide useful functionality for users and act as building blocks for the implementation of more advanced or specialized
classes. Such hierarchies are ideal for supporting programming by incremental refinement. They
provide the maximum support for the implementation of new classes as long as the new class
relates strongly to the existing hierarchy.
Classical hierarchies do tend to couple implementation concerns rather strongly with the interfaces provided to users. Abstract classes can help here. Hierarchies of abstract classes provide a
clean and powerful way of expressing concepts without encumbering them with implementation
concerns or significant run-time overheads. After all, a virtual function call is cheap and independent of the kind of abstraction barrier it crosses. It costs no more to call a member of an abstract
class than to call any other virtual function.
The logical conclusion of this line of thought is a system represented to users as a hierarchy of
abstract classes and implemented by a classical hierarchy.
12.6 Advice [derived.advice]
[1] Avoid type fields; §12.2.5.
[2] Use pointers and references to avoid slicing; §12.2.3.
[3] Use abstract classes to focus design on the provision of clean interfaces; §12.3.
[4] Use abstract classes to minimize interfaces; §12.4.2.
[5] Use abstract classes to keep implementation details out of interfaces; §12.4.2.
[6] Use virtual functions to allow new implementations to be added without affecting user code;
    §12.4.1.
[7] Use abstract classes to minimize recompilation of user code; §12.4.2.
[8] Use abstract classes to allow alternative implementations to coexist; §12.4.3.
[9] A class with a virtual function should have a virtual destructor; §12.4.2.
[10] An abstract class typically doesn’t need a constructor; §12.4.2.
[11] Keep the representations of distinct concepts distinct; §12.4.1.1.
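Items [2] and [9] can be illustrated by a small sketch. The names Base, Derived, by_value(), and by_reference() are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual std::string iam() const { return "Base"; }
    virtual ~Base() {}    // [9]: a class with a virtual function gets a virtual destructor
};

class Derived : public Base {
public:
    std::string iam() const { return "Derived"; }
};

// Passing by value copies only the Base part of the argument (slicing).
std::string by_value(Base b) { return b.iam(); }

// Passing by reference leaves the object intact, so the virtual call reaches Derived.
std::string by_reference(const Base& b) { return b.iam(); }

void demo()
{
    Derived d;
    assert(by_value(d) == "Base");           // sliced: the Derived part is lost
    assert(by_reference(d) == "Derived");    // [2]: references avoid slicing

    Base* p = new Derived;
    assert(p->iam() == "Derived");
    delete p;                                // runs Derived's destructor, then Base's
}
```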
12.7 Exercises [derived.exercises]
1. (∗1) Define
       class base {
       public:
           virtual void iam() { cout << "base\n"; }
       };
   Derive two classes from base, and for each define iam() to write out the name of the class.
   Create objects of these classes and call iam() for them. Assign pointers to objects of the
   derived classes to base* pointers and call iam() through those pointers.
2. (∗3.5) Implement a simple graphics system using whatever graphics facilities are available on
   your system (if you don’t have a good graphics system or have no experience with one, you
   might consider a simple ‘‘huge bit ASCII implementation’’ where a point is a character position
   and you write by placing a suitable character, such as * in a position): Window(n,m) creates
   an area of size n times m on the screen. Points on the screen are addressed using (x,y)
   coordinates (Cartesian). A Window w has a current position w.current(). Initially, current is
   Point(0,0). The current position can be set by w.current(p) where p is a Point. A Point is
   specified by a coordinate pair: Point(x,y). A Line is specified by a pair of Points:
   Line(w.current(),p2); class Shape is the common interface to Dots, Lines, Rectangles,
   Circles, etc. A Point is not a Shape. A Dot, Dot(p) can be used to represent a Point p on the
   screen. A Shape is invisible unless draw()n. For example: w.draw(Circle(w.current(),10)).
   Every Shape has 9 contact points: e (east), w (west), n (north), s (south), ne, nw, se, sw,
   and c (center). For example, Line(x.c(),y.nw()) creates a line from x’s center to y’s top left
   corner. After draw()ing a Shape the current position is the Shape’s se(). A Rectangle is
   specified by its bottom left and top right corner: Rectangle(w.current(),Point(10,10)). As a
   simple test, display a simple ‘‘child’s drawing of a house’’ with a roof, two windows, and a door.
3. (∗2) Important aspects of a Shape appear on the screen as a set of line segments. Implement
   operations to vary the appearance of these segments: s.thickness(n) sets the line thickness to
   0, 1, 2, or 3, where 2 is the default and 0 means invisible. In addition, a line segment can be
   solid, dashed, or dotted. This is set by the function Shape::outline().
4. (∗2.5) Provide a function Line::arrowhead() that adds arrow heads to an end of a line. A
   line has two ends and an arrowhead can point in two directions relative to the line, so the
   argument or arguments to arrowhead() must be able to express at least four alternatives.
5. (∗3.5) Make sure that points and line segments that fall outside the Window do not appear on
   the screen. This is often called ‘‘clipping.’’ As an exercise only, do not rely on the
   implementation graphics system for this.
6. (∗2.5) Add a Text type to the graphics system. A Text is a rectangular Shape displaying
   characters. By default, a character takes up one coordinate unit along each coordinate axis.
7. (∗2) Define a function that draws a line connecting two shapes by finding the two closest
   ‘‘contact points’’ and connecting them.
8. (∗3) Add a notion of color to the simple graphics system. Three things can be colored: the
   background, the inside of a closed shape, and the outlines of shapes.
9. (∗2) Consider:
       class Char_vec {
           int sz;
           char element[1];
       public:
           static Char_vec* new_char_vec(int s);
           char& operator[](int i) { return element[i]; }
           // ...
       };
   Define new_char_vec() to allocate contiguous memory for a Char_vec object so that the
   elements can be indexed through element as shown. Under what circumstances does this trick
   cause serious problems?
10. (∗2.5) Given classes Circle, Square, and Triangle derived from a class Shape, define a
    function intersect() that takes two Shape* arguments and calls suitable functions to determine if
    the two shapes overlap. It will be necessary to add suitable (virtual) functions to the classes to
    achieve this. Don’t bother to write the code that checks for overlap; just make sure the right
    functions are called. This is commonly referred to as double dispatch or a multi-method.
11. (∗5) Design and implement a library for writing event-driven simulations. Hint: <task.h>.
    However, that is an old program, and you can do better. There should be a class task. An
    object of class task should be able to save its state and to have that state restored (you might
    define task::save() and task::restore()) so that it can operate as a coroutine. Specific tasks
    can be defined as objects of classes derived from class task. The program to be executed by a
    task might be specified as a virtual function. It should be possible to pass arguments to a new
    task as arguments to its constructor(s). There should be a scheduler implementing a concept of
    virtual time. Provide a function task::delay(long) that ‘‘consumes’’ virtual time. Whether
    the scheduler is part of class task or separate will be one of the major design decisions. The
    tasks will need to communicate. Design a class queue for that. Devise a way for a task to wait
    for input from several queues. Handle run-time errors in a uniform way. How would you
    debug programs written using such a library?
12. (∗2) Define interfaces for Warrior, Monster, and Object (that is a thing you can pick up, drop,
    use, etc.) classes for an adventure-style game.
13. (∗1.5) Why is there both a Point and a Dot class in §12.7[2]? Under which circumstances
    would it be a good idea to augment the Shape classes with concrete versions of key classes such
    as Line?
14. (∗3) Outline a different implementation strategy for the Ival_box example (§12.4) based on the
    idea that every class seen by an application is an interface containing a single pointer to the
    implementation. Thus, each "interface class" will be a handle to an "implementation class," and
    there will be an interface hierarchy and an implementation hierarchy. Write code fragments that
    are detailed enough to illustrate possible problems with type conversion. Consider ease of use,
    ease of programming, ease of reusing implementations and interfaces when adding a new
    concept to the hierarchy, ease of making changes to interfaces and implementations, and need for
    recompilation after change in the implementation.
13
Templates
Your quote here.
– B. Stroustrup
Templates — a string template — instantiation — template parameters — type checking
— function templates — template argument deduction — specifying template arguments
— function template overloading — policy as template arguments — default template
arguments — specialization — derivation and templates — member templates — conversions — source code organization — advice — exercises.
13.1 Introduction [temp.intro]
Independent concepts should be independently represented and should be combined only when
needed. Where this principle is violated, you either bundle unrelated concepts together or create
unnecessary dependencies. Either way, you get a less flexible set of components out of which to
compose systems. Templates provide a simple way to represent a wide range of general concepts
and simple ways to combine them. The resulting classes and functions can match hand-written,
more-specialized code in run-time and space efficiency.
Templates provide direct support for generic programming (§2.7), that is, programming using
types as parameters. The C++ template mechanism allows a type to be a parameter in the definition
of a class or a function. A template depends only on the properties that it actually uses from its
parameter types and does not require different types used as arguments to be explicitly related. In
particular, the argument types used for a template need not be from a single inheritance hierarchy.
Here, templates are introduced with the primary focus on techniques needed for the design,
implementation, and use of the standard library. The standard library requires a greater degree of
generality, flexibility, and efficiency than does most software. Consequently, techniques that can
be used in the design and implementation of the standard library are effective and efficient in the
design of solutions to a wide variety of problems. These techniques enable an implementer to hide
sophisticated implementations behind simple interfaces and to expose complexity to the user only
when the user has a specific need for it. For example, sort(v) can be the interface to a variety of
sort algorithms for elements of a variety of types held in a variety of containers. The sort function
that is most appropriate for the particular v will be automatically chosen.
Every major standard library abstraction is represented as a template (for example, string,
ostream, complex, list, and map) and so are the key operations (for example, string compare, the
output operator <<, complex addition, getting the next element from a list, and sort()). This
makes the library chapters (Part 3) of this book a rich source of examples of templates and
programming techniques relying on them. Consequently, this chapter concentrates on smaller
examples illustrating technical aspects of templates and fundamental techniques for using them:
§13.2: The basic mechanisms for defining and using class templates
§13.3: Function templates, function overloading, and type deduction
§13.4: Template parameters used to specify policies for generic algorithms
§13.5: Multiple definitions providing alternative implementations for a template
§13.6: Derivation and templates (run-time and compile-time polymorphism)
§13.7: Source code organization
Templates were introduced in §2.7.1 and §3.8. Detailed rules for template name resolution, template syntax, etc., can be found in §C.13.
13.2 A Simple String Template [temp.string]
Consider a string of characters. A string is a class that holds characters and provides operations
such as subscripting, concatenation, and comparison that we usually associate with the notion of a
‘‘string.’’ We would like to provide that behavior for many different kinds of characters. For
example, strings of signed characters, of unsigned characters, of Chinese characters, of Greek characters, etc., are useful in various contexts. Thus, we want to represent the notion of ‘‘string’’ with
minimal dependence on a specific kind of character. The definition of a string relies on the fact that
a character can be copied, and little else. Thus, we can make a more general string type by taking
the string of char from §11.12 and making the character type a parameter:
    template<class C> class String {
        struct Srep;
        Srep *rep;
    public:
        String();
        String(const C*);
        String(const String&);

        C read(int i) const;
        // ...
    };
The template <class C> prefix specifies that a template is being declared and that a type argument
C will be used in the declaration. After its introduction, C is used exactly like other type names.
The scope of C extends to the end of the declaration prefixed by template <class C>. Note that
template<class C> says that C is a type name; it need not be the name of a class.
The name of a class template followed by a type bracketed by < > is the name of a class (as
defined by the template) and can be used exactly like other class names. For example:
    String<char> cs;
    String<unsigned char> us;
    String<wchar_t> ws;

    class Jchar {
        // Japanese character
    };

    String<Jchar> js;
Except for the special syntax of its name, String<char> works exactly as if it had been defined
using the definition of class String in §11.12. Making String a template allows us to provide the
facilities we had for String of char for Strings of any kind of character. For example, if we use the
standard library map and the String template, the word-counting example from §11.8 becomes:
    int main()    // count the occurrences of each word on input
    {
        String<char> buf;
        map<String<char>,int> m;
        while (cin>>buf) m[buf]++;
        // write out result
    }
The version for our Japanese-character type Jchar would be:
    int main()    // count the occurrences of each word on input
    {
        String<Jchar> buf;
        map<String<Jchar>,int> m;
        while (cin>>buf) m[buf]++;
        // write out result
    }
The standard library provides the template class basic_string that is similar to the templatized
String (§11.12, §20.3). In the standard library, string is defined as a synonym for
basic_string<char>:
    typedef basic_string<char> string;
This allows us to write the word-counting program like this:
    int main()    // count the occurrences of each word on input
    {
        string buf;
        map<string,int> m;
        while (cin>>buf) m[buf]++;
        // write out result
    }
In general, typedefs are useful for shortening the long names of classes generated from templates.
Also, we often prefer not to know the details of how a type is defined, and a typedef allows us to
hide the fact that a type is generated from a template.
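As a minimal sketch of the word-counting program in a testable form, the function below reads from any input stream instead of cin. The names Word_count and count_words() are hypothetical, introduced only for this illustration; the typedef shows how a long template-generated name can be hidden.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// A typedef hides the fact that the type is generated from templates.
typedef std::map<std::string, int> Word_count;

// Count the occurrences of each whitespace-separated word read from 'in'.
Word_count count_words(std::istream& in)
{
    std::string buf;
    Word_count m;
    while (in >> buf) m[buf]++;
    return m;
}

void demo()
{
    std::istringstream input("to be or not to be");
    Word_count m = count_words(input);
    assert(m["to"] == 2);
    assert(m["or"] == 1);
    assert(m.size() == 4);    // to, be, or, not
}
```

Calling count_words(std::cin) would give the behavior of the main() shown in the text.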
13.2.1 Defining a Template [temp.string.details]
A class generated from a class template is a perfectly ordinary class. Thus, use of a template does
not imply any run-time mechanisms beyond what is used for an equivalent ‘‘hand-written’’ class.
Nor does it necessarily imply any reduction in the amount of code generated.
It is usually a good idea to debug a particular class, such as String, before turning it into a
template such as String<C>. By doing so, we handle many design problems and most of the code
errors in the context of a concrete example. This kind of debugging is familiar to all programmers,
and most people cope better with a concrete example than with an abstract concept. Later, we can
deal with any problems that might arise from generalization without being distracted by more conventional errors. Similarly, when trying to understand a template, it is often useful to imagine its
behavior for a particular type argument such as char before trying to comprehend the template in
its full generality.
Members of a template class are declared and defined exactly as they would have been for a
non-template class. A template member need not be defined within the template class itself. In
that case, its definition must be provided somewhere else, as for non-template class members
(§C.13.7). Members of a template class are themselves templates parameterized by the parameters
of their template class. When such a member is defined outside its class, it must explicitly be
declared a template. For example:
    template<class C> struct String<C>::Srep {
        C* s;      // pointer to elements
        int sz;    // number of elements
        int n;     // reference count
        // ...
    };

    template<class C> C String<C>::read(int i) const { return rep->s[i]; }

    template<class C> String<C>::String()
    {
        rep = new Srep(0,C());
    }
Template parameters, such as C, are parameters rather than names of types defined externally to the
template. However, that doesn’t affect the way we write the template code using them. Within the
scope of String<C>, qualification with <C> is redundant for the name of the template itself, so
String<C>::String is the name for the constructor. If you prefer, you can be explicit:
    template<class C> String<C>::String<C>()
    {
        rep = new Srep(0,C());
    }
Just as there can be only one function defining a class member function in a program, there can be
only one function template defining a class template member function in a program. However,
overloading is a possibility for functions only (§13.3.2), while specialization (§13.5) enables us to
provide alternative implementations for a template.
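The rules for defining members outside their class template can be tried out on a reduced example. Mini_string below is a hypothetical, much-simplified relative of String<C>: it takes an explicit length, does not share or reference-count its representation, and does not support copying. These are all assumptions made to keep the sketch short.

```cpp
#include <cassert>

template<class C> class Mini_string {
    struct Rep;                 // nested type, defined outside the class below
    Rep* rep;
    Mini_string(const Mini_string&);               // copying not supported in this sketch
    Mini_string& operator=(const Mini_string&);
public:
    Mini_string(const C* p, int n);
    ~Mini_string();
    C read(int i) const;
    int size() const;
};

// Each out-of-class definition is itself explicitly declared a template.
template<class C> struct Mini_string<C>::Rep {
    C* s;      // pointer to elements
    int sz;    // number of elements
    Rep(const C* p, int n) : s(new C[n]), sz(n)
    {
        for (int i = 0; i < n; ++i) s[i] = p[i];
    }
    ~Rep() { delete[] s; }
};

template<class C> Mini_string<C>::Mini_string(const C* p, int n) : rep(new Rep(p, n)) {}
template<class C> Mini_string<C>::~Mini_string() { delete rep; }
template<class C> C Mini_string<C>::read(int i) const { return rep->s[i]; }
template<class C> int Mini_string<C>::size() const { return rep->sz; }

void demo()
{
    const char text[] = { 'a', 'b', 'c' };
    Mini_string<char> cs(text, 3);
    assert(cs.size() == 3 && cs.read(1) == 'b');

    const int digits[] = { 4, 2 };
    Mini_string<int> is(digits, 2);    // C is a type name; it need not be a character type
    assert(is.read(0) == 4);
}
```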
It is not possible to overload a class template name, so if a class template is declared in a scope,
no other entity can be declared there with the same name (see also §13.5). For example:
    template<class T> class String { /* ... */ };
    class String { /* ... */ };    // error: double definition
A type used as a template argument must provide the interface expected by the template. For
example, a type used as an argument to String must provide the usual copy operations (§10.4.4.1,
§20.2.1). Note that there is no requirement that different arguments for the same template parameter should be related by inheritance.
13.2.2 Template Instantiation [temp.string.inst]
The process of generating a class declaration from a template class and a template argument is often
called template instantiation (§C.13.7). Similarly, a function is generated (‘‘instantiated’’) from a
template function plus a template argument. A version of a template for a particular template argument is called a specialization.
In general, it is the implementation’s job – not the programmer’s – to ensure that versions of a
template function are generated for each set of template arguments used (§C.13.7). For example:
    String<char> cs;

    void f()
    {
        String<Jchar> js;

        cs = "It's the implementation's job to figure out what code needs to be generated";
    }
For this, the implementation generates declarations for String<char> and String<Jchar>, for their
corresponding Srep types, for their destructors and default constructors, and for the assignment
String<char>::operator=(char*). Other member functions are not used and should not be
generated. The generated classes are perfectly ordinary classes that obey all the usual rules for classes.
Similarly, generated functions are ordinary functions that obey all the usual rules for functions.
Obviously, templates provide a powerful way of generating code from relatively short definitions. Consequently, a certain amount of caution is in order to avoid flooding memory with almost
identical function definitions (§13.5).
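That unused member functions are not generated can be observed directly. In the sketch below, Holder and No_less are hypothetical names: Holder<T>::less_than() requires < for T, yet Holder<No_less> is usable as long as less_than() is never called for it.

```cpp
#include <cassert>

struct No_less { int v; };    // a type that provides no operator<

template<class T> class Holder {
    T val;
public:
    explicit Holder(const T& v) : val(v) {}
    const T& get() const { return val; }
    bool less_than(const Holder& x) const { return val < x.val; }    // requires < for T
};

void demo()
{
    Holder<int> hi(1), hj(2);
    assert(hi.less_than(hj));    // Holder<int>::less_than() is generated because it is used

    No_less n = { 7 };
    Holder<No_less> hn(n);       // fine: less_than() is never used for No_less
    assert(hn.get().v == 7);
    // hn.less_than(hn);         // would be an error: no < for No_less
}
```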
13.2.3 Template Parameters [temp.param]
A template can take type parameters, parameters of ordinary types such as ints, and template
parameters (§C.13.3). Naturally, a template can take several parameters. For example:
    template<class T, T def_val> class Cont { /* ... */ };
As shown, a template parameter can be used in the definition of subsequent template parameters.
Integer arguments come in handy for supplying sizes and limits. For example:
    template<class T, int i> class Buffer {
        T v[i];
        int sz;
    public:
        Buffer() : sz(i) {}
        // ...
    };

    Buffer<char,127> cbuf;
    Buffer<Record,8> rbuf;
Simple and constrained containers such as Buffer can be important where run-time efficiency and
compactness are paramount (thus preventing the use of a more general string or vector). Passing a
size as a template argument allows Buffer’s implementer to avoid free store use. Another example
is the Range type in §25.6.1.
A template argument can be a constant expression (§C.5), the address of an object or function
with external linkage (§9.2), or a non-overloaded pointer to member (§15.5). A pointer used as a
template argument must be of the form &of, where of is the name of an object or a function, or of
the form f, where f is the name of a function. A pointer to member must be of the form &X::of,
where of is the name of a member. In particular, a string literal is not acceptable as a template
argument.
An integer template argument must be a constant:
    void f(int i)
    {
        Buffer<int,i> bx;    // error: constant expression expected
    }
Conversely, a non-type template parameter is a constant within the template so that an attempt to
change the value of a parameter is an error.
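A small sketch of Buffer with an integer template argument follows. The members size() and operator[] are assumptions standing in for the elided part of the class; they are not taken from the text.

```cpp
#include <cassert>

template<class T, int i> class Buffer {
    T v[i];      // storage is part of the object itself: no free store use
    int sz;
public:
    Buffer() : sz(i) {}
    int size() const { return sz; }             // assumed member
    T& operator[](int n) { return v[n]; }       // assumed member, unchecked
};

void demo()
{
    Buffer<char,127> cbuf;
    assert(cbuf.size() == 127);
    assert(sizeof(cbuf) >= 127);    // the 127 elements live inside the object

    const int n = 8;                // a constant expression is acceptable as the argument
    Buffer<int,n> ibuf;
    ibuf[0] = 42;
    assert(ibuf.size() == 8 && ibuf[0] == 42);
}
```

Because i is a constant inside Buffer, the array bound T v[i] is legal and an assignment such as i = 0 within a member would be rejected by the compiler.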
13.2.4 Type Equivalence [temp.equiv]
Given a template, we can generate types by supplying template arguments. For example:
    String<char> s1;
    String<unsigned char> s2;
    String<int> s3;

    typedef unsigned char Uchar;
    String<Uchar> s4;
    String<char> s5;

    Buffer<String<char>,10> b1;
    Buffer<char,10> b2;
    Buffer<char,20-10> b3;
When using the same set of template arguments for a template, we always refer to the same
generated type. However, what does ‘‘the same’’ mean in this context? As usual, typedefs do not
introduce new types, so String<Uchar> is the same type as String<unsigned char>. Conversely,
because char and unsigned char are different types (§4.3), String<char> and String<unsigned
char> are different types.
The compiler can evaluate constant expressions (§C.5), so Buffer<char,20-10> is recognized
to be the same type as Buffer<char,10>.
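These equivalences can be checked mechanically. The sketch below assumes a C++11 or later compiler, because std::is_same from <type_traits> postdates this text; the empty String and Buffer templates are placeholders, since only the identity of the generated types matters here.

```cpp
#include <cassert>
#include <type_traits>

template<class C> class String { /* ... */ };
template<class T, int i> class Buffer { /* ... */ };

typedef unsigned char Uchar;

void demo()
{
    // A typedef does not introduce a new type.
    assert((std::is_same<String<Uchar>, String<unsigned char> >::value));

    // char and unsigned char are different types, so the generated classes differ.
    assert(!(std::is_same<String<char>, String<unsigned char> >::value));

    // 20-10 is evaluated at compile time.
    assert((std::is_same<Buffer<char,20-10>, Buffer<char,10> >::value));
}
```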
13.2.5 Type Checking [temp.check]
A template is defined and then later used in combination with a set of template arguments. When
the template is defined, the definition is checked for syntax errors and possibly also for other errors
that can be detected in isolation from a particular set of template arguments. For example:
    template<class T> class List {
        struct Link {
            Link* pre;
            Link* suc;
            T val;
            Link(Link* p, Link* s,const T& v) :pre(p), suc(s), val(v) { }
        }    // syntax error: missing semicolon
        Link* head;
    public:
        List() : head(7) { }    // error: pointer initialized with int
        List(const T& t) : head(new Link(0,o,t)) { }    // error: undefined identifier ‘o’
        // ...
        void print_all() { for (Link* p = head; p; p=p->suc) cout << p->val << '\n'; }
    };
A compiler can catch simple semantic errors at the point of definition or later at the point of use.
Users generally prefer early detection, but not all ‘‘simple’’ errors are easy to detect. Here, I made
three ‘‘mistakes.’’ Independently of what the template parameter is, a pointer Link* cannot be
initialized by the integer 7. Similarly, the identifier o (a mistyped 0, of course) cannot be an
argument to List<T>::Link’s constructor because there is no such name in scope.
A name used in a template definition must either be in scope or in some reasonably obvious
way depend on a template parameter (§C.13.8.1). The most common and obvious way of
depending on a template parameter T is to use a member of a T or to take an argument of type T. In
List<T>::print_all(), cout<<p->val is a slightly more subtle example.
Errors that relate to the use of template parameters cannot be detected until the template is used.
For example:
    class Rec { /* ... */ };

    void f(List<int>& li, List<Rec>& lr)
    {
        li.print_all();
        lr.print_all();
    }
The li.print_all() checks out fine, but lr.print_all() gives a type error because there is no <<
output operator defined for Rec. The earliest that errors relating to a template parameter can be
detected is at the first point of use of the template for a particular template argument. That point is
usually called the first point of instantiation, or simply the point of instantiation (see §C.13.7). The
implementation is allowed to postpone this checking until the program is linked. If we had only a
declaration of print_all() available in this translation unit, rather than its definition, the
implementation might have had to delay type checking (see §13.7). Independently of when checking is done,
the same set of rules is checked. Again, users prefer early checking. It is possible to express constraints on template arguments in terms of member functions (see §13.9[16]).
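The point-of-use check can be seen in a corrected, compilable variant of the List example. This is a sketch under stated assumptions: the list is singly linked, push_front() is a hypothetical helper, print_all() takes the output stream as an argument so that the result can be inspected, and Rec is given a member and an output operator that the text leaves unspecified.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

template<class T> class List {
    struct Link {
        Link* suc;
        T val;
        Link(Link* s, const T& v) : suc(s), val(v) { }
    };
    Link* head;
    List(const List&);                // copying not supported in this sketch
    List& operator=(const List&);
public:
    List() : head(0) { }
    ~List() { while (head) { Link* p = head; head = head->suc; delete p; } }
    void push_front(const T& t) { head = new Link(head, t); }    // hypothetical helper
    void print_all(std::ostream& os) const    // uses << on T; checked only where print_all() is used
    {
        for (Link* p = head; p; p = p->suc) os << p->val << '\n';
    }
};

struct Rec { int id; };

// Without this operator, the call lr.print_all(out) below would fail to compile.
std::ostream& operator<<(std::ostream& os, const Rec& r) { return os << "Rec#" << r.id; }

void demo()
{
    List<Rec> lr;
    Rec a = { 1 };
    Rec b = { 2 };
    lr.push_front(a);
    lr.push_front(b);

    std::ostringstream out;
    lr.print_all(out);
    assert(out.str() == "Rec#2\nRec#1\n");
}
```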
13.3 Function Templates [temp.fct]
For most people, the first and most obvious use of templates is to define and use container classes
such as basic_string (§20.3), vector (§16.3), list (§17.2.2), and map (§17.4.1). Soon after, the
need for template functions arises. Sorting an array is a simple example:
    template<class T> void sort(vector<T>&);    // declaration

    void f(vector<int>& vi, vector<string>& vs)
    {
        sort(vi);    // sort(vector<int>&);
        sort(vs);    // sort(vector<string>&);
    }
When a template function is called, the types of the function arguments determine which version of
the template is used; that is, the template arguments are deduced from the function arguments
(§13.3.1).
Naturally, the template function must be defined somewhere (§C.13.7):
    template<class T> void sort(vector<T>& v)    // definition
        // Shell sort (Knuth, Vol. 3, pg. 84).
    {
        const size_t n = v.size();

        for (int gap=n/2; 0<gap; gap/=2)
            for (int i=gap; i<n; i++)
                for (int j=i-gap; 0<=j; j-=gap)
                    if (v[j+gap]<v[j]) {    // swap v[j] and v[j+gap]
                        T temp = v[j];
                        v[j] = v[j+gap];
                        v[j+gap] = temp;
                    }
    }
Please compare this definition to the sort() defined in (§7.7). This templatized version is cleaner
and shorter because it can rely on more information about the type of the elements it sorts. Most
likely, it is also faster because it doesn’t rely on a pointer to function for the comparison. This
implies that no indirect function calls are needed and that inlining of a simple < is easy.
A further simplification is to use the standard library template swap() (§18.6.8) to reduce the
action to its natural form:
    if (v[j+gap]<v[j]) swap(v[j],v[j+gap]);
This does not introduce any new overheads.
In this example, operator < is used for comparison. However, not every type has a < operator.
This limits the use of this version of sort(), but the limitation is easily avoided (see §13.4).
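The fragments above can be assembled into a compilable sketch. It assumes the standard std::vector and std::swap; the names shell_sort and shell_sort_demo are choices made for this illustration (to stay clear of std::sort) and are not part of the text.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Shell sort over a std::vector, comparing with < and exchanging with std::swap.
// The name shell_sort is an assumption of this sketch.
template<class T> void shell_sort(std::vector<T>& v)
{
    const int n = static_cast<int>(v.size());

    for (int gap = n/2; 0 < gap; gap /= 2)
        for (int i = gap; i < n; i++)
            for (int j = i-gap; 0 <= j; j -= gap)
                if (v[j+gap] < v[j]) std::swap(v[j], v[j+gap]);
}

// Usage: the template argument is deduced from the vector's element type.
inline void shell_sort_demo()
{
    std::vector<std::string> vs;
    vs.push_back("pear"); vs.push_back("apple"); vs.push_back("fig");
    shell_sort(vs);   // T deduced as std::string
    assert(vs[0] == "apple" && vs[1] == "fig" && vs[2] == "pear");
}
```

Any element type that provides < and can be copied will do; a type without < is rejected at the point where shell_sort() is instantiated for it.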
13.3.1 Function Template Arguments [temp.deduce]
Function templates are essential for writing generic algorithms to be applied to a wide variety of
container types (§2.7.2, §3.8, Chapter 18). The ability to deduce the template arguments for a call
from the function arguments is crucial.
A compiler can deduce type and non-type arguments from a call, provided the function argument list uniquely identifies the set of template arguments (§C.13.4). For example:
template<class T, int i> T lookup(Buffer<T,i>& b, const char* p);

class Record {
    const char v[12];
    // ...
};

Record f(Buffer<Record,128>& buf, const char* p)
{
    return lookup(buf,p);   // use the lookup() where T is Record and i is 128
}
Here, T is deduced to be Record and i is deduced to be 128.
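A small self-contained sketch of the same kind of deduction follows. The Buffer defined here is a hypothetical stand-in (the text does not define Buffer in this section), and sum(), capacity(), and deduction_demo() are names invented for the illustration.

```cpp
#include <cassert>

// Hypothetical fixed-size buffer holding i elements of type T.
template<class T, int i> struct Buffer {
    T v[i];
};

// Both the type argument T and the non-type argument i are deduced
// from the type of the function argument, Buffer<T,i>.
template<class T, int i> T sum(const Buffer<T,i>& b)
{
    T total = T();
    for (int k = 0; k < i; ++k) total += b.v[k];
    return total;
}

// Deduction makes the non-type argument available inside the function.
template<class T, int i> int capacity(const Buffer<T,i>&) { return i; }

inline void deduction_demo()
{
    Buffer<int,4> b = { {1, 2, 3, 4} };
    assert(sum(b) == 10);        // T deduced as int, i deduced as 4
    assert(capacity(b) == 4);
}
```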
Note that class template parameters are never deduced. The reason is that the flexibility provided by several constructors for a class would make such deduction impossible in many cases and
obscure in many more. Specialization provides a mechanism for implicitly choosing between different implementations of a class (§13.5). If we need to create an object of a deduced type, we can
often do that by calling a function to do the creation; see make_pair() in §17.4.1.2.
If a template argument cannot be deduced from the template function arguments (§C.13.4), we
must specify it explicitly. This is done in the same way template arguments are explicitly specified
for a template class. For example:
template<class T> class vector { /* ... */ };
template<class T> T* create();   // make a T and return a pointer to it

void f()
{
    vector<int> v;   // class, template argument ‘int’
    int* p = create<int>();   // function, template argument ‘int’
}
One common use of explicit specification is to provide a return type for a template function:
template<class T, class U> T implicit_cast(U u) { return u; }

void g(int i)
{
    implicit_cast(i);   // error: can’t deduce T
    implicit_cast<double>(i);   // T is double; U is int
    implicit_cast<char,double>(i);   // T is char; U is double
    implicit_cast<char*,int>(i);   // T is char*; U is int; error: cannot convert int to char*
}
As with default function arguments (§7.5), only trailing arguments can be left out of a list of
explicit template arguments.
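The valid calls from the example above can be checked with a short sketch. The function implicit_cast_demo() and the particular values are assumptions made for the illustration.

```cpp
#include <cassert>

// T appears only in the return type, so it cannot be deduced and must be
// given explicitly; U, the trailing argument, may be left to deduction.
template<class T, class U> T implicit_cast(U u) { return u; }

inline void implicit_cast_demo()
{
    int i = 7;

    double d = implicit_cast<double>(i);        // T is double; U deduced as int
    assert(d == 7.0);

    double e = implicit_cast<double,long>(i);   // T is double; U is long; i converted to long
    assert(e == 7.0);
}
```

Leaving out T while specifying U is not possible, because only trailing arguments can be omitted; this is why the return type is placed first in the template parameter list.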
Explicit specification of template arguments allows the definition of families of conversion
functions and object creation functions (§13.3.2, §C.13.1, §C.13.5). An explicit version of the
implicit conversions (§C.6), such as implicit_cast(), is frequently useful. The syntax for
dynamic_cast, static_cast, etc., (§6.2.7, §15.4.1) matches the explicitly qualified template function
syntax. However, the built-in type conversion operators supply operations that cannot be expressed
by other language features.
13.3.2 Function Template Overloading [temp.over]
One can declare several function templates with the same name and even declare a combination of
function templates and ordinary functions with the same name. When an overloaded function is
called, overload resolution is necessary to find the right function or template function to invoke.
For example:
template<class T> T sqrt(T);
template<class T> complex<T> sqrt(complex<T>);
double sqrt(double);

void f(complex<double> z)
{
    sqrt(2);   // sqrt<int>(int)
    sqrt(2.0);   // sqrt(double)
    sqrt(z);   // sqrt<double>(complex<double>)
}
In the same way that a template function is a generalization of the notion of a function, the rules for
resolution in the presence of function templates are generalizations of the function overload resolution rules. Basically, for each template we find the specialization that is best for the set of function
arguments. Then, we apply the usual function overload resolution rules to these specializations and
all ordinary functions:
[1] Find the set of function template specializations (§13.2.2) that will take part in overload resolution. Do this by considering each function template and deciding which template arguments, if any, would be used if no other function templates or functions of the same name
were in scope. For the call sqrt(z), this makes sqrt<double>(complex<double>) and
sqrt< complex<double> >(complex<double>) candidates.
[2] If two template functions can be called and one is more specialized than the other (§13.5.1),
consider only the most specialized template function in the following steps. For the call
sqrt(z), this means that sqrt<double>(complex<double>) is preferred over
sqrt< complex<double> >(complex<double>): any call that matches sqrt<T>(complex<T>)
also matches sqrt<T>(T).
[3] Do overload resolution for this set of functions, plus any ordinary functions as for ordinary
functions (§7.4). If a template function argument has been determined by template argument deduction (§13.3.1), that argument cannot also have promotions, standard conversions,
or user-defined conversions applied. For sqrt(2), sqrt<int>(int) is an exact match, so it
is preferred over sqrt(double).
[4] If a function and a specialization are equally good matches, the function is preferred. Consequently, sqrt(double) is preferred over sqrt<double>(double) for sqrt(2.0).
[5] If no match is found, the call is an error. If we end up with two or more equally good
matches, the call is ambiguous and is an error.
For example:
template<class T> T max(T,T);

const int s = 7;

void k()
{
    max(1,2);   // max<int>(1,2)
    max('a','b');   // max<char>('a','b')
    max(2.7,4.9);   // max<double>(2.7,4.9)
    max(s,7);   // max<int>(int(s),7) (trivial conversion used)

    max('a',1);   // error: ambiguous (no standard conversion)
    max(2.7,4);   // error: ambiguous (no standard conversion)
}
We could resolve the two ambiguities either by explicit qualification:
void f()
{
    max<int>('a',1);   // max<int>(int('a'),1)
    max<double>(2.7,4);   // max<double>(2.7,double(4))
}
or by adding suitable declarations:
inline int max(int i, int j) { return max<int>(i,j); }
inline double max(int i, double d) { return max<double>(i,d); }
inline double max(double d, int i) { return max<double>(d,i); }
inline double max(double d1, double d2) { return max<double>(d1,d2); }

void g()
{
    max('a',1);   // max(int('a'),1)
    max(2.7,4);   // max(2.7,double(4))
}
For ordinary functions, ordinary overloading rules (§7.4) apply, and the use of inline ensures that
no extra overhead is imposed.
The definition of max() is trivial, so we could have written it explicitly. However, using a specialization of the template is an easy and general way of defining such resolution functions.
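The effect of resolution rules [1] to [5] can be observed with a sketch in which each overload returns a label naming itself. The names pick, larger, and overload_demo are hypothetical; pick mirrors the three sqrt() declarations and larger stands in for max() so that neither clashes with the standard library.

```cpp
#include <cassert>
#include <complex>
#include <string>

// Three overloads shaped like the sqrt() declarations; each reports which one ran.
template<class T> std::string pick(T)               { return "pick<T>(T)"; }
template<class T> std::string pick(std::complex<T>) { return "pick<T>(complex<T>)"; }
inline std::string pick(double)                     { return "pick(double)"; }

// A max-like template with a single deduced type for both arguments.
template<class T> T larger(T a, T b) { return a < b ? b : a; }

inline void overload_demo()
{
    assert(pick(2) == "pick<T>(T)");        // exact match beats the int-to-double conversion
    assert(pick(2.0) == "pick(double)");    // equally good: the ordinary function is preferred
    assert(pick(std::complex<double>(1, 2)) == "pick<T>(complex<T>)");   // more specialized

    // larger(2.7, 4) would not compile: T cannot be both double and int.
    assert(larger<double>(2.7, 4) == 4.0);  // explicit T; 4 is converted to double
}
```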
The overload resolution rules ensure that template functions interact properly with inheritance:
template<class T> class B { /* ... */ };
template<class T> class D : public B<T> { /* ... */ };

template<class T> void f(B<T>*);

void g(B<int>* pb, D<int>* pd)
{
    f(pb);   // f<int>(pb)
    f(pd);   // f<int>(static_cast<B<int>*>(pd)); standard conversion D<int>* to B<int>* used
}
In this example, the template function f() accepts a B<T>* for any type T. We have an argument
of type D<int>*, so the compiler easily deduces that by choosing T to be int, the call can be
uniquely resolved to a call of f(B<int>*).
A function argument that is not involved in the deduction of a template parameter is treated
exactly as an argument of a non-template function. In particular, the usual conversion rules hold.
Consider:
template<class C> int get_nth(C& p, int n);   // get n-th element
This function presumably returns the value of the n-th element of a container of type C. Because C
has to be deduced from an actual argument of get_nth() in a call, conversions are not applicable to
the first argument. However, the second argument is perfectly ordinary, so the full range of possible conversions is considered. For example:
class Index {
public:
    operator int();
    // ...
};

void f(vector<int>& v, short s, Index i)
{
    int i1 = get_nth(v,2);   // exact match
    int i2 = get_nth(v,s);   // standard conversion: short to int
    int i3 = get_nth(v,i);   // user-defined conversion: Index to int
}
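A runnable version of this example might look as follows. The body of get_nth(), the constructor and data member of Index, and the function get_nth_demo() are assumptions added to make the sketch complete.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C is deduced from the first argument, so no conversions are applied to it.
// The second parameter is an ordinary int and accepts the usual conversions.
template<class C> int get_nth(C& c, int n)
{
    return c[static_cast<std::size_t>(n)];
}

// Hypothetical index type with a user-defined conversion to int.
class Index {
    int value;
public:
    explicit Index(int v) : value(v) {}
    operator int() const { return value; }
};

inline void get_nth_demo()
{
    std::vector<int> v;
    v.push_back(10); v.push_back(20); v.push_back(30);
    short s = 1;
    Index i(2);

    assert(get_nth(v, 0) == 10);   // exact match
    assert(get_nth(v, s) == 20);   // standard conversion: short to int
    assert(get_nth(v, i) == 30);   // user-defined conversion: Index to int
}
```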
13.4 Using Template Arguments to Specify Policy [temp.policy]
Consider how to sort strings. Three concepts are involved: the string, the element type, and the criteria used by the sort algorithm for comparing string elements.
We can’t hardwire the sorting criteria into the container because the container can’t (in general)
impose its needs on the element types. We can’t hardwire the sorting criteria into the element type
because there are many different ways of sorting elements.
Consequently, the sorting criteria are built neither into the container nor into the element type.
Instead, the criteria must be supplied when a specific operation needs to be performed. For example, if I have strings of characters representing names of Swedes, what collating criteria would I
like to use for a comparison? Two different collating sequences (numerical orderings of the characters) are commonly used for sorting Swedish names. Naturally, neither a general string type nor a
general sort algorithm should know about the conventions for sorting names in Sweden. Therefore,
any general solution requires that the sorting algorithm be expressed in general terms that can be
defined not just for a specific type but also for a specific use of a specific type. For example, let us
generalize the standard C library function strcmp() for Strings of any type T (§13.2):
template<class T, class C>
int compare(const String<T>& str1, const String<T>& str2)
{
    for(int i=0; i<str1.length() && i< str2.length(); i++)
        if (!C::eq(str1[i],str2[i])) return C::lt(str1[i],str2[i]) ? -1 : 1;
    return str1.length()-str2.length();
}
If someone wants compare() to ignore case, to reflect locale, etc., that can be done by defining
suitable C::eq() and C::lt(). This allows any (comparison, sorting, etc.) algorithm that can be
described in terms of the operations supplied by the ‘‘C-operations’’ and the container to be
expressed. For example:
template<class T> class Cmp {   // normal, default compare
public:
    static int eq(T a, T b) { return a==b; }
    static int lt(T a, T b) { return a<b; }
};

class Literate {   // compare Swedish names according to literary conventions
public:
    static int eq(char a, char b) { return a==b; }
    static int lt(char,char);   // a table lookup based on character value (§13.9[14])
};
We can now choose the rules for comparison by explicit specification of the template arguments:
void f(String<char> swede1, String<char> swede2)
{
    compare< char,Cmp<char> >(swede1,swede2);
    compare< char,Literate >(swede1,swede2);
}
Passing the comparison operations as a template parameter has two significant benefits compared to
alternatives such as passing pointers to functions. Several operations can be passed as a single
argument with no run-time cost. In addition, the comparison operators eq() and lt() are trivial to
inline, whereas inlining a call through a pointer to function requires exceptional attention from a
compiler.
Naturally, comparison operations can be provided for user-defined types as well as built-in
types. This is essential to allow general algorithms to be applied to types with nontrivial comparison criteria (see §18.4).
Each class generated from a class template gets a copy of each static member of the class template (see §C.13.1).
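To try the policy technique without the String class from §13.2, the following sketch substitutes std::basic_string for String<T>. The No_case policy shown is one possible definition, assumed here for illustration (the text leaves its own comparison classes largely unspecified), and policy_demo() is likewise invented.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Default policy: compare elements with == and <.
template<class T> struct Cmp {
    static bool eq(T a, T b) { return a == b; }
    static bool lt(T a, T b) { return a < b; }
};

// Hypothetical policy that ignores case when comparing chars.
struct No_case {
    static char fold(char c)
    {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
};

// Same shape as compare() in the text; returns a negative value, zero,
// or a positive value, in the manner of strcmp().
template<class T, class C>
int compare(const std::basic_string<T>& s1, const std::basic_string<T>& s2)
{
    for (std::size_t i = 0; i < s1.length() && i < s2.length(); i++)
        if (!C::eq(s1[i], s2[i])) return C::lt(s1[i], s2[i]) ? -1 : 1;
    if (s1.length() == s2.length()) return 0;
    return s1.length() < s2.length() ? -1 : 1;
}

inline void policy_demo()
{
    const std::string a = "Stroustrup", b = "STROUSTRUP";
    assert((compare<char, Cmp<char> >(a, b) != 0));   // case matters: the strings differ
    assert((compare<char, No_case>(a, b) == 0));      // case ignored: the strings are equal
}
```

The policy is chosen at compile time, so each call to C::eq() and C::lt() is an ordinary call of a static member function that a compiler can inline.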
13.4.1 Default Template Parameters [temp.default]
Explicitly specifying the comparison criteria for each call is tedious. Fortunately, it is easy to pick
a default so that only uncommon comparison criteria have to be explicitly specified. This can be
implemented through overloading:
template<class T, class C>
int compare(const String<T>& str1, const String<T>& str2);   // compare using C

template<class T>
int compare(const String<T>& str1, const String<T>& str2);   // compare using Cmp<T>
Alternatively, we can supply the normal convention as a default template argument:
template<class T, class C = Cmp<T> >
int compare(const String<T>& str1, const String<T>& str2)
{
    for(int i=0; i<str1.length() && i< str2.length(); i++)
        if (!C::eq(str1[i],str2[i])) return C::lt(str1[i],str2[i]) ? -1 : 1;
    return str1.length()-str2.length();
}
Given that, we can write:
void f(String<char> swede1, String<char> swede2)
{
    compare(swede1,swede2);   // use Cmp<char>
    compare<char,Literate>(swede1,swede2);   // use Literate
}
A less esoteric example (for non-Swedes) is comparing with and without taking case into account:
class No_case { /* ... */ };

void f(String<char> s1, String<char> s2)
{
    compare(s1,s2);   // case sensitive
    compare<char,No_case>(s1,s2);   // not sensitive to case
}
The technique of supplying a policy through a template argument and then defaulting that argument
to supply the most common policy is widely used in the standard library (e.g., §18.4). Curiously
enough, it is not used for basic_string (§13.2, Chapter 20) comparisons. Template parameters
used to express policies are often called ‘‘traits.’’ For example, the standard library string relies on
char_traits (§20.2.1), the standard algorithms on iterator traits (§19.2.2), and the standard library
containers on allocators (§19.4).
The semantic checking of a default argument for a template parameter is done if and (only)
when that default argument is actually used. In particular, as long as we refrain from using the
default template argument Cmp<T> we can compare() strings of a type X for which Cmp<X>
wouldn’t compile (say, because < wasn’t defined for an X). This point is crucial in the design of
the standard containers, which rely on a template argument to specify default values (§16.3.4).
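The point about checking a default only when it is used can be demonstrated with an element type that has no < operator. In this sketch std::vector stands in for String<T>, and Tag, By_id, and default_policy_demo() are hypothetical names. Note also the assumption that the compiler supports default template arguments on function templates, which requires C++11 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Default policy; using lt() requires T to support <.
template<class T> struct Cmp {
    static bool eq(const T& a, const T& b) { return a == b; }
    static bool lt(const T& a, const T& b) { return a < b; }
};

// Hypothetical element type with no < operator: Cmp<Tag>::lt() would not compile.
struct Tag {
    int id;
};

// Hypothetical policy that orders Tags by their id.
struct By_id {
    static bool eq(const Tag& a, const Tag& b) { return a.id == b.id; }
    static bool lt(const Tag& a, const Tag& b) { return a.id < b.id; }
};

// The default argument Cmp<T> is checked only for calls that rely on it.
template<class T, class C = Cmp<T> >
int compare(const std::vector<T>& s1, const std::vector<T>& s2)
{
    for (std::size_t i = 0; i < s1.size() && i < s2.size(); i++)
        if (!C::eq(s1[i], s2[i])) return C::lt(s1[i], s2[i]) ? -1 : 1;
    if (s1.size() == s2.size()) return 0;
    return s1.size() < s2.size() ? -1 : 1;
}

inline void default_policy_demo()
{
    std::vector<int> a(2, 1), b(2, 2);            // {1,1} and {2,2}
    assert(compare(a, b) < 0);                    // C defaults to Cmp<int>

    std::vector<Tag> t1(1), t2(1);
    t1[0].id = 5; t2[0].id = 3;
    assert((compare<Tag, By_id>(t1, t2) > 0));    // the default Cmp<Tag> is never used
}
```

Writing compare(t1,t2) instead would select Cmp<Tag> and fail to compile, which is the expected behavior: the error appears only when the unsuitable default is actually requested.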
13.5 Specialization [temp.special]
By default, a template gives a single definition to be used for every template argument (or combination of template arguments) that a user can think of. This doesn’t always make sense for someone
writing a template. I might want to say, ‘‘if the template argument is a pointer, use this implementation; if it is not, use that implementation’’ or ‘‘give an error unless the template argument is a
pointer derived from class My_base.’’ Many such design concerns can be addressed by providing
alternative definitions of the template and having the compiler choose between them based on the
template arguments provided where they are used. Such alternative definitions of a template are
called user-defined specializations, or simply, user specializations.
Consider likely uses of a Vector template:
template<class T> class Vector {   // general vector type
    T* v;
    int sz;
public:
    Vector();
    Vector(int);

    T& elem(int i) { return v[i]; }
    T& operator[](int i);

    void swap(Vector&);
    // ...
};

Vector<int> vi;
Vector<Shape*> vps;
Vector<string> vs;
Vector<char*> vpc;
Vector<Node*> vpn;
Most Vectors will be Vectors of some pointer type. There are several reasons for this, but the primary reason is that to preserve run-time polymorphic behavior, we must use pointers (§2.5.4,
§12.2.6). That is, anyone who practices object-oriented programming and also uses type-safe containers (such as the standard library containers) will end up with a lot of containers of pointers.
The default behavior of most C++ implementations is to replicate the code for template functions. This is good for run-time performance, but unless care is taken it leads to code bloat in critical cases such as the Vector example.
Fortunately, there is an obvious solution. Containers of pointers can share a single implementation. This can be expressed through specialization. First, we define a version (a specialization) of
Vector for pointers to void:
template<> class Vector<void*> {
    void** p;
    // ...
    void*& operator[](int i);
};
This specialization can then be used as the common implementation for all Vectors of pointers.
The template<> prefix says that this is a specialization that can be specified without a template
parameter. The template arguments for which the specialization is to be used are specified in <>
brackets after the name. That is, the <void*> says that this definition is to be used as the implementation of every Vector for which T is void*.
The Vector<void*> is a complete specialization. That is, there is no template parameter to
specify or deduce when we use the specialization; Vector<void*> is used for Vectors declared like
this:
Vector<void*> vpv;
To define a specialization that is used for every Vector of pointers and only for Vectors of pointers,
we need a partial specialization:
template<class T> class Vector<T*> : private Vector<void*> {
public:
    typedef Vector<void*> Base;

    Vector() : Base() {}
    explicit Vector(int i) : Base(i) {}

    // a void*& cannot be converted to a T*& by static_cast, so reinterpret_cast is used
    T*& elem(int i) { return reinterpret_cast<T*&>(Base::elem(i)); }
    T*& operator[](int i) { return reinterpret_cast<T*&>(Base::operator[](i)); }
    // ...
};
The specialization pattern <T*> after the name says that this specialization is to be used for every
pointer type; that is, this definition is to be used for every Vector with a template argument that can
be expressed as T*. For example:
Vector<Shape*> vps;   // <T*> is <Shape*> so T is Shape
Vector<int**> vppi;   // <T*> is <int**> so T is int*
Note that when a partial specialization is used, a template parameter is deduced from the specialization pattern; the template parameter is not simply the actual template argument. In particular, for
Vector<Shape*>, T is Shape and not Shape*.
Given this partial specialization of Vector, we have a shared implementation for all Vectors of
pointers. The Vector<T*> class is simply an interface to void* implemented exclusively through
derivation and inline expansion.
It is important that this refinement of the implementation of Vector is achieved without affecting the interface presented to users. Specialization is a way of specifying alternative implementations for different uses of a common interface. Naturally, we could have given the general Vector
and the Vector of pointers different names. However, when I tried that, many people who should
have known better forgot to use the pointer classes and found their code much larger than expected.
In this case, it is much better to hide the crucial implementation details behind a common interface.
This technique proved successful in curbing code bloat in real use. People who do not use a
technique like this (in C++ or in other languages with similar facilities for type parameterization)
have found that replicated code can cost megabytes of code space even in moderately-sized programs. By eliminating the time needed to compile those additional versions of the vector operations, this technique can also cut compile and link times dramatically. Using a single specialization
to implement all lists of pointers is an example of the general technique of minimizing code bloat
by maximizing the amount of shared code.
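The sharing technique can be sketched in a few lines. This is an illustration under stated assumptions rather than the Vector of the text: the shared Vector<void*> stores its elements in a std::vector to keep the code short, the member names push_back(), at(), and size() are chosen for the sketch, and elements are returned by value so that only a static_cast from void* to T* is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// General template: declared only, since the sketch uses just the pointer cases.
template<class T> class Vector;

// Complete specialization: the single shared implementation, holding untyped pointers.
template<> class Vector<void*> {
    std::vector<void*> elems;
public:
    void push_back(void* p) { elems.push_back(p); }
    void* at(std::size_t i) const { return elems[i]; }
    std::size_t size() const { return elems.size(); }
};

// Partial specialization: a thin, type-safe, inline interface to Vector<void*>.
// As written it serves pointers to non-const types only.
template<class T> class Vector<T*> : private Vector<void*> {
    typedef Vector<void*> Base;
public:
    void push_back(T* p) { Base::push_back(p); }
    T* at(std::size_t i) const { return static_cast<T*>(Base::at(i)); }
    std::size_t size() const { return Base::size(); }
};

inline void shared_impl_demo()
{
    int x = 1, y = 2;
    Vector<int*> vi;
    vi.push_back(&x);
    vi.push_back(&y);
    assert(vi.size() == 2);
    assert(*vi.at(1) == 2);

    double d = 2.5;
    Vector<double*> vd;   // reuses the same Vector<void*> code
    vd.push_back(&d);
    assert(*vd.at(0) == 2.5);
}
```

Every member of Vector<T*> is a one-line inline forwarder, so each additional pointer type adds essentially no code beyond what Vector<void*> already provides.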
The general template must be declared before any specialization. For example:
template<class T> class List<T*> { /* ... */ };
template<class T> class List { /* ... */ };   // error: general template after specialization
The critical information supplied by the general template is the set of template parameters that the
user must supply to use it or any of its specializations. Consequently, a declaration of the general
case is sufficient to allow the declaration or definition of a specialization:
template<class T> class List;
template<class T> class List<T*> { /* ... */ };
If used, the general template needs to be defined somewhere (§13.7).
If a user specializes a template somewhere, that specialization must be in scope for every use of
the template with the type for which it was specialized. For example:
template<class T> class List { /* ... */ };

List<int*> li;

template<class T> class List<T*> { /* ... */ };   // error
Here, List was specialized for int* after List<int*> had been used.
All specializations of a template must be declared in the same namespace as the template itself.
If used, a specialization that is explicitly declared (as opposed to generated from a more general
template) must also be explicitly defined somewhere (§13.7). In other words, explicitly specializing a template implies that no definition is generated for that specialization.
13.5.1 Order of Specializations [temp.special.order]
One specialization is more specialized than another if every argument list that matches its specialization pattern also matches the other, but not vice versa. For example:
template<class T> class Vector;   // general
template<class T> class Vector<T*>;   // specialized for any pointer
template<> class Vector<void*>;   // specialized for void*
Every type can be used as a template argument for the most general Vector, but only pointers can
be used for Vector<T*> and only void*s can be used for Vector<void*>.
The most specialized version will be preferred over the others in declarations of objects, pointers, etc., (§13.5) and in overload resolution (§13.3.2).
A specialization pattern can be specified in terms of types composed using the constructs
allowed for template parameter deduction (§13.3.1).
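Which version is chosen for a given argument can be observed directly. The class template Which and the function ordering_demo() below are hypothetical probes written for this illustration; each version of Which reports its own description.

```cpp
#include <cassert>
#include <string>

// Each version returns a label naming the definition that was selected.
template<class T> struct Which {       // general
    static std::string name() { return "general"; }
};
template<class T> struct Which<T*> {   // specialized for any pointer
    static std::string name() { return "any pointer"; }
};
template<> struct Which<void*> {       // specialized for void*
    static std::string name() { return "void*"; }
};

inline void ordering_demo()
{
    assert(Which<int>::name() == "general");
    assert(Which<int*>::name() == "any pointer");
    assert(Which<int**>::name() == "any pointer");   // here T is int*
    assert(Which<void*>::name() == "void*");         // the most specialized version wins
}
```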
13.5.2 Template Function Specialization [temp.special.fct]
Naturally, specialization is also useful for template functions. Consider the Shell sort from §7.7
and §13.3. It compares elements using < and swaps elements using detailed code. A better definition would be:
template<class T> bool less(T a, T b) { return a<b; }

template<class T> void sort(Vector<T>& v)
{
    const size_t n = v.size();

    for (int gap=n/2; 0<gap; gap/=2)
        for (int i=gap; i<n; i++)
            for (int j=i-gap; 0<=j; j-=gap)
                if (less(v[j+gap],v[j])) swap(v[j],v[j+gap]);
}
This does not improve the algorithm itself, but it allows improvements to its implementation. As
written, sort() will not sort a Vector<char*> correctly because < will compare the two char*s.
That is, it will compare the addresses of the first char in each string. Instead, we would like it to
compare the characters pointed to. A simple specialization of less() for const char* will take care
of that:
template<> bool less<const char*>(const char* a, const char* b)
{
    return strcmp(a,b)<0;
}
As for classes (§13.5), the template<> prefix says that this is a specialization that can be specified
without a template parameter. The <const char*> after the template function name means that this
specialization is to be used in cases where the template argument is const char*. Because the template argument can be deduced from the function argument list, we need not specify it explicitly.
So, we could simplify the definition of the specialization:
template<> bool less<>(const char* a, const char* b)
{
    return strcmp(a,b)<0;
}
Given the template<> prefix, the second empty <> is redundant, so we would typically simply
write:
template<> bool less(const char* a, const char* b)
{
    return strcmp(a,b)<0;
}
I prefer this shorter form of declaration.
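A compilable sketch of the general less() together with its specialization is given below. The namespace sketch and the function less_demo() are assumptions of the illustration; the namespace only keeps this less() apart from the standard library's std::less.

```cpp
#include <cassert>
#include <cstring>

namespace sketch {   // hypothetical namespace for the illustration

// General version: compare with <.
template<class T> bool less(T a, T b) { return a < b; }

// Specialization for C-style strings: compare the characters, not the addresses.
template<> bool less(const char* a, const char* b)
{
    return std::strcmp(a, b) < 0;
}

}   // namespace sketch

inline void less_demo()
{
    const char* a = "apple";
    const char* b = "banana";
    assert(sketch::less(a, b));    // uses the const char* specialization
    assert(!sketch::less(b, a));
    assert(sketch::less(1, 2));    // uses the general template with T deduced as int
}
```

The specialization is selected only when T is deduced as exactly const char*; for arguments of type char*, T is char* and the general version is used, so such arguments need to be converted to const char* first.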
Consider the obvious definition of swap():
template<class T> void swap(T& x, T& y)
{
    T t = x;   // copy x to temporary
    x = y;   // copy y to x
    y = t;   // copy temporary to y
}
This is rather inefficient when invoked for Vectors of Vectors; it swaps Vectors by copying all elements. This problem can also be solved by appropriate specialization. A Vector object will itself
hold only sufficient data to give indirect access to the elements (like string; §11.12, §13.2). Thus,
a swap can be done by swapping those representations. To be able to manipulate that representation, I provided Vector with a member function swap() (§13.5):
template<class T> void Vector<T>::swap(Vector & a)   // swap representations
{
    swap(v,a.v);
    swap(sz,a.sz);
}
This member swap() can now be used to define a specialization of the general swap():
template<class T> void swap(Vector<T>& a, Vector<T>& b)
{
    a.swap(b);
}
These specializations of less() and swap() are used in the standard library (§16.3.9, §20.3.16).
In addition, they are examples of widely applicable techniques. Specialization is useful when there
is a more efficient alternative to a general algorithm for a set of template arguments (here,
swap()). In addition, specialization comes in handy when an irregularity of an argument type
causes the general algorithm to give an undesired result (here, less()). These ‘‘irregular types’’
are often the built-in pointer and array types.
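The swap technique can be tried with a minimal Vector. Everything beyond the members v and sz and the two swap() functions is an assumption of this sketch: the constructor, the destructor, data(), and swap_demo() are invented, copying is disabled for brevity, and the member swap() calls std::swap by its qualified name so that the member's own name does not hide it.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Minimal hypothetical Vector: a pointer to the elements plus a size.
template<class T> class Vector {
    T* v;
    std::size_t sz;
    Vector(const Vector&);              // copying is left out of this sketch
    Vector& operator=(const Vector&);
public:
    explicit Vector(std::size_t n) : v(new T[n]()), sz(n) {}
    ~Vector() { delete[] v; }

    T& operator[](std::size_t i) { return v[i]; }
    std::size_t size() const { return sz; }
    const T* data() const { return v; }

    void swap(Vector& a)                // swap representations, not elements
    {
        std::swap(v, a.v);
        std::swap(sz, a.sz);
    }
};

// Version of swap() selected for Vectors in preference to a general swap(T&,T&).
template<class T> void swap(Vector<T>& a, Vector<T>& b) { a.swap(b); }

inline void swap_demo()
{
    Vector<int> a(3), b(1);
    a[0] = 7; b[0] = 9;
    const int* pa = a.data();

    swap(a, b);                         // constant time: no element is copied
    assert(a.size() == 1 && a[0] == 9);
    assert(b.size() == 3 && b[0] == 7);
    assert(b.data() == pa);             // b now holds a's original storage
}
```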
13.6 Derivation and Templates [temp.derive]
Templates and derivation are mechanisms for building new types out of existing ones, and generally for writing useful code that exploits various forms of commonality. As shown in §3.7.1,
§3.8.5, and §13.5, combinations of the two mechanisms are the basis for many useful techniques.
Deriving a template class from a non-template class is a way of providing a common implementation for a set of templates. The list from §13.5 is a good example of this:
template<class T> class list<T*> : private list<void*> { /* ... */ };
Another way of looking at such examples is that a template is used to provide an elegant and typesafe interface to an otherwise unsafe and inconvenient-to-use facility.
Naturally, it is often useful to derive one template class from another. One use of a base class is
as a building block in the implementation of further classes. If the data or operations in such a base
class depend on a template parameter of a derived class, the base itself must be parameterized; Vec
from §3.7.1 is an example of this:
template<class T> class vector { /* ... */ };
template<class T> class Vec : public vector<T> { /* ... */ };
The overload resolution rules for template functions ensure that functions work ‘‘correctly’’ for
such derived types (§13.3.2).
Having the same template parameter for the base and derived class is the most common case,
but it is not a requirement. Interesting, although less frequently used, techniques rely on passing
the derived type itself to the base class. For example:
    template <class C> class Basic_ops { // basic operators on containers
    public:
        bool operator==(const C&) const; // compare all elements
        bool operator!=(const C&) const;
        // ...
    };
    template<class T> class Math_container : public Basic_ops< Math_container<T> > {
    public:
        size_t size() const;
        T& operator[](size_t);
        const T& operator[](size_t) const;
        // ...
    };
This allows the definition of the basic operations on containers to be separate from the definition of
the containers themselves and defined once only. However, the definition of operations such as ==
and != must be expressed in terms of both the container and its elements, so the container type needs to
be passed to the base class as a template argument.
Assuming that a Math_container is similar to a traditional vector, the definition of a
Basic_ops member would look something like this:
    template <class C> bool Basic_ops<C>::operator==(const C& a) const
    {
        const C& self = static_cast<const C&>(*this); // C is derived from Basic_ops<C>, so this gives access to C's operations
        if (self.size() != a.size()) return false;
        for (size_t i = 0; i<self.size(); ++i)
            if (self[i] != a[i]) return false;
        return true;
    }
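For experimentation, the technique can be put together into one compilable sketch. The derived() helper, the std::vector storage, and the constructor taking a size are assumptions added here so that the example is complete; they are not part of the declarations above:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Base parameterized on the derived container type C.
template<class C> class Basic_ops {
public:
    bool operator==(const C& a) const;            // compare all elements
    bool operator!=(const C& a) const { return !(*this == a); }
protected:
    // Assumed helper: valid because C is required to derive from Basic_ops<C>.
    const C& derived() const { return static_cast<const C&>(*this); }
};

template<class C> bool Basic_ops<C>::operator==(const C& a) const
{
    const C& self = derived();
    if (self.size() != a.size()) return false;
    for (std::size_t i = 0; i < self.size(); ++i)
        if (self[i] != a[i]) return false;
    return true;
}

// A vector-like container that passes itself to its base.
template<class T> class Math_container : public Basic_ops< Math_container<T> > {
    std::vector<T> elem;
public:
    explicit Math_container(std::size_t n) : elem(n) { }
    std::size_t size() const { return elem.size(); }
    T& operator[](std::size_t i) { assert(i < elem.size()); return elem[i]; }
    const T& operator[](std::size_t i) const { assert(i < elem.size()); return elem[i]; }
};
```

The comparison operators are written once in Basic_ops, yet they call the size() and operator[] of whichever container derives from it, without any virtual function call.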
An alternative technique for keeping the containers and operations separate would be to combine
them from template arguments rather than use derivation:
    template<class T, class C> class Mcontainer {
        C elements;
    public:
        // ...
        T& operator[](size_t i) { return elements[i]; }
        friend bool operator==(const Mcontainer&, const Mcontainer&); // compare elements
        friend bool operator!=(const Mcontainer&, const Mcontainer&);
        // ...
    };
    template<class T> class My_array { /* ... */ };
    Mcontainer< double,My_array<double> > mc;
A class generated from a class template is a perfectly ordinary class. Consequently, it can have
friend functions (§C.13.2). In this case, I used friends to achieve the conventional symmetric argument style for == and != (§11.3.2). One might also consider passing a template rather than a container as the C argument in such cases (§13.2.3).
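A compilable sketch of the composition approach follows. It makes three assumptions that are not in the text: the friends are defined inside the class (which sidesteps the question of how to define a non-template friend of a class template elsewhere), std::vector<double> stands in for My_array<double>, and a size() member and a sized constructor are added:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// T is the element type, C the container that stores the elements.
template<class T, class C> class Mcontainer {
    C elements;
public:
    explicit Mcontainer(std::size_t n) : elements(n) { }
    std::size_t size() const { return elements.size(); }
    T& operator[](std::size_t i) { assert(i < size()); return elements[i]; }
    const T& operator[](std::size_t i) const { assert(i < size()); return elements[i]; }

    // Defined in the class: each generated class gets its own ordinary
    // (non-template) operators with symmetric arguments.
    friend bool operator==(const Mcontainer& a, const Mcontainer& b)
    {
        if (a.size() != b.size()) return false;
        for (std::size_t i = 0; i < a.size(); ++i)
            if (a.elements[i] != b.elements[i]) return false;
        return true;
    }
    friend bool operator!=(const Mcontainer& a, const Mcontainer& b) { return !(a == b); }
};
```

With this definition, Mcontainer< double, std::vector<double> > objects can be compared with == and != in the usual way.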
13.6.1 Parameterization and Inheritance [temp.inherit]
A template parameterizes the definition of a type or a function with another type. Code implementing the template is identical for all parameter types, as is most code using the template. An abstract
class defines an interface. Much code for different implementations of the abstract class can be
shared in class hierarchies, and most code using the abstract class doesn’t depend on its implementation. From a design perspective, the two approaches are so close that they deserve a common
name. Since both allow an algorithm to be expressed once and applied to a variety of types, people
sometimes refer to both as polymorphic. To distinguish them, what virtual functions provide is
called run-time polymorphism, and what templates offer is called compile-time polymorphism or
parametric polymorphism.
So when do we choose to use a template and when do we rely on an abstract class? In either
case, we manipulate objects that share a common set of operations. If no hierarchical relationship
is required between these objects, they are best used as template arguments. If the actual types of
these objects cannot be known at compile-time, they are best represented as classes derived from a
common abstract class. If run-time efficiency is at a premium, that is, if inlining of operations is
essential, a template should be used. This issue is discussed in greater detail in §24.4.1.
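The two forms of polymorphism can be placed side by side in a small sketch. The classes Square and Rect and the functions total_area() and larger() are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <vector>

// Run-time polymorphism: the abstract class Shape fixes the interface and
// the function actually called is selected through the virtual table.
class Shape {
public:
    virtual double area() const = 0;
    virtual ~Shape() { }
};

class Square : public Shape {
    double side;
public:
    explicit Square(double s) : side(s) { }
    double area() const { return side * side; }
};

class Rect : public Shape {
    double w, h;
public:
    Rect(double ww, double hh) : w(ww), h(hh) { }
    double area() const { return w * h; }
};

// Works for any mix of shapes, including ones defined after this was compiled.
inline double total_area(const std::vector<const Shape*>& v)
{
    double sum = 0;
    for (std::vector<const Shape*>::size_type i = 0; i < v.size(); ++i) {
        assert(v[i] != 0);
        sum += v[i]->area();
    }
    return sum;
}

// Compile-time (parametric) polymorphism: T needs no base class, only a
// suitable operator<, and the comparison can be inlined.
template<class T> const T& larger(const T& a, const T& b)
{
    return (a < b) ? b : a;
}
```

total_area() needs the hierarchy because the actual types in the vector are known only at run time; larger() accepts int, double, or any other type with operator<, none of which share a base class.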
13.6.2 Member Templates [temp.member]
A class or a class template can have members that are themselves templates. For example:
    template<class Scalar> class complex {
        Scalar re, im;
    public:
        template<class T>
            complex(const complex<T>& c) : re(c.real()), im(c.imag()) { }
        // ...
    };
    complex<float> cf(0,0);
    complex<double> cd = cf; // ok: uses float to double conversion
    class Quad {
        // no conversion to int
    };
    complex<Quad> cq;
    complex<int> ci = cq; // error: no Quad to int conversion
In other words, you can construct a complex<T1> from a complex<T2> if and only if you can initialize a T1 by a T2. That seems reasonable.
Unfortunately, C++ accepts some unreasonable conversions between built-in types, such as
from double to int. Truncation problems could be caught at run time using a checked conversion in
the style of implicit_cast (§13.3.1) and checked (§C.6.2.6):
    template<class Scalar> class complex {
        Scalar re, im;
    public:
        complex() : re(0), im(0) { }
        complex(const complex<Scalar>& c) : re(c.re), im(c.im) { }
        template<class T2> complex(const complex<T2>& c)
            : re(checked_cast<Scalar>(c.real())), im(checked_cast<Scalar>(c.imag())) { }
        // ...
    };
For completeness, I added a default constructor and a copy constructor. Curiously enough, a template constructor is never used to generate a copy constructor, so without the explicitly declared
copy constructor, a default copy constructor would have been generated. In that case, that generated copy constructor would have been identical to the one I explicitly specified.
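To try this out, a checked conversion has to be supplied. The checked_cast below is a hypothetical stand-in written for this sketch (a simple round-trip test), not a standard library facility; the two-argument constructor, real(), imag(), and the enclosing namespace sketch are likewise assumptions:

```cpp
#include <stdexcept>

namespace sketch {

// Hypothetical checked conversion: throws if the value does not survive a
// round trip through the target type (for example 2.5 -> int -> 2.0).
// Assumes v is within the range of To; an out-of-range floating-point to
// integer conversion is undefined and is not detected here.
template<class To, class From> To checked_cast(From v)
{
    To t = static_cast<To>(v);
    if (static_cast<From>(t) != v) throw std::range_error("checked_cast: value changed");
    return t;
}

template<class Scalar> class complex {
    Scalar re, im;
public:
    complex() : re(0), im(0) { }
    complex(Scalar r, Scalar i) : re(r), im(i) { }
    // Converting constructor: a member template, never a copy constructor.
    template<class T2> complex(const complex<T2>& c)
        : re(checked_cast<Scalar>(c.real())), im(checked_cast<Scalar>(c.imag())) { }
    Scalar real() const { return re; }
    Scalar imag() const { return im; }
};

// Returns true if converting c to complex<int> is rejected at run time.
inline bool narrowing_rejected(const complex<double>& c)
{
    try {
        complex<int> ci = c;
        (void)ci;
        return false;
    }
    catch (const std::range_error&) {
        return true;
    }
}

} // namespace sketch
```

Converting a complex<float> to a complex<double> goes through the same member template and succeeds, because every float value survives the round trip through double.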
A member template cannot be virtual. For example:
    class Shape {
        // ...
        template<class T> virtual bool intersect(const T&) const =0; // error: virtual template
    };
This must be illegal. If it were allowed, the traditional virtual function table technique for implementing virtual functions (§2.5.5) could not be used. The linker would have to add a new entry to
the virtual table for class Shape each time someone called intersect() with a new argument type.
13.6.3 Inheritance Relationships [temp.rel.inheritance]
A class template is usefully understood as a specification of how particular types are to be created.
In other words, the template implementation is a mechanism that generates types when needed
based on the user’s specification. Consequently, a class template is sometimes called a type
generator.
As far as the C++ language rules are concerned, there is no relationship between two classes
generated from a single class template. For example:
    class Shape { /* ... */ };
    class Circle : public Shape { /* ... */ };
Given these declarations, people sometimes try to treat a set<Circle*> as a set<Shape*>. This is
a serious logical error based on a flawed argument: ‘‘A Circle is a Shape, so a set of Circles is also
a set of Shapes; therefore, I should be able to use a set of Circles as a set of Shapes.’’ The ‘‘therefore’’ part of this argument doesn’t hold. The reason is that a set of Circles guarantees that the
members of the set are Circles; a set of Shapes does not provide that guarantee. For example:
    class Triangle : public Shape { /* ... */ };
    void f(set<Shape*>& s)
    {
        // ...
        s.insert(new Triangle());
        // ...
    }
    void g(set<Circle*>& s)
    {
        f(s); // error, type mismatch: s is a set<Circle*>, not a set<Shape*>
    }
This won’t compile because there is no built-in conversion from set<Circle*>& to set<Shape*>&.
Nor should there be. The guarantee that the members of a set<Circle*> are Circles allows us to
safely and efficiently apply Circle-specific operations, such as determining the radius, to members
of the set. If we allowed a set<Circle*> to be treated as a set<Shape*>, we could no longer maintain that guarantee. For example, f() inserts a Triangle* into its set<Shape*> argument. If the
set<Shape*> could have been a set<Circle*>, the fundamental guarantee that a set<Circle*>
contains Circle*s only would have been violated.
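The absence of a relationship can be checked by the compiler itself. The sketch below assumes a C++11 or later compiler (static_assert and <type_traits> postdate this text) and uses std::set; the function as_shapes() is a hypothetical helper showing one safe alternative, namely building a separate set of Shape pointers by copying:

```cpp
#include <set>
#include <type_traits>

class Shape { public: virtual ~Shape() { } };
class Circle : public Shape { };

// The pointer types are related by a built-in conversion...
static_assert(std::is_convertible<Circle*, Shape*>::value,
              "Circle* converts to Shape*");
// ...but the sets generated from them are unrelated types.
static_assert(!std::is_convertible<std::set<Circle*>&, std::set<Shape*>&>::value,
              "set<Circle*>& does not convert to set<Shape*>&");

// When a set<Shape*> is genuinely needed, build a separate one by copying
// the pointers; insertions into the copy cannot disturb the original.
inline std::set<Shape*> as_shapes(const std::set<Circle*>& cs)
{
    return std::set<Shape*>(cs.begin(), cs.end());
}
```

A function that inserts a Triangle* into the result of as_shapes() changes only the copy, so the original set<Circle*> still contains Circles only.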
13.6.3.1 Template Conversions [temp.mem.temp]
The example in the previous section demonstrates that there cannot be any default relationship
between classes generated from the same templates. However, for some templates we would like to
express such a relationship. For example, when we define a pointer template, we would like to
reflect inheritance relationships among the objects pointed to. Member templates (§13.6.2) allow
us to specify many such relationships where desired. Consider:
    template<class T> class Ptr { // pointer to T
        T* p;
    public:
        Ptr(T*);
        template<class T2> operator Ptr<T2> (); // convert Ptr<T> to Ptr<T2>
        // ...
    };
We would like to define the conversion operators to provide the inheritance relationships we are
accustomed to for built-in pointers for these user-defined Ptrs. For example:
    void f(Ptr<Circle> pc)
    {
        Ptr<Shape> ps = pc;   // should work
        Ptr<Circle> pc2 = ps; // should give error
    }
We want to allow the first initialization if and only if Shape really is a direct or indirect public base
class of Circle. In general, we need to define the conversion operator so that the Ptr<T> to
Ptr<T2> conversion is accepted if and only if a T* can be assigned to a T2*. That can be done
like this:
    template<class T>
        template<class T2>
            Ptr<T>::operator Ptr<T2> () { return Ptr<T2>(p); }
The return statement will compile if and only if p (which is a T*) can be an argument to the
Ptr<T2>(T2*) constructor. Therefore, if T* can be implicitly converted into a T2*, the Ptr<T>
to Ptr<T2> conversion will work. For example:
    void f(Ptr<Circle> pc)
    {
        Ptr<Shape> ps = pc;   // ok: can convert Circle* to Shape*
        Ptr<Circle> pc2 = ps; // error: cannot convert Shape* to Circle*
    }
Be careful to define logically meaningful conversions only.
Note that the template parameter lists of a template and its template member cannot be combined. For example:
    template<class T, class T2> // error
        Ptr<T>::operator Ptr<T2> () { return Ptr<T2>(p); }
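The pieces of this section can be assembled into one small compilable sketch. The get() accessor, the const qualifier on the conversion operator, and the function shape_of() are assumptions added for the demonstration; the failing conversion is left as a comment so that the sketch compiles:

```cpp
class Shape { public: virtual ~Shape() { } };
class Circle : public Shape { };

template<class T> class Ptr {      // pointer to T
    T* p;
public:
    Ptr(T* pp) : p(pp) { }
    template<class T2> operator Ptr<T2> () const;   // convert Ptr<T> to Ptr<T2>
    T* get() const { return p; }   // accessor assumed for the demonstration
};

// Two template parameter lists: one for the class, one for the member.
template<class T>
    template<class T2>
        Ptr<T>::operator Ptr<T2> () const { return Ptr<T2>(p); }

inline Shape* shape_of(Ptr<Circle> pc)
{
    Ptr<Shape> ps = pc;       // ok: Circle* converts to Shape*
    // Ptr<Circle> pc2 = ps;  // error if uncommented: Shape* does not convert to Circle*
    return ps.get();
}
```

Note that the error for the reverse conversion is reported when the body of the conversion operator is instantiated, not when the operator is selected, so the diagnostic refers to the return statement.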
13.7 Source Code Organization [temp.source]
There are two obvious ways of organizing code using templates:
[1] Include template definitions before their use in a translation unit.
[2] Include template declarations (only) before their use in a translation unit, and compile their
definitions separately.
In addition, template functions are sometimes first declared, then used, and finally defined in a single translation unit.
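The declare, use, then define pattern within a single translation unit might look like the following sketch. The names to_text() and describe() are hypothetical:

```cpp
#include <sstream>
#include <string>

// 1. Declaration only: enough for the compiler to type-check calls.
template<class T> std::string to_text(const T& t);

// 2. Use before the definition has been seen.
inline std::string describe(int n)
{
    return "n=" + to_text(n);
}

// 3. Definition, later in the same translation unit. The compiler can
//    still generate to_text<int> because the definition is available
//    by the end of the unit.
template<class T> std::string to_text(const T& t)
{
    std::ostringstream os;
    os << t;
    return os.str();
}
```

This ordering is convenient when two templates refer to each other, or when a file is laid out with its interface first and the implementation details at the end.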
To see the differences between the two main approaches, consider a simple template:
    #include<iostream>
    template<class T> void out(const T& t) { std::cerr << t; }
We could call this out.c and #include it wherever out() was needed. For example:
    // user1.c:
    #include "out.c"
    // use out()
    // user2.c:
    #include "out.c"
    // use out()
That is, the definition of out() and all declarations it depends on are #included in several different
compilation units. It is up to the compiler to generate code when needed (only) and to optimize the
process of reading redundant definitions. This strategy treats template functions the same way as
inline functions.
One obvious problem with this is that everything on which the definition of out() depends is
added to each file using out(), thus increasing the amount of information that the compiler must
process. Another problem is that users may accidentally come to depend on declarations included
only for the benefit of the definition of out(). This danger can be minimized by using namespaces, by avoiding macros, and generally by reducing the amount of information included.
The separate compilation strategy is the logical conclusion of this line of thinking: if the template definition isn’t included in the user code, none of its dependencies can affect that code. Thus
we split the original out.c into two files:
    // out.h:
    template<class T> void out(const T& t);
    // out.c:
    #include<iostream>
    #include "out.h"
    export template<class T> void out(const T& t) { std::cerr << t; }
The file out.c now holds all of the information needed to define out(), and out.h holds only what
is needed to call it. A user #includes only the declaration (the interface):
    // user1.c:
    #include "out.h"
    // use out()
    // user2.c:
    #include "out.h"
    // use out()
This strategy treats template functions the same way it does non-inline functions. The definition (in
out.c) is compiled separately, and it is up to the implementation to find the definition of out()
when needed. This strategy also puts a burden on the implementation. Instead of having to filter
out redundant copies of a template definition, the implementation must find the unique definition
when needed.
Note that to be accessible from other compilation units, a template definition must be explicitly
declared export (§9.2.3). This can be done by adding export to the definition or to a preceding
declaration. Otherwise, the definition must be in scope wherever the template is used.
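Few compilers ever implemented export for templates, and the feature was removed from the language in C++11. Where it is unavailable, a similar division between interface and definition can be approximated with explicit instantiation (§C.13.10), at the cost of listing the argument types in advance. The following is only a sketch: the three "files" are shown as sections of one file so that it compiles as written, and out() is given an explicit stream parameter (an assumption made here so that the result can be inspected):

```cpp
#include <ostream>
#include <sstream>
#include <string>

// --- out.h: what users include (declaration only) ---
template<class T> void out(std::ostream& os, const T& t);

// --- out.c: compiled once; holds the definition and the instantiations ---
template<class T> void out(std::ostream& os, const T& t) { os << t; }

// Explicit instantiations: when out.c really is a separate file, these are
// the only argument types that other translation units can link against.
template void out<int>(std::ostream&, const int&);
template void out<double>(std::ostream&, const double&);

// --- user1.c: would include only out.h and link with out.c's object code ---
inline std::string user1()
{
    std::ostringstream os;
    out(os, 42);     // uses out<int>
    out(os, 1.5);    // uses out<double>
    return os.str();
}
```

A call such as out(os, 'x') from user code would compile against the declaration but fail at link time in the separated layout, since no out<char> was instantiated.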
Which strategy or combination of strategies is best depends on the compilation and linkage
system used, the kind of application you are building, and the external constraints on the way you
build systems. Generally, inline functions and other small functions that primarily call other template functions are candidates for inclusion into every compilation unit in which they are used. On
an implementation with average support from the linker for template instantiation, doing this can
speed up compilation and improve error messages.
Including a definition makes it vulnerable to having its meaning affected by macros and declarations in the context into which it is included. Consequently, larger template functions and template functions with nontrivial context dependencies are better compiled separately. Also, if the
definition of a template requires a large number of declarations, these declarations can have undesirable side effects if they are included into the context in which the template is used.
I consider the approach of separately compiling template definitions and including declarations
only in user code ideal. However, the application of ideals must be tempered by practical constraints, and separate compilation of templates is expensive on some implementations.
Whichever strategy is used, non-inline static members (§C.13.1) must have a unique definition
in some compilation unit. This implies that such members are best not used for templates that are
otherwise included in many translation units.
One ideal is for code to work the same whether it is compiled as a single unit or separated into
several separately translated units. That ideal should be approached by restricting a template
definition’s dependency on its environment rather than by trying to carry as much as possible of its
definition context with it into the instantiation process.
13.8 Advice [temp.advice]
[1] Use templates to express algorithms that apply to many argument types; §13.3.
[2] Use templates to express containers; §13.2.
[3] Provide specializations for containers of pointers to minimize code size; §13.5.
[4] Always declare the general form of a template before specializations; §13.5.
[5] Declare a specialization before its use; §13.5.
[6] Minimize a template definition’s dependence on its instantiation contexts; §13.2.5, §C.13.8.
[7] Define every specialization you declare; §13.5.
[8] Consider if a template needs specializations for C-style strings and arrays; §13.5.2.
[9] Parameterize with a policy object; §13.4.
[10] Use specialization and overloading to provide a single interface to implementations of the
same concept for different types; §13.5.
[11] Provide a simple interface for simple cases and use overloading and default arguments to
express less common cases; §13.5, §13.4.
[12] Debug concrete examples before generalizing to a template; §13.2.1.
[13] Remember to export template definitions that need to be accessible from other translation
units; §13.7.
[14] Separately compile large templates and templates with nontrivial context dependencies; §13.7.
[15] Use templates to express conversions but define those conversions very carefully; §13.6.3.1.
[16] Where necessary, constrain template arguments using a constraint() member function;
§13.9[16].
[17] Use explicit instantiation to minimize compile time and link time; §C.13.10.
[18] Prefer a template over derived classes when run-time efficiency is at a premium; §13.6.1.
[19] Prefer derived classes over a template if adding new variants without recompilation is important; §13.6.1.
[20] Prefer a template over derived classes when no common base can be defined; §13.6.1.
[21] Prefer a template over derived classes when built-in types and structures with compatibility
constraints are important; §13.6.1.
13.9 Exercises [temp.exercises]
1. (∗2) Fix the errors in the definition of List from §13.2.5 and write out C++ code equivalent to
what the compiler must generate for the definition of List and the function f(). Run a small
test case using your hand-generated code and the code generated by the compiler from the template version. If possible on your system given your knowledge, compare the generated code.
2. (∗3) Write a singly-linked list class template that accepts elements of any type derived from a
class Link that holds the information necessary to link elements. This is called an intrusive list.
Using this list, write a singly-linked list that accepts elements of any type (a non-intrusive list).
Compare the performance of the two list classes and discuss the tradeoffs between them.
3. (∗2.5) Write intrusive and non-intrusive doubly-linked lists. What operations should be provided in addition to the ones you found necessary to supply for a singly-linked list?
4. (∗2) Complete the String template from §13.2 based on the String class from §11.12.
5. (∗2) Define a sort() that takes its comparison criterion as a template argument. Define a class
Record with two data members count and price. Sort a vector<Record> on each data member.
6. (∗2) Implement a qsort() template.
7. (∗2) Write a program that reads (key,value) pairs and prints out the sum of the values corresponding to each distinct key. Specify what is required for a type to be a key and a value.
8. (∗2.5) Implement a simple Map class based on the Assoc class from §11.8. Make sure Map
works correctly using both C-style strings and strings as keys. Make sure Map works correctly
for types with and without default constructors. Provide a way of iterating over the elements of
a Map.
9. (∗3) Compare the performance of the word count program from §11.8 against a program not
using an associative array. Use the same style of I/O in both cases.
10. (∗3) Re-implement Map from §13.9[8] using a more suitable data structure (e.g., a red-black
tree or a Splay tree).
11. (∗2.5) Use Map to implement a topological sort function. Topological sort is described in
[Knuth,1968] vol. 1 (second edition), pg 262.
12. (∗1.5) Make the sum program from §13.9[7] work correctly for names containing spaces; for
example, ‘‘thumb tack.’’
13. (∗2) Write readline() templates for different kinds of lines. For example (item,count,price).
14. (∗2) Use the technique outlined for Literate in §13.4 to sort strings in reverse lexicographical
order. Make sure the technique works both for C++ implementations where char is signed and
for C++ implementations where it is unsigned. Use a variant of that technique to provide a sort
that is not case-sensitive.
15. (∗1.5) Construct an example that demonstrates at least three differences between a function template and a macro (not counting the differences in definition syntax).
16. (∗2) Devise a scheme that ensures that the compiler tests general constraints on the template
arguments for every template for which an object is constructed. It is not sufficient just to test
constraints of the form ‘‘the argument T must be a class derived from My_base.’’
14 Exception Handling
Don’t interrupt me
while I’m interrupting.
– Winston S. Churchill
Error handling — grouping of exceptions — catching exceptions — catch all — rethrow — resource management — auto_ptr — exceptions and new — resource exhaustion — exceptions in constructors — exceptions in destructors — exceptions that are not
errors — exception specifications — unexpected exceptions — uncaught exceptions —
exceptions and efficiency — error-handling alternatives — standard exceptions —
advice — exercises.
14.1 Error Handling [except.error]
As pointed out in §8.3, the author of a library can detect run-time errors but does not in general
have any idea what to do about them. The user of a library may know how to cope with such errors
but cannot detect them – or else they would have been handled in the user’s code and not left for
the library to find. The notion of an exception is provided to help deal with such problems. The
fundamental idea is that a function that finds a problem it cannot cope with throws an exception,
hoping that its (direct or indirect) caller can handle the problem. A function that wants to handle
that kind of problem can indicate that it is willing to catch that exception (§2.4.2, §8.3).
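The division of labor described here can be sketched in a few lines. The functions element_at() and element_or() are hypothetical; std::out_of_range is the standard library exception used to report the problem:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// "Library" side: can detect the problem but does not know what to do about it.
inline int element_at(const std::vector<int>& v, std::size_t i)
{
    if (i >= v.size()) throw std::out_of_range("element_at: index out of range");
    return v[i];
}

// "User" side: knows how to cope, so it indicates that it is willing to catch.
inline int element_or(const std::vector<int>& v, std::size_t i, int fallback)
{
    try {
        return element_at(v, i);
    }
    catch (const std::out_of_range&) {
        return fallback;       // recovery policy chosen by the caller
    }
}
```

The library function contains no recovery policy at all; a different caller could log the error, retry with another index, or let the exception propagate further.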
This style of error handling compares favorably with more traditional techniques. Consider the
alternatives. Upon detecting a problem that cannot be handled locally, the program could:
[1] terminate the program,
[2] return a value representing ‘‘error,’’
[3] return a legal value and leave the program in an illegal state, or
[4] call a function supplied to be called in case of ‘‘error.’’
Case [1], ‘‘terminate the program,’’ is what happens by default when an exception isn’t caught.
For most errors, we can and must do better. In particular, a library that doesn’t know about the purpose and general strategy of the program in which it is embedded cannot simply exit() or
abort(). A library that unconditionally terminates cannot be used in a program that cannot afford
to crash. One way of viewing exceptions is as a way of giving control to a caller when no meaningful action can be taken locally.
Case [2], ‘‘return an error value,’’ isn’t always feasible because there is often no acceptable
‘‘error value.’’ For example, if a function returns an int, every int might be a plausible result.
Even where this approach is feasible, it is often inconvenient because every call must be checked
for the error value. This can easily double the size of a program (§14.8). Consequently, this
approach is rarely used systematically enough to detect all errors.
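The contrast can be illustrated with a function for which every int is a plausible result. In this sketch the names try_average(), average(), and difference_of_averages() are hypothetical; the first reports failure through a separate return value, the second through an exception:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Error-value style: every int is a plausible average, so success must be
// reported separately and every caller must remember to test it.
inline bool try_average(const std::vector<int>& v, int& result)
{
    if (v.empty()) return false;
    long long sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i) sum += v[i];
    result = static_cast<int>(sum / static_cast<long long>(v.size()));
    return true;
}

// Exception style: the result is returned directly; failure cannot be ignored.
inline int average(const std::vector<int>& v)
{
    int r = 0;
    if (!try_average(v, r)) throw std::domain_error("average: empty sequence");
    return r;
}

// A caller can now combine results without a test after every call.
inline int difference_of_averages(const std::vector<int>& a, const std::vector<int>& b)
{
    return average(a) - average(b);
}
```

Written against try_average(), difference_of_averages() would need two extra variables, two tests, and some way of reporting failure to its own caller.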
Case [3], ‘‘return a legal value and leave the program in an illegal state,’’ has the problem that
the calling function may not notice that the program has been put in an illegal state. For example,
many standard C library functions set the global variable errno to indicate an error (§20.4.1,
§22.3). However, programs typically fail to test errno consistently enough to avoid consequential
errors caused by values returned from failed calls. Furthermore, the use of global variables for
recording error conditions doesn’t work well in the presence of concurrency.
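The standard C function strtol() shows the problem in miniature: on overflow it returns LONG_MAX, a perfectly legal long, and records the failure only in errno. The wrapper overflowed() below is a hypothetical helper written for this sketch:

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Returns true if strtol() reported overflow for the text s.
// The value produced on overflow (LONG_MAX) is a legal long, so a caller
// that forgets to look at errno silently continues with a wrong number.
inline bool overflowed(const char* s, long& value)
{
    errno = 0;                      // must be cleared first: successful calls leave it alone
    value = std::strtol(s, 0, 10);
    return errno == ERANGE;
}
```

Both the need to clear errno before the call and the need to test it afterwards are conventions that the compiler cannot enforce.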
Exception handling is not meant to handle problems for which case [4], ‘‘call an error-handler
function,’’ is relevant. However, in the absence of exceptions, an error-handler function has
exactly the three other cases as alternatives for how it handles the error. For a further discussion of
error-handling functions and exceptions, see §14.4.5.
The exception-handling mechanism provides an alternative to the traditional techniques when
they are insufficient, inelegant, and error-prone. It provides a way of explicitly separating error-handling code from ‘‘ordinary’’ code, thus making the program more readable and more amenable
to tools. The exception-handling mechanism provides a more regular style of error handling, thus
simplifying cooperation between separately written program fragments.
One aspect of the exception-handling scheme that will appear novel to C and Pascal programmers is that the default response to an error (especially to an error in a library) is to terminate the
program. The traditional response has been to muddle through and hope for the best. Thus, exception handling makes programs more ‘‘brittle’’ in the sense that more care and effort must be taken
to get a program to run acceptably. This seems preferable, though, to getting wrong results later in
the development process – or after the development process is considered complete and the program is handed over to innocent users. Where termination is unacceptable, we can catch all exceptions (§14.3.2) or catch all exceptions of a specific kind (§14.6.2). Thus, an exception terminates a
program only if a programmer allows it to terminate. This is preferable to the unconditional termination that happens when a traditional incomplete recovery leads to a catastrophic error.
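One way of ensuring that no exception terminates the program is to run work through a guard that catches everything. In this sketch run_guarded() and the three sample tasks are hypothetical, and the status codes 0, 1, and 2 are arbitrary choices:

```cpp
#include <exception>
#include <stdexcept>

// Runs a task and converts any escaping exception into a status code,
// so that the program terminates only if the programmer lets it.
template<class F> int run_guarded(F task)
{
    try {
        task();
        return 0;
    }
    catch (const std::exception&) {   // exceptions of a specific kind
        return 1;
    }
    catch (...) {                     // everything else
        return 2;
    }
}

// Three sample tasks used only for illustration.
inline void succeeds() { }
inline void fails_with_standard_exception() { throw std::runtime_error("disk full"); }
inline void fails_with_int() { throw 42; }
```

The order of the handlers matters: the more specific handler must come first, because catch (...) matches every exception.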
Sometimes people have tried to alleviate the unattractive aspects of ‘‘muddling through’’ by
writing out error messages, putting up dialog boxes asking the user for help, etc. Such approaches
are primarily useful in debugging situations in which the user is a programmer familiar with the
structure of the program. In the hands of nondevelopers, a library that asks the (possibly absent)
user/operator for help is unacceptable. Also, in many cases error messages have no place to go
(say, if the program runs in an environment in which cerr doesn’t connect to anything a user
notices); they would be incomprehensible to an end user anyway. At a minimum, the error message might be in the wrong natural language (say, in Finnish to an English user). Worse, the error
message would typically refer to library concepts completely unknown to a user (say, ‘‘bad argument to atan2,’’ caused by bad input to a graphics system). A good library doesn’t ‘‘blabber’’ in
this way. Exceptions provide a way for code that detects a problem from which it cannot recover to
pass the problem on to some part of the system that might be able to recover. Only a part of the
system that has some idea of the context in which the program runs has any chance of composing a
meaningful error message.
The exception-handling mechanism can be seen as a run-time analog to the compile-time type
checking and ambiguity control mechanisms. It makes the design process more important and can
increase the work needed to get an initial and buggy version of a program running. However, the
result is code that has a much better chance to run as expected, to run as an acceptable part of a
larger program, to be comprehensible to other programmers, and to be amenable to manipulation by
tools. Similarly, exception handling provides specific language features to support ‘‘good style’’ in
the same way other C++ features support ‘‘good style’’ that can be practiced only informally and
incompletely in languages such as C and Pascal.
It should be recognized that error handling will remain a difficult task and that the exception-handling mechanism – although more formalized than the techniques it replaces – is still relatively
unstructured compared with language features involving only local control flow. The C++
exception-handling mechanism provides the programmer with a way of handling errors where they
are most naturally handled, given the structure of a system. Exceptions make the complexity of
error handling visible. However, exceptions are not the cause of that complexity. Be careful not to
blame the messenger for bad news.
This may be a good time to review §8.3, where the basic syntax, semantics, and style-of-use
aspects of exception handling are presented.
14.1.1 Alternative Views on Exceptions [except.views]
‘‘Exception’’ is one of those words that means different things to different people. The C++
exception-handling mechanism is designed to support handling of errors and other exceptional conditions (hence the name). In particular, it is intended to support error handling in programs composed of independently developed components.
The mechanism is designed to handle only synchronous exceptions, such as array range checks
and I/O errors. Asynchronous events, such as keyboard interrupts and certain arithmetic errors, are
not necessarily exceptional and are not handled directly by this mechanism. Asynchronous events
require mechanisms fundamentally different from exceptions (as defined here) to handle them
cleanly and efficiently. Many systems offer mechanisms, such as signals, to deal with asynchrony,
but because these tend to be system-dependent, they are not described here.
The exception-handling mechanism is a nonlocal control structure based on stack unwinding
(§14.4) that can be seen as an alternative return mechanism. There are therefore legitimate uses of
exceptions that have nothing to do with errors (§14.5). However, the primary aim of the
exception-handling mechanism and the focus of this chapter is error handling and the support of
fault tolerance.
Standard C++ doesn’t have the notion of a thread or a process. Consequently, exceptional circumstances relating to concurrency are not discussed here. The concurrency facilities available on
your system are described in its documentation. Here, I’ll just note that the C++ exception-handling mechanism was designed to be effective in a concurrent program as long as the programmer (or system) enforces basic concurrency rules, such as properly locking a shared data structure
while using it.
The C++ exception-handling mechanisms are provided to report and handle errors and exceptional events. However, the programmer must decide what it means to be exceptional in a given
program. This is not always easy (§14.5). Can an event that happens most times a program is run
be considered exceptional? Can an event that is planned for and handled be considered an error?
The answer to both questions is yes. ‘‘Exceptional’’ does not mean ‘‘almost never happens’’ or
‘‘disastrous.’’ It is better to think of an exception as meaning ‘‘some part of the system couldn’t do
what it was asked to do.’’ Usually, we can then try something else. Exception throws should be
infrequent compared to function calls or the structure of the system has been obscured. However,
we should expect most large programs to throw and catch at least some exceptions in the course of
a normal and successful run.
14.2 Grouping of Exceptions [except.grouping]
An exception is an object of some class representing an exceptional occurrence. Code that detects
an error (often a library) throws an object (§8.3). A piece of code expresses desire to handle an
exception by a catch clause. The effect of a throw is to unwind the stack until a suitable catch is
found (in a function that directly or indirectly invoked the function that threw the exception).
Often, exceptions fall naturally into families. This implies that inheritance can be useful to
structure exceptions and to help exception handling. For example, the exceptions for a mathematical library might be organized like this:
    class Matherr { };
    class Overflow: public Matherr { };
    class Underflow: public Matherr { };
    class Zerodivide: public Matherr { };
    // ...
This allows us to handle any Matherr without caring precisely which kind it is. For example:
    void f()
    {
        try {
            // ...
        }
        catch (Overflow) {
            // handle Overflow or anything derived from Overflow
        }
        catch (Matherr) {
            // handle any Matherr that is not Overflow
        }
    }
Here, an Overflow is handled specifically. All other Matherr exceptions will be handled by the
general case.
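The grouping idea can be tried out with a small, self-contained sketch. The helper functions compute() and classify() are hypothetical names introduced only for this illustration; the handlers catch by const reference, which is one reasonable choice rather than the only one.

```cpp
#include <string>

// Illustrative hierarchy mirroring the Matherr family in the text.
class Matherr { };
class Overflow : public Matherr { };
class Underflow : public Matherr { };
class Zerodivide : public Matherr { };

// Hypothetical helper: throws the exception selected by kind.
void compute(int kind)
{
    switch (kind) {
    case 1: throw Overflow();
    case 2: throw Underflow();
    case 3: throw Zerodivide();
    default: return;    // no error
    }
}

// Reports which handler (if any) dealt with the exception.
std::string classify(int kind)
{
    try {
        compute(kind);
    }
    catch (const Overflow&) {
        return "overflow";            // the specific case
    }
    catch (const Matherr&) {
        return "other math error";    // Underflow, Zerodivide, and any class added later
    }
    return "ok";
}
```

A class derived from Matherr at some later date would be reported as "other math error" without any change to classify().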
Organizing exceptions into hierarchies can be important for robustness of code. For example,
consider how you would handle all exceptions from a library of mathematical functions without
such a grouping mechanism. This would have to be done by exhaustively listing the exceptions:
    void g()
    {
        try {
            // ...
        }
        catch (Overflow) { /* ... */ }
        catch (Underflow) { /* ... */ }
        catch (Zerodivide) { /* ... */ }
    }
This is not only tedious, but a programmer can easily forget to add an exception to the list. Consider what would be needed if we didn’t group math exceptions. When we added a new exception
to the math library, every piece of code that tried to handle every math exception would have to be
modified. In general, such universal update is not feasible after the initial release of the library.
Often, there is no way of finding every relevant piece of code. Even when there is, we cannot in
general assume that every piece of source code is available or that we would be willing to make
changes if it were. These recompilation and maintenance problems would lead to a policy that no
new exceptions can be added to a library after its first release; that would be unacceptable for
almost all libraries. This reasoning leads exceptions to be defined as per-library or per-subsystem
class hierarchies (§14.6.2).
Please note that neither the built-in mathematical operations nor the basic math library (shared
with C) reports arithmetic errors as exceptions. One reason for this is that detection of some arithmetic errors, such as divide-by-zero, is asynchronous on many pipelined machine architectures.
The Matherr hierarchy described here is only an illustration. The standard library exceptions are
described in §14.10.
14.2.1 Derived Exceptions [except.derived]
The use of class hierarchies for exception handling naturally leads to handlers that are interested
only in a subset of the information carried by exceptions. In other words, an exception is typically
caught by a handler for its base class rather than by a handler for its exact class. The semantics for
catching and naming an exception are identical to those of a function accepting an argument. That
is, the formal argument is initialized with the argument value (§7.2). This implies that the exception thrown is ‘‘sliced’’ to the exception caught (§12.2.3). For example:
    class Matherr {
        // ...
        virtual void debug_print() const { cerr << "Math error"; }
    };
    class Int_overflow: public Matherr {
        const char* op;
        int a1, a2;
    public:
        Int_overflow(const char* p, int a, int b) { op = p; a1 = a; a2 = b; }
        virtual void debug_print() const { cerr << op << '(' << a1 << ',' << a2 << ')'; }
        // ...
    };
    void f()
    {
        try {
            g();
        }
        catch (Matherr m) {
            // ...
        }
    }
When the Matherr handler is entered, m is a Matherr object – even if the call to g() threw
Int_overflow. This implies that the extra information found in an Int_overflow is inaccessible.
As always, pointers or references can be used to avoid losing information permanently. For
example, we might write:
    int add(int x, int y)
    {
        if ((x>0 && y>0 && x>INT_MAX-y) || (x<0 && y<0 && x<INT_MIN-y))
            throw Int_overflow("+",x,y);
        return x+y;    // x+y will not overflow
    }
    void f()
    {
        try {
            int i1 = add(1,2);
            int i2 = add(INT_MAX,-2);
            int i3 = add(INT_MAX,2);    // here we go!
        }
        catch (Matherr& m) {
            // ...
            m.debug_print();
        }
    }
The last call of add() triggers an exception that causes Int_overflow::debug_print() to be
invoked. Had the exception been caught by value rather than by reference,
Matherr::debug_print() would have been invoked instead.
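The difference between the two handlers can be observed directly. The following is a minimal sketch, not the text's exact classes: describe() is a hypothetical stand-in for debug_print() that returns a string instead of writing to cerr, so that the result can be inspected.

```cpp
#include <climits>
#include <sstream>
#include <string>

class Matherr {
public:
    virtual ~Matherr() { }
    virtual std::string describe() const { return "Math error"; }
};

class Int_overflow : public Matherr {
    const char* op;
    int a1, a2;
public:
    Int_overflow(const char* p, int a, int b) : op(p), a1(a), a2(b) { }
    std::string describe() const override
    {
        std::ostringstream os;
        os << op << '(' << a1 << ',' << a2 << ')';
        return os.str();
    }
};

int add(int x, int y)
{
    if ((x>0 && y>0 && x>INT_MAX-y) || (x<0 && y<0 && x<INT_MIN-y))
        throw Int_overflow("+", x, y);
    return x+y;    // x+y will not overflow
}

std::string caught_by_value()
{
    try { add(INT_MAX, 2); }
    catch (Matherr m) { return m.describe(); }     // m is a sliced copy: Matherr::describe()
    return "no error";
}

std::string caught_by_reference()
{
    try { add(INT_MAX, 2); }
    catch (Matherr& m) { return m.describe(); }    // m refers to the Int_overflow thrown
    return "no error";
}
```

Many compilers warn about catching a polymorphic type by value, which is a useful hint that slicing is probably unintended.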
14.2.2 Composite Exceptions [except.composite]
Not every grouping of exceptions is a tree structure. Often, an exception belongs to two groups.
For example:
    class Netfile_err : public Network_err, public File_system_err { /* ... */ };
Such a Netfile_err can be caught by functions dealing with network exceptions:
    void f()
    {
        try {
            // something
        }
        catch(Network_err& e) {
            // ...
        }
    }
and also by functions dealing with file system exceptions:
    void g()
    {
        try {
            // something else
        }
        catch(File_system_err& e) {
            // ...
        }
    }
This nonhierarchical organization of error handling is important where services, such as networking, are transparent to users. In this case, the writer of gg() might not even be aware that a network
is involved (see also §14.6).
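As a small sketch of this, assume empty placeholder classes for the two error families and a hypothetical function open_remote_file() that always fails. The same exception object is then handled by whichever layer happens to be on the call stack.

```cpp
#include <string>

class Network_err { public: virtual ~Network_err() { } };
class File_system_err { public: virtual ~File_system_err() { } };
class Netfile_err : public Network_err, public File_system_err { };

// Hypothetical operation that fails on a file accessed over a network.
void open_remote_file() { throw Netfile_err(); }

// A layer that knows only about networking.
std::string network_layer()
{
    try { open_remote_file(); }
    catch (const Network_err&) { return "network handler"; }
    return "ok";
}

// A layer that knows only about file systems.
std::string file_layer()
{
    try { open_remote_file(); }
    catch (const File_system_err&) { return "file-system handler"; }
    return "ok";
}
```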
14.3 Catching Exceptions [except.catch]
Consider:
    void f()
    {
        try {
            throw E();
        }
        catch(H) {
            // when do we get here?
        }
    }
The handler is invoked:
[1] If H is the same type as E.
[2] If H is an unambiguous public base of E.
[3] If H and E are pointer types and [1] or [2] holds for the types to which they refer.
[4] If H is a reference and [1] or [2] holds for the type to which H refers.
In addition, we can add const to the type used to catch an exception in the same way that we can
add it to a function parameter. This doesn’t change the set of exceptions we can catch; it only
restricts us from modifying the exception caught.
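The matching rules can be exercised one at a time. In this sketch H, E, and Other are placeholder types and the function names are invented for the illustration; each function returns the name of the handler that was entered, or "missed" if none matched.

```cpp
#include <string>

struct H { };
struct E : H { };      // H is an unambiguous public base of E
struct Other { };      // unrelated to H

std::string by_base_value()
{
    try { throw E(); }
    catch (H) { return "H"; }                  // rule [2]
    catch (...) { }
    return "missed";
}

std::string by_base_pointer()
{
    static E e;
    try { throw &e; }                          // throws an E*
    catch (H*) { return "H*"; }                // rule [3]
    catch (...) { }
    return "missed";
}

std::string by_const_reference()
{
    try { throw E(); }
    catch (const H&) { return "const H&"; }    // rule [4]; const changes nothing about matching
    catch (...) { }
    return "missed";
}

std::string unrelated_type()
{
    try { throw Other(); }
    catch (const H&) { return "const H&"; }    // not entered: Other is not derived from H
    catch (...) { }
    return "missed";
}
```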
In principle, an exception is copied when it is thrown, so the handler gets hold of a copy of the
original exception. In fact, an exception may be copied several times before it is caught. Consequently, we cannot throw an exception that cannot be copied. The implementation may apply a
wide variety of strategies for storing and transmitting exceptions. It is guaranteed, however, that
there is sufficient memory to allow new to throw the standard out-of-memory exception, bad_alloc
(§14.4.5).
14.3.1 Re-Throw [except.rethrow]
Having caught an exception, it is common for a handler to decide that it can’t completely handle
the error. In that case, the handler typically does what can be done locally and then throws the
exception again. Thus, an error can be handled where it is most appropriate. This is the case even
when the information needed to best handle the error is not available in a single place, so that the
recovery action is best distributed over several handlers. For example:
    void h()
    {
        try {
            // code that might throw Math errors
        }
        catch (Matherr) {
            if (can_handle_it_completely) {
                // handle the Matherr
                return;
            }
            else {
                // do what can be done here
                throw;    // re-throw the exception
            }
        }
    }
A re-throw is indicated by a throw without an operand. If a re-throw is attempted when there is no
exception to re-throw, terminate() (§14.7) will be called. A compiler can detect and warn about
some, but not all, such cases.
The exception re-thrown is the original exception caught and not just the part of it that was
accessible as a Matherr. In other words, had an Int_overflow been thrown, a caller of h() could
still catch an Int_overflow that h() had caught as a Matherr and decided to re-throw.
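To see that a re-throw preserves the original exception, compare it with throwing the handler's own (sliced) copy. This is a minimal sketch with empty placeholder classes; the function names are invented for the illustration.

```cpp
#include <string>

class Matherr { };
class Int_overflow : public Matherr { };

void rethrow_original()
{
    try { throw Int_overflow(); }
    catch (Matherr) { throw; }        // re-throws the Int_overflow originally thrown
}

void throw_copy()
{
    try { throw Int_overflow(); }
    catch (Matherr m) { throw m; }    // throws a new exception of type Matherr
}

// Reports the most specific type the caller was able to catch.
std::string seen_by_caller(void (*h)())
{
    try { h(); }
    catch (const Int_overflow&) { return "Int_overflow"; }
    catch (const Matherr&) { return "Matherr"; }
    return "nothing";
}
```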
14.3.2 Catch Every Exception [except.every]
A degenerate version of this catch-and-rethrow technique can be important. As for functions, the
ellipsis ... indicates ‘‘any argument’’ (§7.6), so catch(...) means ‘‘catch any exception.’’
For example:
    void m()
    {
        try {
            // something
        }
        catch (...) {    // handle every exception
            // cleanup
            throw;
        }
    }
That is, if any exception occurs as the result of executing the main part of m(), the cleanup action
in the handler is invoked. Once the local cleanup is done, the exception that caused the cleanup is
re-thrown to trigger further error handling. See §14.6.3.2 for a technique to gain information about
an exception caught by a ... handler.
One important aspect of error handling in general and exception handling in particular is to
maintain invariants assumed by the program (§24.3.7.1). For example, if m() is supposed to leave
certain pointers in the state in which it found them, then we can write code in the handler to give
them acceptable values. Thus, a ‘‘catch every exception’’ handler can be used to maintain arbitrary
invariants. However, for many important cases such a handler is not the most elegant solution to
this problem (see §14.4).
14.3.2.1 Order of Handlers [except.order]
Because a derived exception can be caught by handlers for more than one exception type, the order
in which the handlers are written in a try statement is significant. The handlers are tried in order.
For example:
    void f()
    {
        try {
            // ...
        }
        catch (std::ios_base::failure) {
            // handle any stream io error (§14.10)
        }
        catch (std::exception& e) {
            // handle any standard library exception (§14.10)
        }
        catch (...) {
            // handle any other exception (§14.3.2)
        }
    }
Because the compiler knows the class hierarchy, it can catch many logical mistakes. For example:
    void g()
    {
        try {
            // ...
        }
        catch (...) {
            // handle every exception (§14.3.2)
        }
        catch (std::exception& e) {
            // handle any standard library exception (§14.10)
        }
        catch (std::bad_cast) {
            // handle dynamic_cast failure (§15.4.2)
        }
    }
Here, the exception will never be considered. Even if we removed the ‘‘catch-all’’ handler,
bad_cast wouldn’t be considered because it is derived from exception.
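A correctly ordered set of handlers, most specific first, can be checked with a short sketch. The function which_handler() and its integer selector are assumptions made for the illustration only.

```cpp
#include <exception>
#include <ios>
#include <string>
#include <typeinfo>

// Throws an exception selected by kind and reports which handler caught it.
std::string which_handler(int kind)
{
    try {
        switch (kind) {
        case 0: throw std::ios_base::failure("stream error");
        case 1: throw std::bad_cast();
        case 2: throw 42;    // not a standard library exception
        }
    }
    catch (const std::ios_base::failure&) {
        return "stream";      // most specific handler first
    }
    catch (const std::exception&) {
        return "standard";    // bad_cast ends up here
    }
    catch (...) {
        return "other";       // the catch-all must come last
    }
    return "none";
}
```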
14.4 Resource Management [except.resource]
When a function acquires a resource – that is, it opens a file, allocates some memory from the free
store, sets an access control lock, etc. – it is often essential for the future running of the system that
the resource be properly released. Often that ‘‘proper release’’ is achieved by having the function
that acquired it release it before returning to its caller. For example:
    void use_file(const char* fn)
    {
        FILE* f = fopen(fn,"w");
        // use f
        fclose(f);
    }
This looks plausible until you realize that if something goes wrong after the call of fopen() and
before the call of fclose(), an exception may cause use_file() to be exited without fclose()
being called. Exactly the same problem can occur in languages that do not support exception handling. For example, the standard C library function longjmp() can cause the same problem. Even
an ordinary return-statement could exit use_file() without closing f.
A first attempt to make use_file() fault-tolerant looks like this:
    void use_file(const char* fn)
    {
        FILE* f = fopen(fn,"r");
        try {
            // use f
        }
        catch (...) {
            fclose(f);
            throw;
        }
        fclose(f);
    }
The code using the file is enclosed in a try block that catches every exception, closes the file, and
re-throws the exception.
The problem with this solution is that it is verbose, tedious, and potentially expensive. Furthermore, any verbose and tedious solution is error-prone because programmers get bored. Fortunately,
there is a more elegant solution. The general form of the problem looks like this:
    void acquire()
    {
        // acquire resource 1
        // ...
        // acquire resource n
        // use resources
        // release resource n
        // ...
        // release resource 1
    }
It is typically important that resources are released in the reverse order of their acquisition. This
strongly resembles the behavior of local objects created by constructors and destroyed by
destructors. Thus, we can handle such resource acquisition and release problems by a suitable use
of objects of classes with constructors and destructors. For example, we can define a class File_ptr
that acts like a FILE*:
    class File_ptr {
        FILE* p;
    public:
        File_ptr(const char* n, const char* a) { p = fopen(n,a); }
        File_ptr(FILE* pp) { p = pp; }
        ~File_ptr() { fclose(p); }
        operator FILE*() { return p; }
    };
We can construct a File_ptr given either a FILE* or the arguments required for fopen(). In either
case, a File_ptr will be destroyed at the end of its scope and its destructor will close the file. Our
program now shrinks to this minimum:
    void use_file(const char* fn)
    {
        File_ptr f(fn,"r");
        // use f
    }
The destructor will be called independently of whether the function is exited normally or exited
because an exception is thrown. That is, the exception-handling mechanisms enable us to remove
the error-handling code from the main algorithm. The resulting code is simpler and less error-prone than its traditional counterpart.
The process of searching ‘‘up through the stack’’ to find a handler for an exception is commonly called ‘‘stack unwinding.’’ As the call stack is unwound, the destructors for constructed
local objects are invoked.
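The order of events during unwinding can be made visible with a small sketch. Resource is a hypothetical stand-in for a class such as File_ptr; instead of opening a file it records its acquisition and release in a global list named events, which exists only for this illustration.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::string> events;    // hypothetical log of what happened

class Resource {
    std::string name;
public:
    explicit Resource(const std::string& n) : name(n) { events.push_back("acquire " + name); }
    ~Resource() { events.push_back("release " + name); }
};

void use_resources(bool fail)
{
    Resource r1("1");
    Resource r2("2");
    if (fail) throw std::runtime_error("failure while using resources");
    events.push_back("used");
}    // r2 and then r1 are destroyed here, on normal exit or during unwinding

std::vector<std::string> run(bool fail)
{
    events.clear();
    try { use_resources(fail); }
    catch (const std::runtime_error&) { events.push_back("handled"); }
    return events;
}
```

In both cases the resources are released in the reverse order of their acquisition, and in the failing case they are released before the handler is entered.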
14.4.1 Using Constructors and Destructors [except.using]
The technique for managing resources using local objects is usually referred to as ‘‘resource acquisition is initialization.’’ This is a general technique that relies on the properties of constructors and
destructors and their interaction with exception handling.
An object is not considered constructed until its constructor has completed. Then and only then
will stack unwinding call the destructor for the object. An object composed of sub-objects is constructed to the extent that its sub-objects have been constructed. An array is constructed to the
extent that its elements have been constructed (and only fully constructed elements are destroyed
during unwinding).
A constructor tries to ensure that its object is completely and correctly constructed. When that
cannot be achieved, a well-written constructor restores – as far as possible – the state of the system
to what it was before creation. Ideally, naively written constructors always achieve one of these
alternatives and don’t leave their objects in some ‘‘half-constructed’’ state. This can be achieved
by applying the ‘‘resource acquisition is initialization’’ technique to the members.
Consider a class X for which a constructor needs to acquire two resources: a file x and a lock y.
This acquisition might fail and throw an exception. Class X’s constructor must never return having
acquired the file but not the lock. Furthermore, this should be achieved without imposing a burden
of complexity on the programmer. We use objects of two classes, File_ptr and Lock_ptr, to represent the acquired resources. The acquisition of a resource is represented by the initialization of the
local object that represents the resource:
    class X {
        File_ptr aa;
        Lock_ptr bb;
    public:
        X(const char* x, const char* y)
            : aa(x,"rw"),    // acquire ‘x’
              bb(y)          // acquire ‘y’
        {}
        // ...
    };
Now, as in the local object case, the implementation can take care of all of the bookkeeping. The
user doesn’t have to keep track at all. For example, if an exception occurs after aa has been constructed but before bb has been, then the destructor for aa but not for bb will be invoked.
This implies that where this simple model for acquisition of resources is adhered to, the author
of the constructor need not write explicit exception-handling code.
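The bookkeeping done by the implementation can be observed with a sketch in which the second member fails to construct. File_handle and Lock_handle are hypothetical stand-ins for File_ptr and Lock_ptr that record events instead of touching real resources.

```cpp
#include <string>
#include <vector>

std::vector<std::string> events;    // hypothetical log of what happened

struct Lock_failure { };

class File_handle {
public:
    File_handle() { events.push_back("file acquired"); }
    ~File_handle() { events.push_back("file released"); }
};

class Lock_handle {
public:
    explicit Lock_handle(bool available)
    {
        if (!available) throw Lock_failure();
        events.push_back("lock acquired");
    }
    ~Lock_handle() { events.push_back("lock released"); }
};

class X {
    File_handle aa;
    Lock_handle bb;
public:
    explicit X(bool lock_available) : aa(), bb(lock_available) { events.push_back("X constructed"); }
    ~X() { events.push_back("X destroyed"); }
};

std::vector<std::string> build(bool lock_available)
{
    events.clear();
    try { X x(lock_available); }
    catch (const Lock_failure&) { events.push_back("construction failed"); }
    return events;
}
```

When the lock cannot be acquired, only aa is destroyed; neither the destructor for bb nor the destructor for X runs, because neither object was ever completely constructed.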
The most common resource acquired in an ad-hoc manner is memory. For example:
    class Y {
        int* p;
        void init();
    public:
        Y(int s) { p = new int[s]; init(); }
        ~Y() { delete[] p; }
        // ...
    };
This practice is common and can lead to ‘‘memory leaks.’’ If an exception is thrown by init(),
then the store acquired will not be freed; the destructor will not be called because the object wasn’t
completely constructed. A safe variant is:
    class Z {
        vector<int> p;
        void init();
    public:
        Z(int s) : p(s) { init(); }
        // ...
    };
The memory used by p is now managed by vector. If init() throws an exception, the memory
acquired will be freed when the destructor for p is (implicitly) invoked.
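One way to check that nothing is left behind is to store elements that count themselves. In this sketch Counted is a hypothetical element type used only to make the cleanup observable, and init() always fails.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical element type that keeps track of how many instances exist.
struct Counted {
    static int live;
    Counted() { ++live; }
    Counted(const Counted&) { ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;

class Z {
    std::vector<Counted> p;
    void init() { throw std::runtime_error("init failed"); }    // always fails in this sketch
public:
    explicit Z(int s) : p(s) { init(); }    // p is fully constructed before init() runs
};

// Returns the number of Counted objects still alive after a failed construction.
int live_after_failed_construction()
{
    try { Z z(5); }
    catch (const std::runtime_error&) { }
    return Counted::live;
}
```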
14.4.2 Auto_ptr [except.autoptr]
The standard library provides the template class auto_ptr, which supports the ‘‘resource acquisition
is initialization’’ technique. Basically, an auto_ptr is initialized by a pointer and can be dereferenced in the way that a pointer can. Also, the object pointed to will be implicitly deleted at the end
of the auto_ptr’s scope. For example:
    void f(Point p1, Point p2, auto_ptr<Circle> pc, Shape* pb)    // remember to delete pb on exit
    {
        auto_ptr<Shape> p(new Rectangle(p1,p2));    // p points to a rectangle
        auto_ptr<Shape> pbox(pb);
        p->rotate(45);    // use auto_ptr<Shape> exactly as a Shape*
        // ...
        if (in_a_mess) throw Mess();
        // ...
    }
Here the Rectangle, the Shape pointed to by pb, and the Circle pointed to by pc are deleted
whether or not an exception is thrown.
To achieve this ownership semantics (also called destructive copy semantics), auto_ptrs have a
copy semantics that differs radically from that of ordinary pointers: When one auto_ptr is copied
into another, the source no longer points to anything. Because copying an auto_ptr modifies it, a
const auto_ptr cannot be copied.
The auto_ptr template is declared in <memory>. It can be described by an implementation:
    template<class X> class std::auto_ptr {
        template <class Y> struct auto_ptr_ref { /* ... */ };    // helper class
        X* ptr;
    public:
        typedef X element_type;
        explicit auto_ptr(X* p =0) throw() { ptr=p; }
        auto_ptr(auto_ptr& a) throw() { ptr=a.ptr; a.ptr=0; }    // note: not const auto_ptr&
        template<class Y> auto_ptr(auto_ptr<Y>& a) throw() { ptr=a.release(); }
        auto_ptr& operator=(auto_ptr& a) throw() { reset(a.release()); return *this; }
        template<class Y> auto_ptr& operator=(auto_ptr<Y>& a) throw() { reset(a.release()); return *this; }
        ~auto_ptr() throw() { delete ptr; }
        X& operator*() const throw() { return *ptr; }
        X* operator->() const throw() { return ptr; }
        X* get() const throw() { return ptr; }    // extract pointer
        X* release() throw() { X* t = ptr; ptr=0; return t; }    // relinquish ownership
        void reset(X* p =0) throw() { if (p!=ptr) { delete ptr; ptr=p; } }
        auto_ptr(auto_ptr_ref<X>) throw();    // copy from auto_ptr_ref
        template<class Y> operator auto_ptr_ref<Y>() throw();    // copy to auto_ptr_ref
        template<class Y> operator auto_ptr<Y>() throw();    // destructive copy from auto_ptr
    };
The purpose of auto_ptr_ref is to implement the destructive copy semantics for ordinary auto_ptrs
while making it impossible to copy a const auto_ptr. The template constructor and template
assignment ensure that an auto_ptr<D> can be implicitly converted to an auto_ptr<B> if a D* can
be converted to a B*. For example:
    void g(Circle* pc)
    {
        auto_ptr<Circle> p2(pc);     // now p2 is responsible for deletion
        auto_ptr<Circle> p3 = p2;    // now p3 is responsible for deletion (and p2 isn’t)
        p2->m = 7;                   // programmer error: p2.get()==0
        Shape* ps = p3.get();        // extract the pointer from an auto_ptr
        auto_ptr<Shape> aps = p3;    // transfer of ownership and convert type
        auto_ptr<Circle> p4(pc);     // programmer error: now p4 is also responsible for deletion
    }
The effect of having more than one auto_ptr own an object is undefined; most likely the object will
be deleted twice (with bad effects).
Note that auto_ptr’s destructive copy semantics means that it does not meet the requirements
for elements of a standard container or for standard algorithms such as sort(). For example:
    void h(vector< auto_ptr<Shape> >& v)    // dangerous: use of auto_ptr in container
    {
        sort(v.begin(),v.end());    // Don’t do this: The sort will probably mess up v
    }
Clearly, auto_ptr isn’t a general smart pointer. However, it provides the service for which it was
designed – exception safety for automatic pointers – with essentially no overhead.
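Readers using a current compiler should be aware that auto_ptr was deprecated in C++11 and removed in C++17, so the examples above may not compile as written. The following sketch is therefore an assumption-laden translation, not part of the design described here: it uses std::unique_ptr, a later standard library facility, in which the transfer of ownership must be requested explicitly with std::move(). Shape, Circle, and Mess are placeholder classes, and the counter live exists only to make deletion observable.

```cpp
#include <memory>
#include <utility>

struct Shape {
    static int live;    // number of Shape objects currently alive
    Shape() { ++live; }
    virtual ~Shape() { --live; }
};
int Shape::live = 0;

struct Circle : Shape { };
struct Mess { };

void f(std::unique_ptr<Circle> pc, bool in_a_mess)
{
    std::unique_ptr<Shape> p(new Circle);           // deleted on every exit from f()
    std::unique_ptr<Shape> base = std::move(pc);    // transfer of ownership and convert type
    if (in_a_mess) throw Mess();
}

bool transfer_leaves_source_empty()
{
    std::unique_ptr<Circle> p2(new Circle);
    std::unique_ptr<Circle> p3 = std::move(p2);     // p3 is now responsible for deletion
    return p2.get() == 0 && p3.get() != 0;
}
```

Because the transfer is spelled out at the point of use, the accidental ‘‘copy that empties its source’’ cannot be written silently.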
14.4.3 Caveat [except.caveat]
Not all programs need to be resilient against all forms of failure, and not all resources are critical
enough to warrant the effort to protect them using ‘‘resource acquisition is initialization,’’
auto_ptr, and catch(...). For example, for many programs that simply read an input and run to
completion, the most suitable response to a serious run-time error is to abort the process (after producing a suitable diagnostic). That is, let the system release all acquired resources and let the user
re-run the program with a more suitable input. The strategy discussed here is intended for applications for which such a simplistic response to a run-time error is unacceptable. In particular, a
library designer usually cannot make assumptions about the fault tolerance requirements of a program using the library and is thus forced to avoid all unconditional run-time failures and to release
all resources before a library function returns to the calling program. The ‘‘resource acquisition is
initialization’’ strategy, together with the use of exceptions to signal failure, is suitable for many
such libraries.
14.4.4 Exceptions and New [except.new]
Consider:
    void f(Arena& a, X* buffer)
    {
        X* p1 = new X;
        X* p2 = new X[10];
        X* p3 = new(&buffer[10]) X;        // place X in buffer (no deallocation needed)
        X* p4 = new(&buffer[11]) X[10];
        X* p5 = new(a) X;                  // allocation from Arena a (deallocate from a)
        X* p6 = new(a) X[10];
    }
What happens if X’s constructor throws an exception? Is the memory allocated by the operator
new() freed? For the ordinary case, the answer is yes, so the initializations of p1 and p2 don’t
cause memory leaks.
When the placement syntax (§10.4.11) is used, the answer cannot be that simple. Some uses of
that syntax allocate memory, which then ought to be released; however, some don’t. Furthermore,
the point of using the placement syntax is to achieve nonstandard allocation, so nonstandard freeing
is typically required. Consequently, the action taken depends on the allocator used. If an allocator
Z::operator new() is used, Z::operator delete() is invoked if it exists; otherwise, no
deallocation is attempted. Arrays are handled equivalently (§15.6.1). This strategy correctly handles the standard library placement new operator (§10.4.11), as well as any case in which the programmer has provided a matching pair of allocation and deallocation functions.
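The matching-pair rule can be demonstrated with a sketch. The Arena here is a hypothetical placeholder that merely counts calls and forwards to the global allocation functions; a real arena would manage its own storage.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>

// Hypothetical arena that only counts what is requested of it.
struct Arena {
    int allocations;
    int deallocations;
    Arena() : allocations(0), deallocations(0) { }
};

class X {
public:
    explicit X(bool fail) { if (fail) throw std::runtime_error("X::X failed"); }

    // Placement allocation from an Arena.
    static void* operator new(std::size_t n, Arena& a) { ++a.allocations; return ::operator new(n); }
    // Matching placement deallocation: invoked implicitly only if the constructor throws.
    static void operator delete(void* p, Arena& a) { ++a.deallocations; ::operator delete(p); }
    // Ordinary deallocation, used by a plain delete-expression.
    static void operator delete(void* p) { ::operator delete(p); }
};

// Returns true if an X was constructed in the arena.
bool make_in(Arena& a, bool fail)
{
    try {
        X* p = new(a) X(fail);
        delete p;    // adequate here only because this sketch's arena forwards to the global heap
        return true;
    }
    catch (const std::runtime_error&) {
        return false;
    }
}
```

Had X not declared operator delete(void*, Arena&), the memory obtained for the failed construction would simply not have been released.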
14.4.5 Resource Exhaustion [except.exhaust]
A recurring programming problem is what to do when an attempt to acquire a resource fails. For
example, previously we blithely opened files (using fopen()) and requested memory from the free
store (using operator new) without worrying about what happened if the file wasn’t there or if we
had run out of free store. When confronted with such problems, programmers come up with two
styles of solutions:
Resumption: Ask some caller to fix the problem and carry on.
Termination: Abandon the computation and return to some caller.
In the former case, a caller must be prepared to help out with resource acquisition problems in
unknown pieces of code. In the latter, a caller must be prepared to cope with failure of the attempt
to acquire the resource. The latter is in most cases far simpler and allows a system to maintain a
better separation of levels of abstraction. Note that it is not the program that terminates when one
uses the termination strategy; only an individual computation terminates. ‘‘Termination’’ is the traditional term for a strategy that returns from a ‘‘failed’’ computation to an error handler associated
with a caller (which may re-try the failed computation), rather than trying to repair a bad situation
and resume from the point at which the problem was detected.
In C++, the resumption model is supported by the function-call mechanism and the termination
model is supported by the exception-handling mechanism. Both can be illustrated by a simple
implementation and use of the standard library operator new():
    void* operator new(size_t size)
    {
        for (;;) {
            if (void* p = malloc(size)) return p;        // try to find memory
            if (_new_handler == 0) throw bad_alloc();    // no handler: give up
            _new_handler();                              // ask for help
        }
    }
Here, I use the standard C library malloc() to do the real search for memory; other implementations of operator new() may choose other ways. If memory is found, operator new() can return
a pointer to it. Otherwise, operator new() calls the _new_handler. If the _new_handler can find
more memory for malloc() to allocate, all is fine. If it can’t, the handler cannot return to operator
new() without causing an infinite loop. The _new_handler() might then choose to throw an
exception, thus leaving the mess for some caller to handle:
    void my_new_handler()
    {
        int no_of_bytes_found = find_some_memory();
        if (no_of_bytes_found < min_allocation) throw bad_alloc();    // give up
    }
Somewhere, there ought to be a try-block with a suitable handler:
    try {
        // ...
    }
    catch (bad_alloc) {
        // somehow respond to memory exhaustion
    }
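The interplay of the two models can be tried out without touching the real free store. The following is only a sketch under stated assumptions: the pretend pool (available, reserve), the handler pointer, and the functions acquire() and release_reserve() are hypothetical names invented for illustration and are not part of the standard library.

```cpp
#include <cstddef>
#include <new>

// A pretend resource pool; all names here are hypothetical.
std::size_t available = 0;       // bytes currently free in the pool
std::size_t reserve = 64;        // emergency reserve the handler may release
void (*handler)() = nullptr;     // plays the role of _new_handler

// Same shape as the operator new() loop: try, ask for help, try again.
void acquire(std::size_t n)
{
    for (;;) {
        if (n <= available) { available -= n; return; }    // resource found
        if (handler == nullptr) throw std::bad_alloc();    // no handler: give up
        handler();                                         // ask for help (resumption)
    }
}

// A handler that helps once and afterwards gives up by throwing (termination).
void release_reserve()
{
    if (reserve == 0) throw std::bad_alloc();    // nothing left to offer
    available += reserve;
    reserve = 0;
}
```

With release_reserve installed as the handler, a first request that fits into the reserve succeeds after one retry; a later request that cannot be satisfied leaves acquire() through bad_alloc thrown by the handler.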
The _new_handler used in the implementation of operator new() is a pointer to a function maintained by the standard function set_new_handler(). If I want my_new_handler() to be used as
the _new_handler, I say:
    set_new_handler(&my_new_handler);
If I also want to catch bad_alloc, I might say:
    void f()
    {
        void(*oldnh)() = set_new_handler(&my_new_handler);
        try {
            // ...
        }
        catch (bad_alloc) {
            // ...
        }
        catch (...) {
            set_new_handler(oldnh);    // re-set handler
            throw;                     // re-throw
        }
        set_new_handler(oldnh);        // re-set handler
    }
Even better, avoid the catch(...) handler by applying the ‘‘resource acquisition is initialization’’ technique described in §14.4 to the _new_handler (§14.12[1]).
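One possible shape for such a guard is sketched here. The class name New_handler_guard and the checking function are hypothetical, and the sketch assumes a C++11 or later library, which provides std::get_new_handler() in <new>.

```cpp
#include <new>

// Hypothetical guard: installs a new-handler in its constructor and
// restores the previous one in its destructor.
class New_handler_guard {
    std::new_handler old;
public:
    explicit New_handler_guard(std::new_handler h) : old(std::set_new_handler(h)) { }
    ~New_handler_guard() { std::set_new_handler(old); }
    New_handler_guard(const New_handler_guard&) = delete;
    New_handler_guard& operator=(const New_handler_guard&) = delete;
};

void my_new_handler() { throw std::bad_alloc(); }    // give up at once

// Returns true if the handler that was in effect before the block is
// in effect again afterwards, although the block is left by an exception.
bool handler_restored_after_throw()
{
    std::new_handler before = std::get_new_handler();
    try {
        New_handler_guard guard(&my_new_handler);
        throw 1;    // any exception: the guard's destructor still runs
    }
    catch (int) {
        // nothing to re-set by hand here
    }
    return std::get_new_handler() == before;
}
```

Because the destructor runs during stack unwinding as well as on normal exit, neither the catch(...) handler nor the second call of set_new_handler() is needed in the function that uses the guard.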
With the _new_handler, no extra information is passed along from where the error is detected
to the helper function. It is easy to pass more information. However, the more information that is
passed between the code detecting a run-time error and a function helping correct that error, the
more the two pieces of code become dependent on each other. This implies that changes to the one
piece of code require understanding of and maybe even changes to the other. To keep separate
pieces of software separate, it is usually a good idea to minimize such dependencies. The
exception-handling mechanism supports such separation better than do function calls to helper routines provided by a caller.
In general, it is wise to organize resource allocation in layers (levels of abstraction) and avoid
having one layer depend on help from the layer that called it. Experience with larger systems
shows that successful systems evolve in this direction.
Throwing an exception requires an object to throw. A C++ implementation is required to have
enough spare memory to be able to throw bad_alloc in case of memory exhaustion. However, it is
possible that throwing some other exception will cause memory exhaustion.
14.4.6 Exceptions in Constructors [except.ctor]
Exceptions provide a solution to the problem of how to report errors from a constructor. Because a
constructor does not return a separate value for a caller to test, the traditional (that is, non-exception-handling) alternatives are:
[1] Return an object in a bad state, and trust the user to test the state.
[2] Set a nonlocal variable (e.g., errno) to indicate that the creation failed, and trust the user to
test that variable.
[3] Don’t do any initialization in the constructor, and rely on the user to call an initialization
function before the first use.
[4] Mark the object ‘‘uninitialized’’ and have the first member function called for the object do
the real initialization, and that function can then report an error if initialization fails.
Exception handling allows the information that a construction failed to be transmitted out of the
constructor. For example, a simple Vector class might protect itself from excessive demands on
memory like this:
    class Vector {
    public:
        class Size { };
        enum { max = 32000 };
        Vector(int sz)
        {
            if (sz<0 || max<sz) throw Size();
            // ...
        }
        // ...
    };
Code creating Vectors can now catch Vector::Size errors, and we can try to do something sensible
with them:
    Vector* f(int i)
    {
        try {
            Vector* p = new Vector(i);
            // ...
            return p;
        }
        catch(Vector::Size) {
            // deal with size error
        }
    }
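A complete variant of this idea might look as follows. It is a minimal sketch: the element storage, the size() member, and the helper make_vector(), which chooses to respond to a size error by returning a null pointer, are assumptions added for illustration.

```cpp
// A cut-down Vector; only the size check comes from the text above.
class Vector {
public:
    class Size { };
    enum { max = 32000 };

    explicit Vector(int sz) : n(0), p(nullptr)
    {
        if (sz < 0 || max < sz) throw Size();    // refuse excessive demands
        p = new int[sz]();                       // may itself throw std::bad_alloc
        n = sz;
    }
    ~Vector() { delete[] p; }
    Vector(const Vector&) = delete;
    Vector& operator=(const Vector&) = delete;

    int size() const { return n; }
private:
    int n;
    int* p;
};

// Hypothetical helper: one way of "dealing with" a size error.
Vector* make_vector(int i)
{
    try {
        return new Vector(i);
    }
    catch (Vector::Size) {
        return nullptr;    // the memory obtained by new has already been released
    }
}
```

When the constructor throws, no Vector object comes into existence, so there is nothing for the caller to test or destroy.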
As always, the error handler itself can use the standard set of fundamental techniques for error
reporting and recovery. Each time an exception is passed along to a caller, the view of what went
wrong changes. If suitable information is passed along in the exception, the amount of information
available to deal with the problem could increase. In other words, the fundamental aim of the
error-handling techniques is to pass information about an error from the original point of detection
to a point where there is sufficient information available to recover from the problem, and to do so
reliably and conveniently.
The ‘‘resource acquisition is initialization’’ technique is the safest and most elegant way of handling constructors that acquire more than one resource (§14.4). In essence, the technique reduces
the problem of handling many resources to repeated application of the (simple) technique for handling one resource.
14.4.6.1 Exceptions and Member Initialization [except.member]
What happens if a member initializer (directly or indirectly) throws an exception? By default, the
exception is passed on to whatever invoked the constructor for the member’s class. However, the
constructor itself can catch such exceptions by enclosing the complete function body – including
the member initializer list – in a try-block. For example:
    class X {
        Vector v;
        // ...
    public:
        X(int);
        // ...
    };
    X::X(int s)
    try
        :v(s)    // initialize v by s
    {
        // ...
    }
    catch (Vector::Size) {    // exceptions thrown for v are caught here
        // ...
    }
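A small experiment makes the behavior of such a handler visible. In this sketch the counter caught_in_ctor and the function construct() are hypothetical, and Vector is reduced to its size check. Note that when control reaches the end of a handler of a constructor's function try-block, the exception is re-thrown, so the code creating the X still sees it.

```cpp
class Vector {
public:
    class Size { };
    enum { max = 32000 };
    explicit Vector(int sz) { if (sz < 0 || max < sz) throw Size(); }
};

int caught_in_ctor = 0;    // hypothetical counter: how often X's handler ran

class X {
    Vector v;
public:
    X(int);
};

X::X(int s)
try
    : v(s)    // initialize v by s
{
}
catch (Vector::Size) {
    ++caught_in_ctor;    // log, translate, or clean up here
}                        // the exception is re-thrown on leaving the handler

// Returns true if an X could be constructed from s.
bool construct(int s)
{
    try {
        X x(s);
        (void)x;
        return true;
    }
    catch (Vector::Size) {
        return false;
    }
}
```

The handler in the constructor is therefore a place to record or translate the failure, not a way of completing the construction of X.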
Copy constructors (§10.4.4.1) are special in that they are invoked implicitly and because they often
both acquire and release resources. In particular, the standard library assumes proper – non-exception-throwing – behavior of copy constructors. For these reasons, care should be taken that a
copy constructor throws an exception only in truly disastrous circumstances. Complete recovery
from an exception in a copy constructor is unlikely to be feasible in every context of its use. To be
even potentially safe, a copy constructor must leave behind two objects, each of which fulfills the
invariant of its class (§24.3.7.1).
Naturally, copy assignment operators should be treated with as much care as copy constructors.
14.4.7 Exceptions in Destructors [except.dtor]
From the point of view of exception handling, a destructor can be called in one of two ways:
[1] Normal call: As the result of a normal exit from a scope (§10.4.3), a delete (§10.4.5), etc.
[2] Call during exception handling: During stack unwinding (§14.4), the exception-handling
mechanism exits a scope containing an object with a destructor.
In the latter case, an exception may not escape from the destructor itself. If it does, it is considered
a failure of the exception-handling mechanism and std::terminate() (§14.7) is called. After all,
there is no general way for the exception-handling mechanism or the destructor to determine
whether it is acceptable to ignore one of the exceptions in favor of handling the other.
If a destructor calls functions that may throw exceptions, it can protect itself. For example:
    X::~X()
    {
        try {
            f();    // might throw
        }
        catch (...) {
            // do something; the exception does not leave the destructor
        }
    }
The standard library function uncaught_exception() returns true if an exception has been thrown
but hasn’t yet been caught. This allows the programmer to specify different actions in a destructor
depending on whether an object is destroyed normally or as part of stack unwinding.
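The following sketch shows a destructor protecting itself while another exception is already in flight. The class Guard, the function risky_cleanup(), and the counter cleanup_failures are hypothetical names chosen for the illustration.

```cpp
#include <stdexcept>

int cleanup_failures = 0;    // hypothetical counter of suppressed errors

void risky_cleanup() { throw std::runtime_error("cleanup failed"); }

class Guard {
public:
    ~Guard()
    {
        try {
            risky_cleanup();       // might throw
        }
        catch (...) {
            ++cleanup_failures;    // record the failure; nothing escapes
        }
    }
};

// Returns true if the exception thrown in the block reaches its handler
// although the destructor of g ran into an error during unwinding.
bool original_exception_survives()
{
    try {
        Guard g;
        throw std::logic_error("original");
    }
    catch (const std::logic_error&) {
        return true;
    }
    catch (...) {
        return false;
    }
}
```

Had the exception from risky_cleanup() been allowed to leave ~Guard() during unwinding, terminate() would have been called instead.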
14.5 Exceptions That Are Not Errors [except.not.error]
If an exception is expected and caught so that it has no bad effects on the behavior of the program,
then how can it be an error? Only because the programmer thinks of it as an error and of the
exception-handling mechanisms as tools for handling errors. Alternatively, one might think of the
exception-handling mechanisms as simply another control structure. For example:
    void f(Queue<X>& q)
    {
        try {
            for (;;) {
                X m = q.get();    // throws 'Empty' if queue is empty
                // ...
            }
        }
        catch (Queue<X>::Empty) {
            return;
        }
    }
This actually has some charm, so it is a case in which it is not entirely clear what should be considered an error and what should not.
Exception handling is a less structured mechanism than local control structures such as if and
for and is often less efficient when an exception is actually thrown. Therefore, exceptions should
be used only where the more traditional control structures are inelegant or impossible to use. Note
that the standard library offers a queue of arbitrary elements without using exceptions (§17.3.2).
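The two styles can be put side by side. This is only a sketch: the Queue template below, with its put() and get() members, is a hypothetical stand-in built on std::deque, and the two summing functions exist purely for comparison.

```cpp
#include <deque>
#include <queue>

// Hypothetical queue that reports exhaustion by throwing Empty.
template<class T> class Queue {
    std::deque<T> d;
public:
    class Empty { };
    void put(const T& x) { d.push_back(x); }
    T get()
    {
        if (d.empty()) throw Empty();
        T x = d.front();
        d.pop_front();
        return x;
    }
};

// Exception as control structure: the loop ends when get() throws.
int sum_with_exception(Queue<int>& q)
{
    int sum = 0;
    try {
        for (;;) sum += q.get();
    }
    catch (Queue<int>::Empty) {
        // queue drained
    }
    return sum;
}

// Ordinary control structure: test before each access.
int sum_with_test(std::queue<int>& q)
{
    int sum = 0;
    while (!q.empty()) {
        sum += q.front();
        q.pop();
    }
    return sum;
}
```

Both functions drain their queue and compute the same sum; the second states the termination condition where the reader expects it.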
Using exceptions as alternate returns can be an elegant technique for terminating search functions – especially highly recursive search functions such as a lookup in a tree. For example:
    void fnd(Tree* p, const string& s)
    {
        if (s == p->str) throw p;    // found s
        if (p->left) fnd(p->left,s);
        if (p->right) fnd(p->right,s);
    }
    Tree* find(Tree* p, const string& s)
    {
        try {
            fnd(p,s);
        }
        catch (Tree* q) {    // q->str==s
            return q;
        }
        return 0;
    }
However, such use of exceptions can easily be overused and lead to obscure code. Whenever reasonable, one should stick to the ‘‘exception handling is error handling’’ view. When this is done,
code is clearly separated into two categories: ordinary code and error-handling code. This makes
code more comprehensible. Unfortunately, the real world isn’t so clear cut. Program organization
will (and to some extent should) reflect that.
Error handling is inherently difficult. Anything that helps preserve a clear model of what is an
error and how it is handled should be treasured.
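For experimenting with the tree lookup from this section, a self-contained version might look like this. The layout of Tree (a string and two child pointers) is an assumption, since the text does not show it, and the check for an empty tree in find() is an addition.

```cpp
#include <string>

// Assumed node layout; the text uses only str, left, and right.
struct Tree {
    std::string str;
    Tree* left;
    Tree* right;
};

// Recursive helper: leaves the whole recursion at once by throwing the node.
void fnd(Tree* p, const std::string& s)
{
    if (s == p->str) throw p;    // found s
    if (p->left) fnd(p->left, s);
    if (p->right) fnd(p->right, s);
}

// Returns the node holding s, or a null pointer if there is none.
Tree* find(Tree* p, const std::string& s)
{
    if (p == nullptr) return nullptr;    // added: empty tree
    try {
        fnd(p, s);
    }
    catch (Tree* q) {    // q->str==s
        return q;
    }
    return nullptr;
}
```

The throw unwinds every pending recursive call in one step, which is the whole attraction of the technique; the price is that a reader must look at find() to learn that fnd() is not reporting an error.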
14.6 Exception Specifications [except.spec]
Throwing or catching an exception affects the way a function relates to other functions. It can
therefore be worthwhile to specify the set of exceptions that might be thrown as part of the function
declaration. For example:
    void f(int a) throw (x2, x3);
This specifies that f() may throw only exceptions x2, x3, and exceptions derived from these types,
but no others. When a function specifies what exceptions it might throw, it effectively offers a
guarantee to its callers. If during execution that function does something that tries to abrogate the
guarantee, the attempt will be transformed into a call of std::unexpected(). The default meaning
of unexpected() is std::terminate(), which in turn normally calls abort(); see §9.4.1.1 for
details.
In effect,
    void f() throw (x2, x3)
    {
        // stuff
    }
is equivalent to:
    void f()
    try
    {
        // stuff
    }
    catch (x2) { throw; }    // re-throw
    catch (x3) { throw; }    // re-throw
    catch (...) {
        std::unexpected();    // unexpected() will not return
    }
The most important advantage is that the function declaration belongs to an interface that is visible
to its callers. Function definitions, on the other hand, are not universally available. Even when we
do have access to the source code of all our libraries, we strongly prefer not to have to look at it
very often. In addition, a function with an exception-specification is shorter and clearer than the
equivalent hand-written version.
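The hand-written form can be exercised directly. The sketch below makes several assumptions: x2 and x3 are empty classes, my_unexpected() is a hypothetical stand-in for std::unexpected() that throws a marker type Unexpected instead of terminating, and no throw(...) list is used, so that the code is also accepted by compilers that no longer support such lists.

```cpp
struct x2 { };
struct x3 { };
struct Other { };         // an exception type f() does not advertise
struct Unexpected { };    // hypothetical marker thrown by the stand-in below

// Stand-in for std::unexpected(): must not return.
[[noreturn]] void my_unexpected() { throw Unexpected(); }

// Hand-written equivalent of: void f(int a) throw (x2, x3)
void f(int a)
try
{
    if (a == 2) throw x2();
    if (a == 3) throw x3();
    if (a == 4) throw Other();
}
catch (x2) { throw; }    // re-throw
catch (x3) { throw; }    // re-throw
catch (...) {
    my_unexpected();     // will not return
}

// Reports which kind of exit f(a) took.
int classify(int a)
{
    try {
        f(a);
        return 0;             // normal return
    }
    catch (x2) { return 2; }
    catch (x3) { return 3; }
    catch (Unexpected) { return -1; }
}
```

Callers of f() see only x2, x3, or whatever the stand-in produces; the unadvertised Other never reaches them.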
A function declared without an exception-specification is assumed to throw every exception.
For example:
    int f();    // can throw any exception
A function that will throw no exceptions can be declared with an empty list:
    int g() throw ();    // no exception thrown
One might think that the default should be that a function throws no exceptions. However, that
would require exception specifications for essentially every function, would be a significant cause
for recompilation, and would inhibit cooperation with software written in other languages. This
would encourage programmers to subvert the exception-handling mechanisms and to write spurious
code to suppress exceptions. It would provide a false sense of security to people who failed to
notice the subversion.
14.6.1 Checking Exception Specifications [except.check.spec]
It is not possible to catch every violation of an interface specification at compile time. However,
much compile-time checking is done. The way to think about exception-specifications is to assume
that a function will throw any exception it can. The rules for compile-time checking exception-specifications outlaw easily detected absurdities.
If any declaration of a function has an exception-specification, every declaration of that function
(including the definition) must have an exception-specification with exactly the same set of exception types. For example:
    int f() throw (std::bad_alloc);
    int f()    // error: exception-specification missing
    {
        // ...
    }
Importantly, exception-specifications are not required to be checked exactly across compilation-unit
boundaries. Naturally, an implementation can check. However, for many large and long-lived systems, it is important that the implementation does not – or, if it does, that it carefully gives hard
errors only where violations will not be caught at run time.
The point is to ensure that adding an exception somewhere doesn’t force a complete update of
related exception specifications and a recompilation of all potentially affected code. A system can
then function in a partially updated state relying on the dynamic (run-time) detection of unexpected
exceptions. This is essential for the maintenance of large systems in which major updates are
expensive and not all source code is accessible.
A virtual function may be overridden only by a function that has an exception-specification at
least as restrictive as its own (explicit or implicit) exception-specification. For example:
    class B {
    public:
        virtual void f();               // can throw anything
        virtual void g() throw(X,Y);
        virtual void h() throw(X);
    };
    class D : public B {
    public:
        void f() throw(X);      // ok
        void g() throw(X);      // ok: D::g() is more restrictive than B::g()
        void h() throw(X,Y);    // error: D::h() is less restrictive than B::h()
    };
This rule is really only common sense. If a derived class threw an exception that the original function didn’t advertise, a caller couldn’t be expected to catch it. On the other hand, an overriding
function that throws fewer exceptions clearly obeys the rule set out by the overridden function’s
exception-specification.
Similarly, you can assign a pointer to function that has a more restrictive exception-specification to a pointer to function that has a less restrictive exception-specification, but not vice
versa. For example:
    void f() throw(X);
    void (*pf1)() throw(X,Y) = &f;    // ok
    void (*pf2)() throw() = &f;       // error: f() is less restrictive than pf2
In particular, you cannot assign a pointer to a function without an exception-specification to a
pointer to function that has one:
    void g();    // might throw anything
    void (*pf3)() throw(X) = &g;    // error: g() less restrictive than pf3
An exception-specification is not part of the type of a function and a typedef may not contain one.
For example:
    typedef void (*PF)() throw(X);    // error
14.6.2 Unexpected Exceptions [except.unexpected]
An exception-specification can lead to calls to unexpected(). Such calls are typically undesirable
except during testing. Such calls can be avoided through careful organization of exceptions and
specification of interfaces. Alternatively, calls to unexpected() can be intercepted and rendered
harmless.
A well-defined subsystem Y will often have all its exceptions derived from a class Yerr. For
example, given
    class Some_Yerr : public Yerr { /* ... */ };
a function declared
    void f() throw (Xerr, Yerr, exception);
will pass any Yerr on to its caller. In particular, f() would handle a Some_Yerr by passing it on to
its caller. Thus, no Yerr in f() will trigger unexpected().
All exceptions thrown by the standard library are derived from class exception (§14.10).
14.6.3 Mapping Exceptions [except.mapping]
Occasionally, the policy of terminating a program upon encountering an unexpected exception is
too Draconian. In such cases, the behavior of unexpected() must be modified into something
acceptable.
The simplest way of achieving that is to add the standard library exception std::bad_exception
to an exception-specification. In that case, unexpected() will simply throw bad_exception instead
of invoking a function to try to cope. For example:
    class X { };
    class Y { };
    void f() throw(X,std::bad_exception)
    {
        // ...
        throw Y();    // throw ‘‘bad’’ exception
    }
The exception-specification will catch the unacceptable exception Y and throw an exception of type
bad_exception instead.
There is actually nothing particularly bad about bad_exception; it simply provides a mechanism that is less drastic than calling terminate(). However, it is still rather crude. In particular,
information about which exception caused the problem is lost.
14.6.3.1 User Mapping of Exceptions [except.user.mapping]
Consider a function g() written for a non-networked environment. Assume further that g() has
been declared with an exception-specification so that it will throw only exceptions related to its
‘‘subsystem Y:’’
    void g() throw(Yerr);
Now assume that we need to call g() in a networked environment.
Naturally, g() will not know about network exceptions and will invoke unexpected() when it
encounters one. To use g() in a distributed environment, we must either provide code that handles
network exceptions or rewrite g(). Assuming a rewrite is infeasible or undesirable, we can handle
the problem by redefining the meaning of unexpected().
Memory exhaustion is dealt with by the _new_handler determined by set_new_handler().
Similarly, the response to an unexpected exception is determined by an _unexpected_handler set
by std::set_unexpected() from <exception>:
    typedef void(*unexpected_handler)();
    unexpected_handler set_unexpected(unexpected_handler);
To handle unexpected exceptions well, we first define a class to allow us to use the ‘‘resource
acquisition is initialization’’ technique for unexpected() functions:
    class STC {    // store and reset class
        unexpected_handler old;
    public:
        STC(unexpected_handler f) { old = set_unexpected(f); }
        ~STC() { set_unexpected(old); }
    };
Then, we define a function with the meaning we want for unexpected() in this case:
    class Yunexpected : public Yerr { };
    void throwY() throw(Yunexpected) { throw Yunexpected(); }
Used as an unexpected() function, throwY() maps any unexpected exception into Yunexpected.
Finally, we provide a version of g() to be used in the networked environment:
    void networked_g() throw(Yerr)
    {
        STC xx(&throwY);    // now unexpected() throws Yunexpected
        g();
    }
Because Yunexpected is derived from Yerr, the exception-specification is not violated. Had
throwY() thrown an exception that did violate the exception-specification, terminate() would
have been called.
By saving and restoring the _unexpected_handler, we make it possible for several subsystems
to control the handling of unexpected exceptions without interfering with each other. Basically,
this technique for mapping an unexpected exception into an expected one is a more flexible variant
of what the system offers in the form of bad_exception.
14.6.3.2 Recovering the Type of an Exception [except.recover]
Mapping unexpected exceptions to Yunexpected would allow a user of networked_g() to know
that an unexpected exception had been mapped into Yunexpected. However, such a user wouldn’t
know which exception had been mapped. That information was lost in throwY(). A simple technique allows that information to be recorded and passed on:
    class Yunexpected : public Yerr {
    public:
        Network_exception* pe;
        Yunexpected(Network_exception* p) :pe(p) { }
    };
    void throwY() throw(Yunexpected)
    {
        try {
            throw;    // re-throw to be caught immediately!
        }
        catch(Network_exception& p) {
            throw Yunexpected(&p);    // throw mapped exception
        }
        catch(...) {
            throw Yunexpected(0);
        }
    }
Re-throwing an exception and catching it allows us to get a handle on any exception of a type we
can name. The throwY() function is called from unexpected(), which is conceptually called
from a catch(...) handler. There is therefore definitely an exception to re-throw. It is not possible for an unexpected() function to ignore the exception and return. If it tries to, unexpected()
itself will throw a bad_exception (§14.6.3).
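The re-throw-and-catch idiom is useful on its own, outside of unexpected(). In the sketch below, Network_exception is given a hypothetical member code, and the functions describe_current_exception() and run() are invented for illustration; std::to_string assumes a C++11 or later library.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

struct Network_exception { int code; };    // hypothetical layout

// Must be called only while an exception is being handled,
// for example from within a catch(...) handler.
std::string describe_current_exception()
{
    try {
        throw;    // re-throw to be caught immediately
    }
    catch (const Network_exception& e) {
        return "network error " + std::to_string(e.code);
    }
    catch (const std::exception& e) {
        return std::string("standard: ") + e.what();
    }
    catch (...) {
        return "unknown";
    }
}

// Throws a different kind of exception depending on 'which'.
std::string run(int which)
{
    try {
        if (which == 1) throw Network_exception{42};
        if (which == 2) throw std::runtime_error("boom");
        if (which == 3) throw 7;
        return "ok";
    }
    catch (...) {
        return describe_current_exception();    // recover the type here
    }
}
```

The catch(...) handler knows nothing about the exception it caught, yet the helper it calls can still distinguish every type it is able to name.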
14.7 Uncaught Exceptions [except.uncaught]
If an exception is thrown but not caught, the function std::terminate() will be called. The terminate() function will also be called when the exception-handling mechanism finds the stack corrupted and when a destructor called during stack unwinding caused by an exception tries to exit
using an exception.
An unexpected exception is dealt with by the _unexpected_handler determined by
set_unexpected(). Similarly, the response to an uncaught exception is determined by an
_uncaught_handler set by std::set_terminate() from <exception>:
    typedef void(*terminate_handler)();
    terminate_handler set_terminate(terminate_handler);
The return value is the previous function given to set_terminate().
The reason for terminate() is that exception handling must occasionally be abandoned for less
subtle error-handling techniques. For example, terminate() could be used to abort a process or
maybe to re-initialize a system. The intent is for terminate() to be a drastic measure to be applied
when the error-recovery strategy implemented by the exception-handling mechanism has failed and
it is time to go to another level of a fault tolerance strategy.
By default, terminate() will call abort() (§9.4.1.1). This default is the correct choice for
most users – especially during debugging.
An _uncaught_handler is assumed not to return to its caller. If it tries to, terminate() will
call abort().
Note that abort() indicates abnormal exit from the program. The function exit() can be used
to exit a program with a return value that indicates to the surrounding system whether the exit is
normal or abnormal (§9.4.1.1).
It is implementation-defined whether destructors are invoked when a program is terminated
because of an uncaught exception. On some systems, it is essential that the destructors are not
called so that the program can be resumed from the debugger. On other systems, it is architecturally close to impossible not to invoke the destructors while searching for a handler.
If you want to ensure cleanup when an uncaught exception happens, you can add a catch-all
handler (§14.3.2) to main() in addition to handlers for exceptions you really care about. For
example:
    int main()
    try {
        // ...
    }
    catch (std::range_error)
    {
        cerr << "range error: Not again!\n";
    }
    catch (std::bad_alloc)
    {
        cerr << "new ran out of memory\n";
    }
    catch (...) {
        // ...
    }
This will catch every exception, except those thrown by construction and destruction of global variables. There is no way of catching exceptions thrown during initialization of global variables. The
only way of gaining control in case of throw from an initializer of a nonlocal static object is
set_unexpected() (§14.6.2). This is another reason to avoid global variables whenever possible.
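The same set of handlers can be packaged in a function so that it is easy to test and to reuse. This is a sketch only: run_guarded(), the exit-status values, and the sample bodies are hypothetical choices made for the illustration.

```cpp
#include <iostream>
#include <new>
#include <stdexcept>

// Runs body() and maps any exception that escapes it to a nonzero status.
// The particular status values are arbitrary.
int run_guarded(void (*body)())
{
    try {
        body();
        return 0;
    }
    catch (const std::range_error&) {
        std::cerr << "range error: Not again!\n";
        return 1;
    }
    catch (const std::bad_alloc&) {
        std::cerr << "new ran out of memory\n";
        return 2;
    }
    catch (...) {
        std::cerr << "unknown exception\n";
        return 3;
    }
}

// Sample bodies used to exercise each handler.
void ok_body() { }
void range_body() { throw std::range_error("out of range"); }
void alloc_body() { throw std::bad_alloc(); }
void other_body() { throw 42; }
```

A main() that consists of a single call such as return run_guarded(&real_main); (with real_main a hypothetical function holding the program proper) ensures that the stack is unwound, and destructors of local objects are run, for every exception thrown after main() has been entered.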
When an exception is caught, the exact point where it was thrown is generally not known. This
represents a loss of information compared to what a debugger might know about the state of a program. In some C++ development environments, for some programs, and for some people, it might
therefore be preferable not to catch exceptions from which the program isn’t designed to recover.
14.8 Exceptions and Efficiency [except.efficiency]
In principle, exception handling can be implemented so that there is no run-time overhead when no
exception is thrown. In addition, this can be done so that throwing an exception isn’t all that
expensive compared to calling a function. Doing so without adding significant memory overhead
while maintaining compatibility with C calling sequences, debugger conventions, etc., is possible,
but hard. However, please remember that the alternatives to exceptions are not free either. It is not
unusual to find traditional systems in which half of the code is devoted to error handling.
Consider a simple function f() that appears to have nothing to do with exception handling:
    void g(int);
    void f()
    {
        string s;
        // ...
        g(1);
        g(2);
    }
However, g() may throw an exception, so f() must contain code ensuring that s is destroyed correctly in case of an exception. However, had g() not thrown an exception it would have had to
report its error some other way. Consequently, the comparable code using ordinary code to handle
errors instead of exceptions isn’t the plain code above, but something like:
    bool g(int);
    bool f()
    {
        string s;
        // ...
        if (g(1))
            if (g(2))
                return true;
            else
                return false;
        else
            return false;
    }
People don’t usually handle errors this systematically, though, and it is not always critical to do so.
However, when careful and systematic handling of errors is necessary, such housekeeping is best
left to a computer, that is, to the exception-handling mechanisms.
Exception-specifications (§14.6) can be most helpful in improving generated code. Had we
stated that g() didn’t throw an exception:
    void g(int) throw();
the code generation for f() could have been improved. It is worth observing that no traditional C
function throws an exception, so in most programs every C function can be declared with the empty
throw specification throw(). In particular, an implementation knows that only a few standard C
library functions (such as atexit() and qsort()) can throw exceptions, and it can take advantage of
that fact to generate better code.
Before giving a ‘‘C function’’ an empty exception-specification, throw(), take a minute to
consider if it could possibly throw an exception. For example, it might have been converted to use
the C++ operator new, which can throw bad_alloc, or it might call a C++ library that throws an
exception.
14.9 Error-Handling Alternatives [except.alternatives]
The purpose of the exception-handling mechanisms is to provide a means for one part of a program
to inform another part of a program that an ‘‘exceptional circumstance’’ has been detected. The
assumption is that the two parts of the program are written independently and that the part of the
program that handles the exception often can do something sensible about the error.
To use handlers effectively in a program, we need an overall strategy. That is, the various parts
of the program must agree on how exceptions are used and where errors are dealt with. The
exception-handling mechanisms are inherently nonlocal, so adherence to an overall strategy is
essential. This implies that the error-handling strategy is best considered in the earliest phases of a
design. It also implies that the strategy must be simple (relative to the complexity of the total program) and explicit. Something complicated would not be consistently adhered to in an area as
inherently tricky as error recovery.
First of all, the idea that a single mechanism or technique can handle all errors must be dispelled; it would lead to complexity. Successful fault-tolerant systems are multilevel. Each level
copes with as many errors as it can without getting too contorted and leaves the rest to higher levels. The notion of terminate() is intended to support this view by providing an escape if the
exception-handling mechanism itself is corrupted or if it has been incompletely used, thus leaving
exceptions uncaught. Similarly, the notion of unexpected() is intended to provide an escape when
the strategy using exception-specifications to provide firewalls fails.
Not every function should be a firewall. In most systems, it is not feasible to write every function to do sufficient checking to ensure that it either completes successfully or fails in a well-defined manner. The reasons that this will not work vary from program to program and from programmer to programmer. However, for larger programs:
[1] The amount of work needed to ensure this notion of ‘‘reliability’’ is too great to be done
consistently.
[2] The overheads in time and space are too great for the system to run acceptably (there will be
a tendency to check for the same errors, such as invalid arguments, over and over again).
[3] Functions written in other languages won’t obey the rules.
[4] This purely local notion of ‘‘reliability’’ leads to complexities that actually become a burden
to overall system reliability.
However, separating the program into distinct subsystems that either complete successfully or fail
in well-defined ways is essential, feasible, and economical. Thus, a major library, subsystem, or
key function should be designed in this way. Exception specifications are intended for interfaces to
such libraries and subsystems.
Usually, we don’t have the luxury of designing all of the code of a system from scratch. Therefore, to impose a general error-handling strategy on all parts of a program, we must take into
account program fragments implemented using strategies different from ours. To do this we must
address a variety of concerns relating to the way a program fragment manages resources and the
state in which it leaves the system after an error. The aim is to have the program fragment appear
to follow the general error-handling strategy even if it internally follows a different strategy.
Occasionally, it is necessary to convert from one style of error reporting to another. For example, we might check errno and possibly throw an exception after a call to a C library or, conversely,
catch an exception and set errno before returning to a C program from a C++ library:
void callC() throw(C_blewit)
{
    errno = 0;
    c_function();
    if (errno) {
        // cleanup, if possible and necessary
        throw C_blewit(errno);
    }
}
extern "C" void call_from_C() throw()
{
    try {
        c_plus_plus_function();
    }
    catch (...) {
        // cleanup, if possible and necessary
        errno = E_CPLPLFCTBLEWIT;
    }
}
In such cases, it is important to be systematic enough to ensure that the conversion of error reporting styles is complete.
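The two conversions can be tried out with a real C library function. The sketch below uses strtol(), which reports overflow by setting errno to ERANGE. The names C_error, parse_long(), and parse_long_for_c() are hypothetical, as is the choice of EINVAL for ‘‘anything else,’’ and noexcept stands in for throw() on the assumption of a compiler for a later standard.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <stdexcept>

// Hypothetical exception type that carries the errno value.
struct C_error : std::runtime_error {
    int code;
    explicit C_error(int e) : std::runtime_error("C library call failed"), code(e) {}
};

// C style to C++ style: call the C function, check errno, throw on failure.
long parse_long(const char* s)
{
    errno = 0;
    char* end = nullptr;
    long v = std::strtol(s, &end, 10);
    if (errno) throw C_error(errno);        // for example, ERANGE on overflow
    if (end == s) throw C_error(EINVAL);    // no digits were consumed
    return v;
}

// C++ style to C style: let no exception escape; report through errno
// and a return code that a C caller can test.
extern "C" int parse_long_for_c(const char* s, long* out) noexcept
{
    try {
        *out = parse_long(s);
        return 0;
    }
    catch (const C_error& e) {
        errno = e.code;
    }
    catch (...) {
        errno = EINVAL;   // assumption: one generic code for anything unexpected
    }
    return -1;
}
```

Note that the C-callable function has a handler for every exception, so the conversion is complete in the sense described above.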
Error handling should be – as far as possible – hierarchical. If a function detects a run-time
error, it should not ask its caller for help with recovery or resource acquisition. Such requests set
up cycles in the system dependencies. That in turn makes the program hard to understand and
introduces the possibility of infinite loops in the error-handling and recovery code.
Simplifying techniques such as ‘‘resource acquisition is initialization’’ and simplifying assumptions such as ‘‘exceptions represent errors’’ should be used to make the error-handling code more
regular. See also §24.3.7.1 for ideas about how to use invariants and assertions to make the triggering of exceptions more regular.
14.10 Standard Exceptions [except.std]
Here is a table of standard exceptions and the functions, operators, and general facilities that throw
them:
_________________________________________________________________
Standard Exceptions (thrown by the language)
_________________________________________________________________
Name            Thrown by                 Reference            Header
_________________________________________________________________
bad_alloc       new                       §6.2.6.2, §19.4.5    <new>
bad_cast        dynamic_cast              §15.4.1.1            <typeinfo>
bad_typeid      typeid                    §15.4.4              <typeinfo>
bad_exception   exception specification   §14.6.3              <exception>
_________________________________________________________________
_________________________________________________________________________
Standard Exceptions (thrown by the standard library)
_________________________________________________________________________
Name                Thrown by                Reference                   Header
_________________________________________________________________________
out_of_range        at()                     §3.7.2, §16.3.3, §20.3.3    <stdexcept>
                    bitset<>::operator[]()   §17.5.3                     <stdexcept>
invalid_argument    bitset constructor       §17.5.3.1                   <stdexcept>
overflow_error      bitset<>::to_ulong()     §17.5.3.3                   <stdexcept>
ios_base::failure   ios_base::clear()        §21.3.6                     <ios>
_________________________________________________________________________
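Three of the rows in the library table can be provoked directly. The following sketch uses function names invented for illustration; each function triggers one exception and returns the name of the class it caught. The last one assumes that unsigned long is narrower than 128 bits.

```cpp
#include <bitset>
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// vector::at() checks its index and throws out_of_range.
std::string from_at()
{
    std::vector<int> v(3);
    try {
        v.at(10) = 1;                            // index 10 is outside [0,3)
    }
    catch (const std::out_of_range&) {
        return "out_of_range";
    }
    return "none";
}

// The bitset constructor taking a string rejects characters other than '0' and '1'.
std::string from_bitset_constructor()
{
    try {
        std::bitset<8> b(std::string("01x1"));
        (void)b;
    }
    catch (const std::invalid_argument&) {
        return "invalid_argument";
    }
    return "none";
}

// to_ulong() throws if the value cannot be represented as an unsigned long.
std::string from_to_ulong()
{
    std::bitset<128> b;
    b.set();                                     // all 128 bits set
    try {
        unsigned long u = b.to_ulong();
        (void)u;
    }
    catch (const std::overflow_error&) {
        return "overflow_error";
    }
    return "none";
}
```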
The library exceptions are part of a class hierarchy rooted in the standard library exception class
exception presented in <exception>:
class exception {
public:
    exception() throw();
    exception(const exception&) throw();
    exception& operator=(const exception&) throw();
    virtual ~exception() throw();
    virtual const char* what() const throw();
private:
    // ...
};
The hierarchy looks like this:
exception
    logic_error
        length_error
        domain_error
        out_of_range
        invalid_argument
    runtime_error
        range_error
        overflow_error
        underflow_error
    bad_alloc
    bad_cast
    bad_exception
    bad_typeid
    ios_base::failure
This seems rather elaborate for organizing the eight standard exceptions. This hierarchy attempts to
provide a framework for exceptions beyond the ones defined by the standard library. Logic errors
are errors that in principle could be caught either before the program starts executing or by tests of
arguments to functions and constructors. Run-time errors are all other errors. Some people view
this as a useful framework for all errors and exceptions; I don’t.
The standard library exception classes don’t add functions to the set provided by exception;
they simply define the required virtual functions appropriately. Thus, we can write:
void f()
try {
    // use standard library
}
catch (exception& e) {
    cout << "standard library exception " << e.what() << '\n';   // well, maybe
    // ...
}
catch (...) {
    cout << "other exception\n";
    // ...
}
The standard exceptions are derived from exception. However, not every exception is, so it would
be a mistake to try to catch every exception by catching exception. Similarly, it would be a mistake to assume that every exception derived from exception is a standard library exception: programmers can add their own exceptions to the exception hierarchy.
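Both mistakes are easy to demonstrate. In this sketch, classify() and Sensor_error are hypothetical names; classify() runs a function and reports which handler was entered. A thrown int bypasses the exception handler entirely, and a user-defined class derived from runtime_error is caught as an exception even though it is not a standard library exception.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// A user-defined exception that joins the standard hierarchy.
struct Sensor_error : std::runtime_error {
    Sensor_error() : std::runtime_error("sensor offline") {}
};

// Runs f and reports which handler, if any, was entered.
template<class F>
std::string classify(F f)
{
    try {
        f();
    }
    catch (const std::exception& e) {
        return std::string("exception: ") + e.what();
    }
    catch (...) {
        return "other exception";
    }
    return "no exception";
}
```

The order of the handlers matters: catch (...) must come last, because handlers are tried in order and it matches everything.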
Note that exception operations do not themselves throw exceptions. In particular, this implies
that throwing a standard library exception doesn’t cause a bad_alloc exception. The exception-handling mechanism keeps a bit of memory to itself for holding exceptions (possibly on the stack).
Naturally, it is possible to write code that eventually consumes all memory in the system, thus forcing a failure.
Here is a function that – if called – tests whether the function call or the exception-handling
mechanism runs out of memory first:
void perverted()
{
    try {
        throw exception();   // recursive exception throw
    }
    catch (exception& e) {
        perverted();         // recursive function call
        cout << e.what();
    }
}
The purpose of the output statement is simply to prevent the compiler from re-using the memory
occupied by the exception named e.
14.11 Advice [except.advice]
[1] Use exceptions for error handling; §14.1, §14.5, §14.9.
[2] Don’t use exceptions where more local control structures will suffice; §14.1.
[3] Use the ‘‘resource acquisition is initialization’’ technique to manage resources; §14.4.
[4] Not every program needs to be exception safe; §14.4.3.
[5] Use ‘‘resource acquisition is initialization’’ and exception handlers to maintain invariants;
§14.3.2.
[6] Minimize the use of try-blocks. Use ‘‘resource acquisition is initialization’’ instead of explicit
handler code; §14.4.
[7] Not every function needs to handle every possible error; §14.9.
[8] Throw an exception to indicate failure in a constructor; §14.4.6.
[9] Avoid throwing exceptions from copy constructors; §14.4.6.1.
[10] Avoid throwing exceptions from destructors; §14.4.7.
[11] Have main() catch and report all exceptions; §14.7.
[12] Keep ordinary code and error-handling code separate; §14.4.5, §14.5.
[13] Be sure that every resource acquired in a constructor is released when throwing an exception
in that constructor; §14.4.
[14] Keep resource management hierarchical; §14.4.
[15] Use exception-specifications for major interfaces; §14.9.
[16] Beware of memory leaks caused by memory allocated by new not being released in case of an
exception; §14.4.1, §14.4.2, §14.4.4.
[17] Assume that every exception that can be thrown by a function will be thrown; §14.6.
[18] Don’t assume that every exception is derived from class exception; §14.10.
[19] A library shouldn’t unilaterally terminate a program. Instead, throw an exception and let a
caller decide; §14.1.
[20] A library shouldn’t produce diagnostic output aimed at an end user. Instead, throw an exception and let a caller decide; §14.1.
[21] Develop an error-handling strategy early in a design; §14.9.
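Several of these rules fit together in a few lines. The sketch below is only an illustration: Handle, use_resource(), run(), and the counter open_handles are hypothetical names, and the counter stands in for a real resource. The handle is acquired in a constructor and released in a destructor, so no try-block is needed where the resource is used; a single outermost function catches and reports everything.

```cpp
#include <cassert>
#include <stdexcept>

int open_handles = 0;           // stands in for a real resource count

// Acquire in the constructor, release in the destructor.
class Handle {
public:
    Handle() { ++open_handles; }
    ~Handle() { --open_handles; }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
};

void use_resource(bool fail)
{
    Handle h;                   // acquired here
    if (fail) throw std::runtime_error("use_resource failed");
    // ... normal work ...
}                               // released here on normal and exceptional exit alike

// The role advised for main(): catch everything at the outermost level.
int run(bool fail)
{
    try {
        use_resource(fail);
        return 0;
    }
    catch (const std::exception&) {
        return 1;               // a report would be written here
    }
    catch (...) {
        return 2;               // not derived from std::exception
    }
}
```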
14.12 Exercises [except.exercises]
1. (∗2) Generalize the STC class (§14.6.3.1) to a template that can use the ‘‘resource acquisition is
initialization’’ technique to store and reset functions of a variety of types.
2. (∗3) Complete the Ptr_to_T class from §11.11 as a template that uses exceptions to signal run-time errors.
3. (∗3) Write a function that searches a binary tree of nodes based on a char* field for a match. If
a node containing hello is found, find("hello") will return a pointer to that node. Use an
exception to indicate ‘‘not found.’’
4. (∗3) Define a class Int that acts exactly like the built-in type int, except that it throws exceptions
rather than overflowing or underflowing.
5. (∗2.5) Take the basic operations for opening, closing, reading, and writing from the C interface
to your operating system and provide equivalent C++ functions that call the C functions but
throw exceptions in case of errors.
6. (∗2.5) Write a complete Vector template with Range and Size exceptions.
7. (∗1) Write a loop that computes the sum of a Vector as defined in §14.12[6] without examining
the size of the Vector. Why is this a bad idea?
8. (∗2.5) Consider using a class Exception as the base of all classes used as exceptions. What
should it look like? How should it be used? What good might it do? What disadvantages
might result from a requirement to use such a class?
9. (∗1) Given a
int main() { /* ... */ }
change it so that it catches all exceptions, turns them into error messages, and abort()s. Hint:
call_from_C() in §14.9 doesn’t quite handle all cases.
10. (∗2) Write a class or template suitable for implementing callbacks.
11. (∗2.5) Write a Lock class for some system supporting concurrency.
15
Class Hierarchies
Abstraction is selective ignorance.
– Andrew Koenig
Multiple inheritance — ambiguity resolution — inheritance and using-declarations —
replicated base classes — virtual base classes — uses of multiple inheritance — access
control — protected — access to base classes — run-time type information —
dynamic_cast — static and dynamic casts — casting from virtual bases — typeid —
extended type information — uses and misuses of run-time type information — pointers
to members — free store — virtual constructors — advice — exercises.
15.1 Introduction and Overview [hier.intro]
This chapter discusses how derived classes and virtual functions interact with other language facilities such as access control, name lookup, free store management, constructors, pointers, and type
conversions. It has five main parts:
§15.2 Multiple Inheritance
§15.3 Access Control
§15.4 Run-time Type Identification
§15.5 Pointers to Members
§15.6 Free Store Use
In general, a class is constructed from a lattice of base classes. Because most such lattices historically have been trees, a class lattice is often called a class hierarchy. We try to design classes so
that users need not be unduly concerned about the way a class is composed out of other classes. In
particular, the virtual call mechanism ensures that when we call a function f() on an object, the
same function is called whichever class in the hierarchy provided the declaration of f() used for
the call. This chapter focuses on ways to compose class lattices and to control access to parts of
classes and on facilities for navigating class lattices at compile time and run time.
15.2 Multiple Inheritance [hier.mi]
As shown in §2.5.4 and §12.3, a class can have more than one direct base class, that is, more than
one class specified after the : in the class declaration. Consider a simulation in which concurrent
activities are represented by a class Task and data gathering and display is achieved through a class
Displayed. We can then define a class of simulated entities, class Satellite:
class Satellite : public Task, public Displayed {
    // ...
};
The use of more than one immediate base class is usually called multiple inheritance. In contrast,
having just one direct base class is called single inheritance.
In addition to whatever operations are defined specifically for a Satellite, the union of operations on Tasks and Displayeds can be applied. For example:
void f(Satellite& s)
{
    s.draw();       // Displayed::draw()
    s.delay(10);    // Task::delay()
    s.transmit();   // Satellite::transmit()
}
Similarly, a Satellite can be passed to functions that expect a Task or a Displayed. For example:
void highlight(Displayed*);
void suspend(Task*);
void g(Satellite* p)
{
    highlight(p);   // pass a pointer to the Displayed part of the Satellite
    suspend(p);     // pass a pointer to the Task part of the Satellite
}
The implementation of this clearly involves some (simple) compiler technique to ensure that functions expecting a Task see a different part of a Satellite than do functions expecting a Displayed.
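The effect of that technique can be observed by comparing addresses. In the sketch below, the data members and the two helper functions are assumptions added only so that there is something to observe. Converting a Satellite* to each base type yields pointers to different sub-objects, while a null pointer stays null.

```cpp
#include <cassert>

struct Task      { int priority = 0; virtual ~Task() {} };
struct Displayed { int color = 0;    virtual ~Displayed() {} };
struct Satellite : public Task, public Displayed { };

// Return the address each kind of function would see.
const void* seen_as_task(const Task* p)           { return p; }
const void* seen_as_displayed(const Displayed* p) { return p; }

// Converting back from a base pointer recovers the complete object.
Satellite* whole_satellite(Displayed* d) { return static_cast<Satellite*>(d); }
```

The compiler adjusts the pointer value during the implicit conversion at each call, which is why no cast appears in the calling code.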
Virtual functions work as usual. For example:
class Task {
    // ...
    virtual void pending() = 0;
};
class Displayed {
    // ...
    virtual void draw() = 0;
};
class Satellite : public Task, public Displayed {
    // ...
    void pending();   // override Task::pending()
    void draw();      // override Displayed::draw()
};
This ensures that Satellite::draw() and Satellite::pending() will be called for a Satellite
treated as a Displayed and a Task, respectively.
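A runnable variant makes this visible. As an assumption for the sake of testing, the virtual functions here return a string naming the function that ran instead of returning void; suspend() and highlight() see only a Task* and a Displayed*, yet the Satellite versions are the ones called.

```cpp
#include <cassert>
#include <string>

class Task {
public:
    virtual std::string pending() = 0;
    virtual ~Task() {}
};
class Displayed {
public:
    virtual std::string draw() = 0;
    virtual ~Displayed() {}
};
class Satellite : public Task, public Displayed {
public:
    std::string pending() override { return "Satellite::pending"; }   // override Task::pending()
    std::string draw() override    { return "Satellite::draw"; }      // override Displayed::draw()
};

// These functions know nothing about Satellite.
std::string suspend(Task* t)        { return t->pending(); }
std::string highlight(Displayed* d) { return d->draw(); }
```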
Note that with single inheritance (only), the programmer’s choices for implementing the classes
Displayed, Task, and Satellite would be limited. A Satellite could be a Task or a Displayed, but
not both (unless Task was derived from Displayed or vice versa). Either alternative involves a loss
of flexibility.
Why would anyone want a class Satellite? Contrary to some people’s conjectures, the Satellite
example is real. There really was – and maybe there still is – a program constructed along the
lines used to describe multiple inheritance here. It was used to study the design of communication
systems involving satellites, ground stations, etc. Given such a simulation, we can answer questions about traffic flow, determine proper responses to a ground station that is being blocked by a
rainstorm, consider tradeoffs between satellite connections and Earth-bound connections, etc. Such
simulations do involve a variety of display and debugging operations. Also, we do need to store
the state of objects such as Satellites and their subcomponents for analysis, debugging, and error
recovery.
15.2.1 Ambiguity Resolution [hier.ambig]
Two base classes may have member functions with the same name. For example:
class Task {
    // ...
    virtual debug_info* get_debug();
};
class Displayed {
    // ...
    virtual debug_info* get_debug();
};
When a Satellite is used, these functions must be disambiguated:
void f(Satellite* sp)
{
    debug_info* dip = sp->get_debug();   // error: ambiguous
    dip = sp->Task::get_debug();         // ok
    dip = sp->Displayed::get_debug();    // ok
}
However, explicit disambiguation is messy, so it is usually best to resolve such problems by defining a new function in the derived class:
class Satellite : public Task, public Displayed {
    // ...
    debug_info* get_debug()   // override Task::get_debug() and Displayed::get_debug()
    {
        debug_info* dip1 = Task::get_debug();
        debug_info* dip2 = Displayed::get_debug();
        return dip1->merge(dip2);
    }
};
This localizes the information about Satellite’s base classes. Because Satellite::get_debug()
overrides the get_debug() functions from both of its base classes, Satellite::get_debug() is
called wherever get_debug() is called for a Satellite object.
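The following sketch checks that claim. As a labeled simplification, debug information is modeled as a std::string and ‘‘merging’’ as concatenation, since debug_info and merge() are not defined in the text. Whether the call is made on the Satellite itself, through a Task*, or through a Displayed*, the merged result comes back.

```cpp
#include <cassert>
#include <string>

// Assumption: a string stands in for debug_info*.
class Task {
public:
    virtual std::string get_debug() { return "task"; }
    virtual ~Task() {}
};
class Displayed {
public:
    virtual std::string get_debug() { return "displayed"; }
    virtual ~Displayed() {}
};
class Satellite : public Task, public Displayed {
public:
    // One function overrides both base versions and merges their results.
    std::string get_debug() override
    {
        return Task::get_debug() + "+" + Displayed::get_debug();
    }
};
```

Explicit qualification, as in s.Task::get_debug(), still reaches the base version when that is what is wanted.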
A qualified name Telstar::draw can refer to a draw declared either in Telstar or in one of its
base classes. For example:
class Telstar : public Satellite {
    // ...
    void draw()
    {
        draw();                         // oops!: recursive call
        Satellite::draw();              // finds Displayed::draw
        Displayed::draw();
        Satellite::Displayed::draw();   // redundant double qualification
    }
};
In other words, if a Satellite::draw doesn’t resolve to a draw declared in Satellite, the compiler
recursively looks in its base classes; that is, it looks for Task::draw and Displayed::draw. If
exactly one match is found, that name will be used. Otherwise, Satellite::draw is either not found
or is ambiguous.
15.2.2 Inheritance and Using-Declarations [hier.using]
Overload resolution is not applied across different class scopes (§7.4). In particular, ambiguities
between functions from different base classes are not resolved based on argument types.
When combining essentially unrelated classes, such as Task and Displayed in the Satellite
example, similarity in naming typically does not indicate a common purpose. When such name
clashes occur, they often come as quite a surprise to the programmer. For example:
class Task {
    // ...
    void debug(double p);   // print info only if priority is lower than p
};
class Displayed {
    // ...
    void debug(int v);   // the higher the ‘v,’ the more debug information is printed
};
class Satellite : public Task, public Displayed {
    // ...
};
void g(Satellite* p)
{
    p->debug(1);              // error: ambiguous. Displayed::debug(int) or Task::debug(double) ?
    p->Task::debug(1);        // ok
    p->Displayed::debug(1);   // ok
}
What if the use of the same name in different base classes was the result of a deliberate design decision and the user wanted selection based on the argument types? In that case, a using-declaration
(§8.2.2) can bring the functions into a common scope. For example:
class A {
public:
    int f(int);
    char f(char);
    // ...
};
class B {
public:
    double f(double);
    // ...
};
class AB: public A, public B {
public:
    using A::f;
    using B::f;
    char f(char);   // hides A::f(char)
    AB f(AB);
};
void g(AB& ab)
{
    ab.f(1);     // A::f(int)
    ab.f('a');   // AB::f(char)
    ab.f(2.0);   // B::f(double)
    ab.f(ab);    // AB::f(AB)
}
Using-declarations allow a programmer to compose a set of overloaded functions from base classes
and the derived class. Functions declared in the derived class hide functions that would otherwise
be available from a base. Virtual functions from bases can be overridden as ever (§15.2.3.1).
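To see which function each call selects, the A, B, and AB example can be made observable. In this sketch, as an assumption for testing, every f() returns a string naming itself instead of the return types used in the declarations above.

```cpp
#include <cassert>
#include <string>

class A {
public:
    std::string f(int)  { return "A::f(int)"; }
    std::string f(char) { return "A::f(char)"; }
};
class B {
public:
    std::string f(double) { return "B::f(double)"; }
};
class AB : public A, public B {
public:
    using A::f;                                        // bring A's overloads into AB's scope
    using B::f;                                        // and B's
    std::string f(char) { return "AB::f(char)"; }      // hides A::f(char)
    std::string f(AB)   { return "AB::f(AB)"; }
};
```

Without the two using-declarations, AB's own f() functions would hide every base class f(), and a call such as ab.f(2.0) would be resolved among f(char) and f(AB) only.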
A using-declaration (§8.2.2) in a class definition must refer to members of a base class. A
using-declaration may not be used for a member of a class from outside that class, its derived
classes, and their member functions. A using-directive (§8.2.3) may not appear in a class definition
and may not be used for a class.
A using-declaration cannot be used to gain access to additional information. It is simply a
mechanism for making accessible information more convenient to use (§15.3.2.2).
15.2.3 Replicated Base Classes [hier.replicated]
With the ability of specifying more than one base class comes the possibility of having a class as a
base twice. For example, had Task and Displayed each been derived from a Link class, a Satellite
would have two Links:
struct Link {
    Link* next;
};
class Task : public Link {
    // the Link is used to maintain a list of all Tasks (the scheduler list)
    // ...
};
class Displayed : public Link {
    // the Link is used to maintain a list of all Displayed objects (the display list)
    // ...
};
This causes no problems. Two separate Link objects are used to represent the links, and the two
lists do not interfere with each other. Naturally, one cannot refer to members of the Link class
without risking an ambiguity (§15.2.3.1). A Satellite object could be drawn like this:
Link        Link
 |           |
Task      Displayed
   \       /
   Satellite
Examples of where the common base class shouldn’t be represented by two separate objects can be
handled using a virtual base class (§15.2.4).
Usually, a base class that is replicated the way Link is here is an implementation detail that
shouldn’t be used from outside its immediate derived class. If such a base must be referred to from
a point where more than one copy of the base is visible, the reference must be explicitly qualified to
resolve the ambiguity. For example:
void mess_with_links(Satellite* p)
{
    p->next = 0;                    // error: ambiguous (which Link?)
    p->Link::next = 0;              // error: ambiguous (which Link?)
    p->Task::Link::next = 0;        // ok
    p->Displayed::Link::next = 0;   // ok
    // ...
}
This is exactly the mechanism used to resolve ambiguous references to members (§15.2.1).
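That the two Links really are separate objects can be checked directly. The sketch below is an illustration with invented helper names, scheduler_link() and display_link(). It selects a Link by an explicit pointer conversion through the intended base rather than by a qualified member name, and shows that linking a Satellite into one list leaves the other list untouched.

```cpp
#include <cassert>

struct Link { Link* next = nullptr; };
class Task      : public Link { };   // the Link for the scheduler list
class Displayed : public Link { };   // the Link for the display list
class Satellite : public Task, public Displayed { };

// Each helper names the base it wants, so the Link it reaches is unambiguous.
Link* scheduler_link(Satellite* p) { return static_cast<Task*>(p); }
Link* display_link(Satellite* p)   { return static_cast<Displayed*>(p); }
```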
15.2.3.1 Overriding [hier.override]
A virtual function of a replicated base class can be overridden by a (single) function in a derived
class. For example, one might represent the ability of an object to read itself from a file and write
itself back to a file like this:
class Storable {
public:
    virtual const char* get_file() = 0;
    virtual void read() = 0;
    virtual void write() = 0;
    virtual ~Storable() { write(); }   // to be called from overriding destructors (see §15.2.4.1)
};
Naturally, several programmers might rely on this to develop classes that can be used independently or in combination to build more elaborate classes. For example, one way of stopping and
restarting a simulation is to store components of a simulation and then restore them later. That idea
might be implemented like this:
class Transmitter : public Storable {
public:
    void write();
    // ...
};
class Receiver : public Storable {
public:
    void write();
    // ...
};
class Radio : public Transmitter, public Receiver {
public:
    const char* get_file();
    void read();
    void write();
    // ...
};
Typically, an overriding function calls its base class versions and then does the work specific to the
derived class:
void Radio::write()
{
    Transmitter::write();
    Receiver::write();
    // write radio-specific information
}
Casting from a replicated base class to a derived class is discussed in §15.4.2. For a technique for
overriding each of the write() functions with separate functions from derived classes, see §25.6.
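A cut-down, runnable version of the Radio example shows the single override at work. It is a sketch under stated assumptions: only write() is kept, the global string write_log and the two helper functions are invented for the purpose, and each write() merely records its name. Calling write() through either Storable sub-object reaches Radio::write(), which in turn calls both base versions.

```cpp
#include <cassert>
#include <string>

std::string write_log;   // records the order in which write() functions run

class Storable {
public:
    virtual void write() = 0;
    virtual ~Storable() {}
};
class Transmitter : public Storable {
public:
    void write() override { write_log += "Transmitter "; }
};
class Receiver : public Storable {
public:
    void write() override { write_log += "Receiver "; }
};
class Radio : public Transmitter, public Receiver {
public:
    // A single function overrides write() in both Storable sub-objects.
    void write() override
    {
        Transmitter::write();
        Receiver::write();
        write_log += "Radio ";   // radio-specific information
    }
};

// Converting Radio* directly to Storable* would be ambiguous,
// so each helper goes through one of the intermediate bases.
Storable* transmitter_storable(Radio* r) { return static_cast<Transmitter*>(r); }
Storable* receiver_storable(Radio* r)    { return static_cast<Receiver*>(r); }
```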
15.2.4 Virtual Base Classes [hier.vbase]
The Radio example in the previous subsection works because class Storable can be safely, conveniently, and efficiently replicated. Often, that is not the case for the kind of class that makes a good
building block for other classes. For example, we might define Storable to hold the name of the
file to be used for storing the object:
class Storable {
public:
    Storable(const char* s);
    virtual void read() = 0;
    virtual void write() = 0;
    virtual ~Storable();
private:
    const char* store;
    Storable(const Storable&);
    Storable& operator=(const Storable&);
};
Given this apparently minor change to Storable, we must change the design of Radio. All
parts of an object must share a single copy of Storable; otherwise, it becomes unnecessarily hard to
avoid storing multiple copies of the object. One mechanism for specifying such sharing is a virtual
base class. Every virtual base of a derived class is represented by the same (shared) object. For
example:
class Transmitter : public virtual Storable {
public:
    void write();
    // ...
};
class Receiver : public virtual Storable {
public:
    void write();
    // ...
};
class Radio : public Transmitter, public Receiver {
public:
    void write();
    // ...
};
Or graphically:
       Storable
       /      \
Receiver    Transmitter
       \      /
        Radio
Compare this diagram with the drawing of the Satellite object in §15.2.3 to see the difference
between ordinary inheritance and virtual inheritance. In an inheritance graph, every base class of a
given name that is specified to be virtual will be represented by a single object of that class. On the
other hand, each base class not specified virtual will have its own sub-object representing it.
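The difference can be counted. In the following sketch, the counter storables_built, the Plain_ classes, and the function template storables_in() are hypothetical names added for the comparison. Storable's constructor increments the counter, so building one object of each kind of Radio reports how many Storable sub-objects it contains.

```cpp
#include <cassert>

int storables_built = 0;   // counts Storable constructor calls

class Storable {
public:
    Storable() { ++storables_built; }
    virtual ~Storable() {}
};

// Virtual inheritance: one shared Storable.
class Transmitter : public virtual Storable { };
class Receiver    : public virtual Storable { };
class Radio       : public Transmitter, public Receiver { };

// Ordinary inheritance: one Storable per path.
class Plain_transmitter : public Storable { };
class Plain_receiver    : public Storable { };
class Plain_radio       : public Plain_transmitter, public Plain_receiver { };

// Build one object of type T and report how many Storable sub-objects it needed.
template<class T>
int storables_in()
{
    storables_built = 0;
    T object;
    return storables_built;
}
```

With the virtual base, a Radio* also converts to Storable* without ambiguity, because there is only one Storable to point to.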
15.2.4.1 Programming Virtual Bases [hier.vbase.prog]
When defining the functions for a class with a virtual base, the programmer in general cannot know
whether the base will be shared with other derived classes. This can be a problem when implementing a service that requires a base class function to be called exactly once. For example, the
language ensures that a constructor of a virtual base is called exactly once. The constructor of a
virtual base is invoked (implicitly or explicitly) from the constructor for the complete object (the
constructor for the most derived class). For example:
    class A {    // no constructor
        // ...
    };
    class B {
    public:
        B();    // default constructor
        // ...
    };
    class C {
    public:
        C(int);    // no default constructor
    };
    class D : virtual public A, virtual public B, virtual public C
    {
        D() { /* ... */ }    // error: no default constructor for C
        D(int i) : C(i) { /* ... */ };    // ok
        // ...
    };
The constructor for a virtual base is called before the constructors for its derived classes.
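The rule that the most derived class constructs the virtual base, exactly once, can be checked with a pair of counters. This is a small sketch under assumed names (V, Left, Right, Bottom, and the two global counters are all hypothetical); the initializers for V in Left and Right are ignored when they are base sub-objects of a Bottom.

```cpp
#include <cassert>

int base_constructions = 0;    // counts calls to V's constructor
int last_tag = 0;              // records which initializer actually ran

struct V {
    explicit V(int tag) { ++base_constructions; last_tag = tag; }
};

struct Left : virtual V {
    Left() : V(1) { }     // used only when Left is the complete object
};
struct Right : virtual V {
    Right() : V(2) { }    // used only when Right is the complete object
};
struct Bottom : Left, Right {
    Bottom() : V(3) { }   // the most derived class initializes the virtual base
};

// Constructs a complete Left object and reports which initializer ran.
int construct_left()
{
    Left l;
    (void)l;
    return last_tag;
}

// Constructs a Bottom and reports how many times V's constructor ran.
int construct_bottom()
{
    base_constructions = 0;
    Bottom b;
    (void)b;
    return base_constructions;
}
```

Because V has no default constructor, Bottom would not compile without its own initializer for V, just as D() fails for C in the example above.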
Where needed, the programmer can simulate this scheme by calling a virtual base class function
only from the most derived class. For example, assume we have a basic Window class that knows
how to draw its contents:
    class Window {
        // basic stuff
        virtual void draw();
    };
In addition, we have various ways of decorating a window and adding facilities:
    class Window_with_border : public virtual Window {
        // border stuff
        void own_draw();    // display the border
        void draw();
    };
    class Window_with_menu : public virtual Window {
        // menu stuff
        void own_draw();    // display the menu
        void draw();
    };
The own_draw() functions need not be virtual because they are meant to be called from within a
virtual draw() function that ‘‘knows’’ the type of the object for which it was called.
From this, we can compose a plausible Clock class:
    class Clock : public Window_with_border, public Window_with_menu {
        // clock stuff
        void own_draw();    // display the clock face and hands
        void draw();
    };
Or graphically:
                      Window
                     /      \
    Window_with_border      Window_with_menu
                     \      /
                      Clock
The draw() functions can now be written using the own_draw() functions so that a caller of any
draw() gets Window::draw() invoked exactly once. This is done independently of the kind of
Window for which draw() is invoked:
    void Window_with_border::draw()
    {
        Window::draw();
        own_draw();    // display the border
    }
    void Window_with_menu::draw()
    {
        Window::draw();
        own_draw();    // display the menu
    }
    void Clock::draw()
    {
        Window::draw();
        Window_with_border::own_draw();
        Window_with_menu::own_draw();
        own_draw();    // display the clock face and hands
    }
Casting from a virtual base class to a derived class is discussed in §15.4.2.
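To see that this arrangement really draws the shared Window once, the drawing can be replaced by appending to a log. The sketch below assumes a hypothetical global string trace and a helper draw_through_base(); the access specifiers are also an assumption, added so that the example compiles on its own.

```cpp
#include <cassert>
#include <string>

std::string trace;    // hypothetical log of what was drawn, in order

class Window {
public:
    virtual ~Window() { }
    virtual void draw() { trace += "W"; }
};

class Window_with_border : public virtual Window {
protected:
    void own_draw() { trace += "b"; }    // display the border
public:
    void draw() { Window::draw(); own_draw(); }
};

class Window_with_menu : public virtual Window {
protected:
    void own_draw() { trace += "m"; }    // display the menu
public:
    void draw() { Window::draw(); own_draw(); }
};

class Clock : public Window_with_border, public Window_with_menu {
    void own_draw() { trace += "c"; }    // display the clock face and hands
public:
    void draw()
    {
        Window::draw();                     // the shared base is drawn here, once
        Window_with_border::own_draw();
        Window_with_menu::own_draw();
        own_draw();
    }
};

// Calls draw() through the virtual base and returns what was drawn.
std::string draw_through_base(Window& w)
{
    trace.clear();
    w.draw();
    return trace;
}
```

Had Clock::draw() simply called Window_with_border::draw() and Window_with_menu::draw(), the log would contain "W" twice.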
15.2.5 Using Multiple Inheritance [hier.using.mi]
The simplest and most obvious use of multiple inheritance is to ‘‘glue’’ two otherwise unrelated
classes together as part of the implementation of a third class. The Satellite class built out of the
Task and Displayed classes in §15.2 is an example of this. This use of multiple inheritance is
crude, effective, and important, but not very interesting. Basically, it saves the programmer from
writing a lot of forwarding functions. This technique does not affect the overall design of a program significantly and can occasionally clash with the wish to keep implementation details hidden.
However, a technique doesn’t have to be clever to be useful.
Using multiple inheritance to provide implementations for abstract classes is more fundamental
in that it affects the way a program is designed. Class BB_ival_slider (§12.3) is an example:
    class BB_ival_slider
        : public Ival_slider    // interface
        , protected BBslider    // implementation
    {
        // implementation of functions required by ‘Ival_slider’ and ‘BBslider’
        // using the facilities provided by ‘BBslider’
    };
In this example, the two base classes play logically distinct roles. One base is a public abstract
class providing the interface and the other is a protected concrete class providing implementation
‘‘details.’’ These roles are reflected in both the style of the classes and in the access control provided. The use of multiple inheritance is close to essential here because the derived class needs to
override virtual functions from both the interface and the implementation.
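A compact illustration of a class overriding virtual functions from both of its bases might look as follows. All names here (Slider_interface, Toolkit_slider, Toolkit_ival_slider, value_after_drag) are hypothetical stand-ins, not the Ival_slider or BBslider classes themselves, and the doubling in moved() is only there to make the effect visible.

```cpp
#include <cassert>

// Hypothetical interface, standing in for Ival_slider.
class Slider_interface {
public:
    virtual int get_value() = 0;
    virtual ~Slider_interface() { }
};

// Hypothetical implementation class, standing in for BBslider.
class Toolkit_slider {
public:
    virtual ~Toolkit_slider() { }
    void drag_to(int pos) { position = pos; moved(); }    // the "toolkit" reports a move
    int where() const { return position; }
protected:
    Toolkit_slider() : position(0) { }
    virtual void moved() { }    // hook for derived classes
private:
    int position;
};

class Toolkit_ival_slider
    : public Slider_interface      // interface
    , protected Toolkit_slider     // implementation
{
public:
    Toolkit_ival_slider() : value(0) { }
    int get_value() { return value; }               // overrides the interface function
    void simulate_drag(int pos) { drag_to(pos); }
protected:
    void moved() { value = 2 * where(); }           // overrides the implementation hook
private:
    int value;
};

// Drags a slider and reads the result back through the interface only.
int value_after_drag(int pos)
{
    Toolkit_ival_slider s;
    s.simulate_drag(pos);
    Slider_interface& user_view = s;    // users see only the interface
    return user_view.get_value();
}
```

A member of type Toolkit_slider could not have its moved() hook overridden this way, which is why the text calls inheritance close to essential for this design.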
Multiple inheritance allows sibling classes to share information without introducing a dependence on a unique common base class in a program. This is the case in which the so-called
diamond-shaped inheritance occurs (for example, the Radio (§15.2.4) and Clock (§15.2.4.1)). A
virtual base class, as opposed to an ordinary base class, is needed if the base class cannot be replicated.
I find that a diamond-shaped inheritance lattice is most manageable if either the virtual base
class or the classes directly derived from it are abstract classes. For example, consider again the
Ival_box classes from §12.4. In the end, I made all the Ival_box classes abstract to reflect their
role as pure interfaces. Doing that allowed me to place all implementation details in specific implementation classes. Also, all sharing of implementation details was done in the classical hierarchy
of the windows system used for the implementation.
It would make sense for the class implementing a Popup_ival_slider to share most of the
implementation of the class implementing a plain Ival_slider. After all, these implementation
classes would share everything except the handling of prompts. However, it would then seem natural to avoid replication of Ival_slider objects within the resulting slider implementation objects.
Therefore, we could make Ival_slider a virtual base:
    class BB_ival_slider : public virtual Ival_slider, protected BBslider { /* ... */ };
    class Popup_ival_slider : public virtual Ival_slider { /* ... */ };
    class BB_popup_ival_slider
        : public virtual Popup_ival_slider, protected BB_ival_slider { /* ... */ };
or graphically:
                  Ival_slider            BBslider
                  /         \            /
    Popup_ival_slider        BB_ival_slider
                  \             /
               BB_popup_ival_slider
It is easy to imagine further interfaces derived from Popup_ival_slider and further implementation
classes derived from such classes and BB_popup_slider.
If we take this idea to its logical conclusion, all of the derivations from the abstract classes that
constitute our application’s interfaces would become virtual. This does indeed seem to be the most
logical, general, and flexible approach. The reason I didn’t do that was partly historical and partly
because the most obvious and common techniques for implementing virtual bases impose time and
space overhead that make their extensive use within a class unattractive. Should this overhead
become an issue for an otherwise attractive design, note that an object representing an Ival_slider
usually holds only a virtual table pointer. As noted in §15.2.4, such an abstract class holding no
variable data can be replicated without ill effects. Thus, we can eliminate the virtual base in favor
of ordinary ones:
    class BB_ival_slider : public Ival_slider, protected BBslider { /* ... */ };
    class Popup_ival_slider : public Ival_slider { /* ... */ };
    class BB_popup_ival_slider
        : public Popup_ival_slider, protected BB_ival_slider { /* ... */ };
or graphically:
       Ival_slider           Ival_slider    BBslider
            |                         \      /
    Popup_ival_slider              BB_ival_slider
                  \                 /
                 BB_popup_ival_slider
This is most likely a viable optimization to the admittedly cleaner alternative presented previously.
15.2.5.1 Overriding Virtual Base Functions [hier.dominance]
A derived class can override a virtual function of its direct or indirect virtual base class. In particular, two different classes might override different virtual functions from the virtual base. In that
way, several derived classes can contribute implementations to the interface presented by a virtual
base class. For example, the Window class might have functions set_color() and prompt(). In
that case, Window_with_border might override set_color() as part of controlling the color
scheme and Window_with_menu might override prompt() as part of its control of user interactions:
    class Window {
        // ...
        virtual void set_color(Color) = 0;    // set background color
        virtual void prompt() = 0;
    };
    class Window_with_border : public virtual Window {
        // ...
        void set_color(Color);    // control background color
    };
    class Window_with_menu : public virtual Window {
        // ...
        void prompt();    // control user interactions
    };
    class My_window : public Window_with_menu, public Window_with_border {
        // ...
    };
What if different derived classes override the same function? This is allowed if and only if some
overriding class is derived from every other class that overrides the function. That is, one function
must override all others. For example, My_window could override prompt() to improve on what
Window_with_menu provides:
    class My_window : public Window_with_menu, public Window_with_border {
        // ...
        void prompt();    // don’t leave user interactions to base
    };
or graphically:
                        Window { set_color(), prompt() }
                       /                                \
    Window_with_border { set_color() }        Window_with_menu { prompt() }
                       \                                /
                             My_window { prompt() }
If two classes override a base class function, but neither overrides the other, the class hierarchy is
an error. No virtual function table can be constructed because a call to that function on the complete object would have been ambiguous. For example, had Radio in §15.2.4 not declared
write(), the declarations of write() in Receiver and Transmitter would have caused an error
when defining Radio. As with Radio, such a conflict is resolved by adding an overriding function
to the most derived class.
A class that provides some – but not all – of the implementation for a virtual base class is often
called a ‘‘mixin.’’
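The overriding rules can be exercised with a runnable variant of the Window classes. This is only a sketch: to keep it self-contained, the functions are assumed to take no arguments and to return the name of the class whose override ran, which is not how the original set_color(Color) and prompt() are declared, and the two helper functions are hypothetical.

```cpp
#include <cassert>
#include <string>

class Window {
public:
    virtual ~Window() { }
    virtual std::string set_color() { return "Window"; }
    virtual std::string prompt()    { return "Window"; }
};

class Window_with_border : public virtual Window {
public:
    std::string set_color() { return "border"; }    // contributes set_color()
};

class Window_with_menu : public virtual Window {
public:
    std::string prompt() { return "menu"; }         // contributes prompt()
};

class My_window : public Window_with_menu, public Window_with_border {
public:
    std::string prompt() { return "my"; }           // overrides Window_with_menu::prompt()
    // Had Window_with_border also overridden prompt(), omitting this override
    // would make My_window ill-formed: no single function would override all others.
};

// Calls made through the shared virtual base find the unique final overrider.
std::string color_via_base(Window& w)  { return w.set_color(); }
std::string prompt_via_base(Window& w) { return w.prompt(); }
```

Seen through a Window&, a My_window answers set_color() with the border class's version and prompt() with its own.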
15.3 Access Control [hier.access]
A member of a class can be private, protected, or public:
– If it is private, its name can be used only by member functions and friends of the class in
which it is declared.
– If it is protected, its name can be used only by member functions and friends of the class in
which it is declared and by member functions and friends of classes derived from this class
(see §11.5).
– If it is public, its name can be used by any function.
This reflects the view that there are three kinds of functions accessing a class: functions implementing the class (its friends and members), functions implementing a derived class (the derived class’
friends and members), and other functions. This can be presented graphically:
    general users                                  ------>  public:
    derived class’ member functions and friends    ------>  protected:
    own member functions and friends               ------>  private:
The access control is applied uniformly to names. What a name refers to does not affect the control
of its use. This means that we can have private member functions, types, constants, etc., as well as
private data members. For example, an efficient non-intrusive (§16.2.1) list class often requires
data structures to keep track of elements. Such information is best kept private:
    template<class T> class List {
    private:
        struct Link { T val; Link* next; };
        struct Chunk {
            enum { chunk_size = 15 };
            Link v[chunk_size];
            Chunk* next;
        };
        class Underflow { };
        Chunk* allocated;
        Link* free;
        Link* get_free();
        Link* head;
    public:
        void insert(T);
        T get();
        // ...
    };
    template<class T> void List<T>::insert(T val)
    {
        Link* lnk = get_free();
        lnk->val = val;
        lnk->next = head;
        head = lnk;
    }
    template<class T> typename List<T>::Link* List<T>::get_free()
    {
        if (free == 0) {
            // allocate a new chunk and place its Links on the free list
        }
        Link* p = free;
        free = free->next;
        return p;
    }
    template<class T> T List<T>::get()
    {
        if (head == 0) throw Underflow();
        Link* p = head;
        head = p->next;
        p->next = free;
        free = p;
        return p->val;
    }
The List<T> scope is entered by saying List<T>:: in a member function definition. Because the
return type of get_free() is mentioned before the name List<T>::get_free() is mentioned, the
full name List<T>::Link must be used instead of the abbreviation Link.
Nonmember functions (except friends) do not have such access:
    void would_be_meddler(List<T>* p)
    {
        List<T>::Link* q = 0;    // error: List<T>::Link is private
        // ...
        q = p->free;    // error: List<T>::free is private
        // ...
        if (List<T>::Chunk::chunk_size > 31) {    // error: List<T>::Chunk::chunk_size is private
            // ...
        }
    }
In a class, a member is by default private; in a struct, a member is by default public (§10.2.8).
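For readers who want to run the List<T> example, here is one possible completion. It is a sketch resting on several assumptions that are not in the text: the chunk allocation in get_free() is filled in one plausible way, Underflow is made public so that callers can catch it, a constructor and destructor are added, copying is disabled, and T is assumed to be default-constructible. The helpers round_trip_sum() and get_on_empty_throws() are hypothetical and use only the public interface.

```cpp
#include <cassert>

template<class T> class List {
private:
    struct Link { T val; Link* next; };
    struct Chunk {
        enum { chunk_size = 15 };
        Link v[chunk_size];
        Chunk* next;
    };
    Chunk* allocated;    // every chunk obtained so far
    Link* free;          // unused links
    Link* head;          // links currently holding elements
    Link* get_free();
    List(const List&);               // not defined: copying disabled (assumption)
    List& operator=(const List&);    // not defined: copying disabled (assumption)
public:
    class Underflow { };             // public here so that callers can catch it (assumption)
    List() : allocated(0), free(0), head(0) { }
    ~List();
    void insert(T);
    T get();
};

template<class T> List<T>::~List()
{
    while (allocated) {              // release all chunks
        Chunk* next = allocated->next;
        delete allocated;
        allocated = next;
    }
}

template<class T> typename List<T>::Link* List<T>::get_free()
{
    if (free == 0) {
        // allocate a new chunk and place its Links on the free list
        Chunk* c = new Chunk;
        c->next = allocated;
        allocated = c;
        for (int i = 0; i < Chunk::chunk_size; ++i) {
            c->v[i].next = free;
            free = &c->v[i];
        }
    }
    Link* p = free;
    free = free->next;
    return p;
}

template<class T> void List<T>::insert(T val)
{
    Link* lnk = get_free();
    lnk->val = val;
    lnk->next = head;
    head = lnk;
}

template<class T> T List<T>::get()
{
    if (head == 0) throw Underflow();
    Link* p = head;
    head = p->next;
    p->next = free;
    free = p;
    return p->val;
}

// Inserts 1..n, reads everything back, and returns the sum of the values read.
int round_trip_sum(int n)
{
    List<int> l;
    for (int i = 1; i <= n; ++i) l.insert(i);
    int sum = 0;
    for (int i = 0; i < n; ++i) sum += l.get();
    return sum;
}

// Reports whether get() on an empty list throws Underflow.
bool get_on_empty_throws()
{
    List<int> l;
    try { l.get(); }
    catch (List<int>::Underflow&) { return true; }
    return false;
}
```

Link, Chunk, and the free list remain private, so user code such as round_trip_sum() cannot depend on how elements are stored.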
15.3.1 Protected Members [hier.protected]
As an example of how to use protected members, consider the Window example from §15.2.4.1.
The own_draw() functions were (deliberately) incomplete in the service they provided. They
were designed as building blocks for use by derived classes (only) and are not safe or convenient
for general use. The draw() operations, on the other hand, were designed for general use. This
distinction can be expressed by separating the interface of the Window classes in two, the protected
interface and the public interface:
    class Window_with_border {
    public:
        virtual void draw();
        // ...
    protected:
        void own_draw();
        // other tool-building stuff
    private:
        // representation, etc.
    };
A derived class can access a base class’ protected members only for objects of its own type:
    class Buffer {
    protected:
        char a[128];
        // ...
    };
    class Linked_buffer : public Buffer { /* ... */ };
    class Cyclic_buffer : public Buffer {
        // ...
        void f(Linked_buffer* p) {
            a[0] = 0;    // ok: access to cyclic_buffer’s own protected member
            p->a[0] = 0;    // error: access to protected member of different type
        }
    };
This prevents subtle errors that would otherwise occur when one derived class corrupts data
belonging to other derived classes.
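A compilable variant of the Buffer example makes the boundary visible. This is a sketch with a few additions that are assumptions: a constructor, a public first() accessor, and a second parameter of the member's own type; the ill-formed access is left as a comment.

```cpp
#include <cassert>

class Buffer {
protected:
    char a[128];
public:
    Buffer() { a[0] = 'x'; }
    char first() const { return a[0]; }    // accessor added for the demonstration
};

class Linked_buffer : public Buffer { };

class Cyclic_buffer : public Buffer {
public:
    void f(Cyclic_buffer* other, Linked_buffer* p)
    {
        a[0] = 0;            // ok: this object's own protected member
        other->a[0] = 1;     // ok: accessed through an object of Cyclic_buffer's own type
        // p->a[0] = 0;      // error: protected member accessed through a different derived type
        (void)p;
    }
};
```

The Linked_buffer passed to f() is left untouched: Cyclic_buffer has no way to reach its protected array.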
15.3.1.1 Use of Protected Members [hier.protected.use]
The simple private/public model of data hiding serves the notion of concrete types (§10.3) well.
However, when derived classes are used, there are two kinds of users of a class: derived classes and
‘‘the general public.’’ The members and friends that implement the operations on the class operate
on the class objects on behalf of these users. The private/public model allows the programmer to
distinguish clearly between the implementers and the general public, but it does not provide a way
of catering specifically to derived classes.
Members declared protected are far more open to abuse than members declared private. In
particular, declaring data members protected is usually a design error. Placing significant amounts
of data in a common class for all derived classes to use leaves that data open to corruption. Worse,
protected data, like public data, cannot easily be restructured because there is no good way of finding every use. Thus, protected data becomes a software maintenance problem.
Fortunately, you don’t have to use protected data; private is the default in classes and is usually
the better choice. In my experience, there have always been alternatives to placing significant
amounts of information in a common base class for derived classes to use directly.
Note that none of these objections are significant for protected member functions; protected is a
fine way of specifying operations for use in derived classes. The Ival_slider in §12.4.2 is an example of this. Had the implementation class been private in this example, further derivation would
have been infeasible.
Technical examples illustrating access to members can be found in §C.11.1.
15.3.2 Access to Base Classes [hier.base.access]
Like a member, a base class can be declared private, protected, or public. For example:
    class X : public B { /* ... */ };
    class Y : protected B { /* ... */ };
    class Z : private B { /* ... */ };
Public derivation makes the derived class a subtype of its base; this is the most common form of
derivation. Protected and private derivation are used to represent implementation details. Protected
bases are useful in class hierarchies in which further derivation is the norm; the Ival_slider from
§12.4.2 is a good example of that. Private bases are most useful when defining a class by restricting the interface to a base so that stronger guarantees can be provided. For example, Vec adds
range checking to its private base vector (§3.7.1) and the list of pointers template adds type checking to its list<void*> base (§13.5).
The access specifier for a base class can be left out. In that case, the base defaults to a private
base for a class and a public base for a struct. For example:
    class XX : B { /* ... */ };    // B is a private base
    struct YY : B { /* ... */ };    // B is a public base
For readability, it is best always to use an explicit access specifier.
The access specifier for a base class controls the access to members of the base class and the
conversion of pointers and references from the derived class type to the base class type. Consider a
class D derived from a base class B:
– If B is a private base, its public and protected members can be used only by member functions and friends of D. Only friends and members of D can convert a D* to a B*.
– If B is a protected base, its public and protected members can be used only by member
functions and friends of D and by member functions and friends of classes derived from D.
Only friends and members of D and friends and members of classes derived from D can
convert a D* to a B*.
– If B is a public base, its public members can be used by any function. In addition, its protected members can be used by members and friends of D and members and friends of
classes derived from D. Any function can convert a D* to a B*.
This basically restates the rules for member access (§15.3). We choose access for bases in the same
way as for members. For example, I chose to make BBwindow a protected base of Ival_slider
(§12.4.2) because BBwindow was part of the implementation of Ival_slider rather than part of its
interface. However, I couldn’t completely hide BBwindow by making it a private base because I
wanted to be able to derive further classes from Ival_slider, and those derived classes would need
access to the implementation.
Technical examples illustrating access to bases can be found in §C.11.2.
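The conversion rules for bases, including the defaults for class and struct, can be checked at compile time. The sketch below assumes a C++11 compiler for std::is_convertible, which reports whether an unrelated piece of code could convert one pointer type to another implicitly; the class names Pub, Prot, and Priv and the member as_base() are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

struct B {
    int value() const { return 7; }
};

class Pub  : public B { };
class Prot : protected B { };
class Priv : private B {
public:
    // Inside a member of Priv, the conversion Priv* -> B* is allowed.
    const B* as_base() const { return this; }
};
class XX : B { };     // class without a specifier: B is a private base
struct YY : B { };    // struct without a specifier: B is a public base

// Can a function that is neither a member nor a friend convert D* to B*?
const bool pub_converts  = std::is_convertible<Pub*,  B*>::value;
const bool prot_converts = std::is_convertible<Prot*, B*>::value;
const bool priv_converts = std::is_convertible<Priv*, B*>::value;
const bool xx_converts   = std::is_convertible<XX*,   B*>::value;
const bool yy_converts   = std::is_convertible<YY*,   B*>::value;
```

Only the public bases (Pub and YY) allow the conversion from outside, while Priv can still hand out a pointer to its base from within its own member function.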
15.3.2.1 Multiple Inheritance and Access Control [hier.mi.access]
If a name or a base class can be reached through multiple paths in a multiple inheritance lattice, it is
accessible if it is accessible through any path. For example:
    struct B {
        int m;
        static int sm;
        // ...
    };
    class D1 : public virtual B { /* ... */ } ;
    class D2 : public virtual B { /* ... */ } ;
    class DD : public D1, private D2 { /* ... */ };
    DD* pd = new DD;
    B* pb = pd;    // ok: accessible through D1
    int i1 = pd->m;    // ok: accessible through D1
If a single entity is reachable through several paths, we can still refer to it without ambiguity. For
example:
    class X1 : public B { /* ... */ } ;
    class X2 : public B { /* ... */ } ;
    class XX : public X1, public X2 { /* ... */ };
    XX* pxx = new XX;
    int i1 = pxx->m;    // error, ambiguous: XX::X1::B::m or XX::X2::B::m
    int i2 = pxx->sm;    // ok: there is only one B::sm in an XX
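The ambiguous access can be resolved by qualifying the member with the path to use, whereas the static member needs no qualification. The following sketch adds a constructor to B and a hypothetical function sum_of_members() so that the result can be checked.

```cpp
#include <cassert>

struct B {
    B() : m(0) { }
    int m;
    static int sm;
};
int B::sm = 0;

class X1 : public B { };
class X2 : public B { };
class XX : public X1, public X2 { };

// Writes to both replicated B::m members and to the single B::sm.
int sum_of_members()
{
    XX xx;
    // xx.m = 1;      // error, ambiguous: XX::X1::B::m or XX::X2::B::m
    xx.X1::m = 1;     // the m in the B sub-object reached through X1
    xx.X2::m = 2;     // the m in the B sub-object reached through X2
    xx.sm = 3;        // ok: there is only one B::sm in an XX
    return xx.X1::m + xx.X2::m + B::sm;
}
```

The two assignments to m do not interfere with each other because an XX contains two separate B sub-objects.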
15.3.2.2 Using-Declarations and Access Control [hier.access.using]
A using-declaration cannot be used to gain access to additional information. It is simply a mechanism for making accessible information more convenient to use. On the other hand, once access is
available, it can be granted to other users. For example:
    class B {
    private:
        int a;
    protected:
        int b;
    public:
        int c;
    };
    class D : public B {
    public:
        using B::a;    // error: B::a is private
        using B::b;    // make B::b publicly available through D
    };
When a using-declaration is combined with private or protected derivation, it can be used to specify interfaces to some, but not all, of the facilities usually offered by a class. For example:
    class BB : private B {    // give access to B::b and B::c, but not B::a
    public:
        using B::b;
        using B::c;
    };
See also §15.2.2.
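A runnable version of these using-declarations is sketched below. It assumes a constructor for B that gives the members distinct values, and it places the using-declarations in a public section of BB so that they are visible to outside code; the ill-formed declaration is left as a comment.

```cpp
#include <cassert>

class B {
private:
    int a;
protected:
    int b;
public:
    int c;
    B() : a(1), b(2), c(3) { }    // constructor added for the demonstration
};

class D : public B {
public:
    // using B::a;    // error: B::a is private
    using B::b;       // make B::b publicly available through D
};

class BB : private B {    // private base: nothing of B is visible by default
public:
    using B::b;           // selectively re-export b
    using B::c;           // and c, but not a
};
```

Through D, the protected member b becomes usable by anyone; through BB, only b and c of the otherwise hidden base are exposed.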
15.4 Run-Time Type Information [hier.rtti]
A plausible use of the Ival_boxes defined in §12.4 would be to hand them to a system that controlled a screen and have that system hand objects back to the application program whenever some
activity had occurred. This is how many user-interfaces work. However, a user-interface system
will not know about our Ival_boxes. The system’s interfaces will be specified in terms of the
system’s own classes and objects rather than our application’s classes. This is necessary and
proper. However, it does have the unpleasant effect that we lose information about the type of
objects passed to the system and later returned to us.
Recovering the ‘‘lost’’ type of an object requires us to somehow ask the object to reveal its
type. Any operation on an object requires us to have a pointer or reference of a suitable type for the
object. Consequently, the most obvious and useful operation for inspecting the type of an object at
run time is a type conversion operation that returns a valid pointer if the object is of the expected
type and a null pointer if it isn’t. The dynamic_cast operator does exactly that. For example,
assume that ‘‘the system’’ invokes my_event_handler() with a pointer to a BBwindow, where an
activity has occurred. I then might invoke my application code using Ival_box’s do_something():
    void my_event_handler(BBwindow* pw)
    {
        if (Ival_box* pb = dynamic_cast<Ival_box*>(pw))    // does pw point to an Ival_box?
            pb->do_something();
        else {
            // Oops! unexpected event
        }
    }
One way of explaining what is going on is that dynamic_cast translates from the implementation-oriented language of the user-interface system to the language of the application. It is important to
note what is not mentioned in this example: the actual type of the object. The object will be a particular kind of Ival_box, say an Ival_slider, implemented by a particular kind of BBwindow, say a
BBslider. It is neither necessary nor desirable to make the actual type of the object explicit in this
interaction between ‘‘the system’’ and the application. An interface exists to represent the essentials of an interaction. In particular, a well-designed interface hides inessential details.
Graphically, the action of
    pb = dynamic_cast<Ival_box*>(pw)
can be represented like this:
    pw ........> BBwindow          Ival_box <........ pb
                    |                  |
                 BBslider         Ival_slider
                        \          /
                       BB_ival_slider
The arrows from pw and pb represent the pointers into the object passed, whereas the rest of the
arrows represent the inheritance relationships between the different parts of the object passed.
The use of type information at run time is conventionally referred to as ‘‘run-time type information’’ and often abbreviated to RTTI.
Casting from a base class to a derived class is often called a downcast because of the convention
of drawing inheritance trees growing from the root down. Similarly, a cast from a derived class to
a base is called an upcast. A cast that goes from a base to a sibling class, like the cast from BBwindow to Ival_box, is called a crosscast.
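A crosscast of this kind can be tried out with skeletal versions of the classes. This is a sketch, not the Ival_box hierarchy itself: the classes are empty shells, do_something() and my_event_handler() are assumed to return an int so that the outcome can be observed, and BBslider is made a public base here because, under the standard rules, a run-time crosscast succeeds only between public bases of the complete object.

```cpp
#include <cassert>

class BBwindow {
public:
    virtual ~BBwindow() { }
};
class Ival_box {
public:
    virtual ~Ival_box() { }
    virtual int do_something() { return 1; }
};
class BBslider : public BBwindow { };
class Ival_slider : public Ival_box { };

// Assumption: public BBslider base, so that the crosscast below can succeed.
class BB_ival_slider : public Ival_slider, public BBslider { };

// Returns the result of do_something(), or -1 for an unexpected event.
int my_event_handler(BBwindow* pw)
{
    if (Ival_box* pb = dynamic_cast<Ival_box*>(pw))    // crosscast: does pw point to an Ival_box?
        return pb->do_something();
    return -1;
}
```

Passing a BB_ival_slider through a BBwindow* recovers its Ival_box part; passing a plain BBslider yields the null pointer and thus the "unexpected event" branch.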
15.4.1 Dynamic_cast [hier.dynamic.cast]
The dynamic_cast operator takes two operands, a type bracketed by < and >, and a pointer or reference bracketed by ( and ).
Consider first the pointer case:
    dynamic_cast<T*>(p)
If p is of type T* or of a type D* where T is an accessible base class of D, the result is exactly as if we had simply assigned
p to a T*. For example:
    class BB_ival_slider : public Ival_slider, protected BBslider {
        // ...
    };
    void f(BB_ival_slider* p)
    {
        Ival_slider* pi1 = p;    // ok
        Ival_slider* pi2 = dynamic_cast<Ival_slider*>(p);    // ok
        BBslider* pbb1 = p;    // error: BBslider is a protected base
        BBslider* pbb2 = dynamic_cast<BBslider*>(p);    // ok: pbb2 becomes 0
    }
That is the uninteresting case. However, it is reassuring to know that dynamic_cast doesn’t allow
accidental violation of the protection of private and protected base classes.
The purpose of dynamic_cast is to deal with the case in which the correctness of the conversion
cannot be determined by the compiler. In that case,
    dynamic_cast<T*>(p)
looks at the object pointed to by p (if any). If that object is of class T or has a unique base class of
type T, then dynamic_cast returns a pointer of type T* to that object; otherwise, 0 is returned. If
the value of p is 0, dynamic_cast<T*>(p) returns 0. Note the requirement that the conversion
must be to a uniquely identified object. It is possible to construct examples where the conversion
fails and 0 is returned because the object pointed to by p has more than one sub-object representing
bases of type T (see §15.4.2).
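The uniqueness requirement can be demonstrated with a small hierarchy in which one base occurs twice. All class and function names below are hypothetical; Component is deliberately a non-virtual base of both Left and Right so that a Mixed object contains two Component sub-objects.

```cpp
#include <cassert>

class Component {
public:
    virtual ~Component() { }
};
class Left  : public Component { };
class Right : public Component { };
class Both  : public Left, public Right { };    // contains two Component sub-objects

class Other {
public:
    virtual ~Other() { }
};
class Mixed : public Both, public Other { };

// Left occurs exactly once in a Mixed, so this crosscast can succeed.
bool finds_left(Other* p)
{
    return dynamic_cast<Left*>(p) != 0;
}

// Component occurs twice in a Mixed, so the cast has no unique target and yields 0.
bool finds_component(Other* p)
{
    return dynamic_cast<Component*>(p) != 0;
}
```

Both functions also return false for a null argument, since dynamic_cast of a null pointer yields a null pointer.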
A dynamic_cast requires a pointer or a reference to a polymorphic type to do a downcast or a
crosscast. For example:
    class My_slider: public Ival_slider {    // polymorphic base (Ival_slider has virtual functions)
        // ...
    };
    class My_date : public Date {    // base not polymorphic (Date has no virtual functions)
        // ...
    };
    void g(Ival_box* pb, Date* pd)
    {
        My_slider* pd1 = dynamic_cast<My_slider*>(pb);    // ok
        My_date* pd2 = dynamic_cast<My_date*>(pd);    // error: Date not polymorphic
    }
Requiring the pointer’s type to be polymorphic simplifies the implementation of dynamic_cast
because it makes it easy to find a place to hold the necessary information about the object’s type. A
typical implementation will attach a ‘‘type information object’’ to an object by placing a pointer to
the type information in the object’s virtual function table (§2.5.5). For example:
[Figure: a My_box object contains a vptr that points to its vtbl; the vtbl refers to the type_info object for "My_box", whose list of bases refers in turn to the type_info object for "Ival_slider".]
The dashed arrow represents an offset that allows the start of the complete object to be found given
only a pointer to a polymorphic sub-object. It is clear that dynamic_cast can be efficiently implemented. All that is involved are a few comparisons of type_info objects representing base classes;
no expensive lookups or string comparisons are needed.
Restricting dynamic_cast to polymorphic types also makes sense from a logical point of view.
That is, if an object has no virtual functions, it cannot safely be manipulated without knowledge of
its exact type. Consequently, care should be taken not to get such an object into a context in which
its type isn’t known. If its type is known, we don’t need to use dynamic_cast.
The target type of dynamic_cast need not be polymorphic. This allows us to wrap a concrete
type in a polymorphic type, say for transmission through an object I/O system (see §25.4.1), and
then ‘‘unwrap’’ the concrete type later. For example:
class Io_obj { // base class for object I/O system
    virtual Io_obj* clone() = 0;
};
class Io_date : public Date, public Io_obj { };
void f(Io_obj* pio)
{
    Date* pd = dynamic_cast<Date*>(pio);
    // ...
}
A dynamic_cast to void* can be used to determine the address of the beginning of an object of
polymorphic type. For example:
void g(Ival_box* pb, Date* pd)
{
    void* pd1 = dynamic_cast<void*>(pb); // ok
    void* pd2 = dynamic_cast<void*>(pd); // error: Date not polymorphic
}
This is only useful for interaction with very low-level functions.
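To make the effect visible, here is a small sketch using two hypothetical polymorphic bases, Left and Right, and a class Both derived from both of them (none of these names come from the text). Given a pointer to the Right part of a Both object, dynamic_cast<void*> yields the address of the complete Both object, which on typical implementations differs from the address of the Right sub-object:

```cpp
// Hypothetical classes, introduced only for this illustration.
class Left {
public:
    Left() : l(0) {}
    virtual ~Left() {}
    int l;
};
class Right {
public:
    Right() : r(0) {}
    virtual ~Right() {}
    int r;
};
class Both : public Left, public Right {};

// Returns the address of the complete object of which *p is a part.
void* start_of_object(Right* p)
{
    return dynamic_cast<void*>(p);
}
```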
15.4.1.1 Dynamic_cast of References [hier.re.cast]
To get polymorphic behavior, an object must be manipulated through a pointer or a reference.
When a dynamic_cast is used for a pointer type, a 0 indicates failure. That is neither feasible nor
desirable for references.
Given a pointer result, we must consider the possibility that the result is 0; that is, that the
pointer doesn’t point to an object. Consequently, the result of a dynamic_cast of a pointer should
always be explicitly tested. For a pointer p, dynamic_cast<T*>(p) can be seen as the question,
‘‘Is the object pointed to by p of type T?’’
On the other hand, we may legitimately assume that a reference refers to an object. Consequently, dynamic_cast<T&>(r) of a reference r is not a question but an assertion: ‘‘The object
referred to by r is of type T.’’ The result of a dynamic_cast for a reference is implicitly tested by
the implementation of dynamic_cast itself. If the operand of a dynamic_cast to a reference isn’t of
the expected type, a bad_cast exception is thrown. For example:
void f(Ival_box* p, Ival_box& r)
{
    if (Ival_slider* is = dynamic_cast<Ival_slider*>(p)) { // does p point to an Ival_slider?
        // use ‘is’
    }
    else {
        // *p not a slider
    }
    Ival_slider& is = dynamic_cast<Ival_slider&>(r); // r references an Ival_slider!
    // use ‘is’
}
The difference in results of a failed dynamic pointer cast and a failed dynamic reference cast
reflects a fundamental difference between references and pointers. If a user wants to protect against
bad casts to references, a suitable handler must be provided. For example:
void g()
{
    try {
        f(new BB_ival_slider,*new BB_ival_slider); // arguments passed as Ival_boxs
        f(new BBdial,*new BBdial); // arguments passed as Ival_boxs
    }
    catch (bad_cast) { // §14.10
        // ...
    }
}
The first call to f() will return normally, while the second will cause a bad_cast exception that
will be caught by g().
Explicit tests against 0 can be – and therefore occasionally will be – accidentally omitted. If
that worries you, you can write a conversion function that throws an exception instead of returning 0
(§15.8[1]) in case of failure.
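One possible shape for such a conversion function is sketched below. The name ptr_cast is hypothetical and the Ival_box classes are simplified stand-ins; this is one way to do it under those assumptions, not the solution to the exercise. Note that this version also throws when given a zero pointer:

```cpp
#include <typeinfo>   // std::bad_cast

// Hypothetical helper: like dynamic_cast<T*>(p), but failure is reported
// by throwing std::bad_cast rather than by returning 0.
template<class T, class U>
T* ptr_cast(U* p)
{
    T* r = dynamic_cast<T*>(p);
    if (r == 0) throw std::bad_cast();   // also taken when p itself is 0
    return r;
}

// Simplified stand-ins for the classes used in the text.
class Ival_box {
public:
    virtual ~Ival_box() {}
};
class Ival_slider : public Ival_box {};
class Ival_dial : public Ival_box {};
```

A caller can then write ptr_cast<Ival_slider>(p) and use the result directly, leaving the failure case to a bad_cast handler.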
15.4.2 Navigating Class Hierarchies [hier.navigate]
When only single inheritance is used, a class and its base classes constitute a tree rooted in a single
base class. This is simple but often constraining. When multiple inheritance is used, there is no
single root. This in itself doesn’t complicate matters much. However, if a class appears more than
once in a hierarchy, we must be a bit careful when we refer to the object or objects that represent
that class.
Naturally, we try to keep hierarchies as simple as our application allows (and no simpler).
However, once a nontrivial hierarchy has been made we soon need to navigate it to find an appropriate class to use as an interface. This need occurs in two variants. That is, sometimes, we want to
explicitly name an object of a base class or a member of a base class; §15.2.3 and §15.2.4.1 are
examples of this. At other times, we want to get a pointer to the object representing a base or
derived class of an object given a pointer to a complete object or some sub-object; §15.4 and
§15.4.1 are examples of this.
Here, we consider how to navigate a class hierarchy using type conversions (casts) to gain a
pointer of the desired type. To illustrate the mechanisms available and the rules that guide them,
consider a lattice containing both a replicated base and a virtual base:
class Component : public virtual Storable { /* ... */ };
class Receiver : public Component { /* ... */ };
class Transmitter : public Component { /* ... */ };
class Radio : public Receiver, public Transmitter { /* ... */ };
Or graphically:
[Figure: class lattice. Radio is derived from Receiver and Transmitter; each of these is derived from its own Component; both Components share the single virtual base Storable.]
Here, a Radio object has two sub-objects of class Component. Consequently, a dynamic_cast
from Storable to Component within a Radio will be ambiguous and return a 0. There is simply no
way of knowing which Component the programmer wanted:
void h1(Radio& r)
{
    Storable* ps = &r;
    // ...
    Component* pc = dynamic_cast<Component*>(ps); // pc = 0
}
This ambiguity is not in general detectable at compile time:
void h2(Storable* ps) // ps might or might not point to a Component
{
    Component* pc = dynamic_cast<Component*>(ps);
    // ...
}
This kind of run-time ambiguity detection is needed only for virtual bases. For ordinary bases,
there is always a unique sub-object of a given cast (or none) when downcasting (that is, towards a
derived class; §15.4). The equivalent ambiguity occurs when upcasting (that is, towards a base;
§15.4) and such ambiguities are caught at compile time.
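The run-time ambiguity can be observed with a compilable version of the lattice. In this sketch the class bodies are reduced to the minimum (Storable gets a virtual destructor so that it is polymorphic), and to_component() and to_radio() are hypothetical helpers:

```cpp
// The lattice from the text, with bodies reduced to the minimum.
class Storable {
public:
    virtual ~Storable() {}
};
class Component : public virtual Storable {};
class Receiver : public Component {};
class Transmitter : public Component {};
class Radio : public Receiver, public Transmitter {};

// Yields 0 if *ps contains no Component, or more than one.
Component* to_component(Storable* ps)
{
    return dynamic_cast<Component*>(ps);
}

// There is only one Radio in a Radio, so this cast is unambiguous.
Radio* to_radio(Storable* ps)
{
    return dynamic_cast<Radio*>(ps);
}
```

For a Receiver used on its own, to_component() succeeds because that object contains exactly one Component; for a Radio it yields 0, while to_radio() still finds the complete object.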
15.4.2.1 Static and Dynamic Casts [hier.static.cast]
A dynamic_cast can cast from a polymorphic virtual base class to a derived class or a sibling class
(§15.4.1). A static_cast (§6.2.7) does not examine the object it casts from, so it cannot:
void g(Radio& r)
{
    Receiver* prec = &r; // Receiver is ordinary base of Radio
    Radio* pr = static_cast<Radio*>(prec); // ok, unchecked
    pr = dynamic_cast<Radio*>(prec); // ok, run-time checked
    Storable* ps = &r; // Storable is virtual base of Radio
    pr = static_cast<Radio*>(ps); // error: cannot cast from virtual base
    pr = dynamic_cast<Radio*>(ps); // ok, run-time checked
}
The dynamic_cast requires a polymorphic operand because there is no information stored in a nonpolymorphic object that can be used to find the objects for which it represents a base. In particular,
an object of a type with layout constraints determined by some other language – such as Fortran or
C – may be used as a virtual base class. For objects of such types, only static type information will
be available. However, the information needed to provide run-time type identification includes the
information needed to implement the dynamic_cast.
Why would anyone want to use a static_cast for class hierarchy navigation? There is a small
run-time cost associated with the use of a dynamic_cast (§15.4.1). More significantly, there are
millions of lines of code that were written before dynamic_cast became available. This code relies
on alternative ways of making sure that a cast is valid, so the checking done by dynamic_cast is
seen as redundant. However, such code is typically written using the C-style cast (§6.2.7); often
obscure errors remain. Where possible, use the safer dynamic_cast.
The compiler cannot assume anything about the memory pointed to by a void*. This implies
that dynamic_cast – which must look into an object to determine its type – cannot cast from a
void*. For that, a static_cast is needed. For example:
Radio* f(void* p)
{
    Storable* ps = static_cast<Storable*>(p); // trust the programmer
    return dynamic_cast<Radio*>(ps);
}
Both dynamic_cast and static_cast respect const and access controls. For example:
class Users : private set<Person> { /* ... */ };
void f(Users* pu, const Receiver* pcr)
{
    static_cast<set<Person>*>(pu); // error: access violation
    dynamic_cast<set<Person>*>(pu); // error: access violation
    static_cast<Receiver*>(pcr); // error: can’t cast away const
    dynamic_cast<Receiver*>(pcr); // error: can’t cast away const
    Receiver* pr = const_cast<Receiver*>(pcr); // ok
    // ...
}
It is not possible to cast to a private base class, and ‘‘casting away const’’ requires a const_cast
(§6.2.7). Even then, using the result is safe only provided the object wasn’t originally declared
const (§10.2.7.1).
15.4.3 Class Object Construction and Destruction [hier.class.obj]
A class object is more than simply a region of memory (§4.9.6). A class object is built from ‘‘raw
memory’’ by its constructors and it reverts to ‘‘raw memory’’ as its destructors are executed. Construction is bottom up, destruction is top down, and a class object is an object to the extent that it
has been constructed or destroyed. This is reflected in the rules for RTTI, exception handling
(§14.4.7), and virtual functions.
It is extremely unwise to rely on details of the order of construction and destruction, but that
order can be observed by calling virtual functions, dynamic_cast, or typeid (§15.4.4) at a point
where the object isn’t complete. For example, if the constructor for Component in the hierarchy
from §15.4.2 calls a virtual function, it will invoke a version defined for Storable or Component,
but not one from Receiver, Transmitter, or Radio. At that point of construction, the object isn’t
yet a Radio; it is merely a partially constructed object. It is best to avoid calling virtual functions
during construction and destruction.
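The effect can be demonstrated with a deliberately reduced hierarchy. In this sketch, Radio is derived directly from Component (the rest of the lattice is left out), and the member seen_in_ctor and the function name() are hypothetical, added only so that the constructor's view of the object can be inspected afterwards:

```cpp
#include <string>

class Component {
public:
    Component() { seen_in_ctor = name(); }    // virtual call during construction
    virtual ~Component() {}
    virtual std::string name() const { return "Component"; }
    std::string seen_in_ctor;                 // what the constructor observed
};

class Radio : public Component {
public:
    std::string name() const { return "Radio"; }   // overrides Component::name()
};
```

While Component's constructor runs, the object is not yet a Radio, so the call resolves to Component::name(); once construction is complete, the same call made through the finished object reaches Radio::name().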
15.4.4 Typeid and Extended Type Information [hier.typeid]
The dynamic_cast operator serves most needs for information about the type of an object at run
time. Importantly, it ensures that code written using it works correctly with classes derived from
those explicitly mentioned by the programmer. Thus, dynamic_cast preserves flexibility and
extensibility in a manner similar to virtual functions.
However, it is occasionally essential to know the exact type of an object. For example, we
might like to know the name of the object’s class or its layout. The typeid operator serves this purpose by yielding an object representing the type of its operand. Had typeid() been a function, its
declaration would have looked something like this:
class type_info;
const type_info& typeid(type_name) throw(); // pseudo declaration
const type_info& typeid(expression) throw(bad_typeid); // pseudo declaration
That is, typeid() returns a reference to a standard library type called type_info defined in <typeinfo>. Given a type-name as its operand, typeid() returns a reference to a type_info that represents the type-name. Given an expression as its operand, typeid() returns a reference to a
type_info that represents the type of the object denoted by the expression. A typeid() is most
commonly used to find the type of an object referred to by a reference or a pointer:
void f(Shape& r, Shape* p)
{
    typeid(r); // type of object referred to by r
    typeid(*p); // type of object pointed to by p
    typeid(p); // type of pointer, that is, Shape* (uncommon, except as a mistake)
}
If the value of a pointer or a reference operand is 0, typeid() throws a bad_typeid exception.
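A short sketch shows both outcomes for the pointer case. The Shape and Circle classes here are minimal stand-ins, and is_exactly_circle() is a hypothetical helper:

```cpp
#include <typeinfo>   // std::type_info, std::bad_typeid

// Minimal stand-ins for a polymorphic hierarchy.
class Shape {
public:
    virtual ~Shape() {}
};
class Circle : public Shape {};

// True if p points to an object whose exact type is Circle.
// Throws std::bad_typeid if p is 0.
bool is_exactly_circle(Shape* p)
{
    return typeid(*p) == typeid(Circle);
}
```

Unlike a dynamic_cast to Circle*, this test fails for objects of classes derived from Circle, because typeid reports the exact type.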
The implementation-independent part of type_info looks like this:
class type_info {
public:
    virtual ~type_info(); // is polymorphic
    bool operator==(const type_info&) const; // can be compared
    bool operator!=(const type_info&) const;
    bool before(const type_info&) const; // ordering
    const char* name() const; // name of type
private:
    type_info(const type_info&); // prevent copying
    type_info& operator=(const type_info&); // prevent copying
    // ...
};
The before() function allows type_infos to be sorted. There is no relation between the relationships defined by before and inheritance relationships.
It is not guaranteed that there is only one type_info object for each type in the system. In fact,
where dynamically linked libraries are used it can be hard for an implementation to avoid duplicate
type_info objects. Consequently, we should use == on type_info objects to test equality, rather
than == on pointers to such objects.
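In code, the recommended comparison looks like the sketch below. Shape, Circle, and Square are minimal stand-ins and same_type() is a hypothetical helper; the point is that the type_info objects themselves are compared, never their addresses:

```cpp
#include <typeinfo>

// Minimal stand-ins for a polymorphic hierarchy.
class Shape {
public:
    virtual ~Shape() {}
};
class Circle : public Shape {};
class Square : public Shape {};

// True if a and b have the same exact (most derived) type.
bool same_type(const Shape& a, const Shape& b)
{
    return typeid(a) == typeid(b);   // not: &typeid(a) == &typeid(b)
}
```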
We sometimes want to know the exact type of an object so as to perform some standard service
on the whole object (and not just on some base of the object). Ideally, such services are presented
as virtual functions so that the exact type needn’t be known. In some cases, no common interface
can be assumed for every object manipulated, so the detour through the exact type becomes necessary (§15.4.4.1). Another, much simpler, use has been to obtain the name of a class for diagnostic
output:
#include<typeinfo>
void g(Component* p)
{
    cout << typeid(*p).name();
}
The character representation of a class’ name is implementation-defined. This C-style string
resides in memory owned by the system, so the programmer should not attempt to delete[] it.
15.4.4.1 Extended Type Information [hier.extended]
Typically, finding the exact type of an object is simply the first step to acquiring and using more-detailed information about that type.
Consider how an implementation or a tool could make information about types available to
users at run time. Suppose I have a tool that generates descriptions of object layouts for each class
used. I can put these descriptors into a map to allow user code to find the layout information:
map<const char*, Layout> layout_table;
void f(B* p)
{
    Layout& x = layout_table[typeid(*p).name()];
    // use x
}
Someone else might provide a completely different kind of information:
struct TI_eq {
    bool operator()(const type_info* p, const type_info* q) { return *p==*q; }
};
struct TI_hash {
    int operator()(const type_info* p); // compute hash value (§17.6.2.2)
};
hash_map<type_info*,Icon,hash_fct,TI_hash,TI_eq> icon_table; // §17.6
void g(B* p)
{
    Icon& i = icon_table[&typeid(*p)];
    // use i
}
This way of associating typeids with information allows several people or tools to associate different information with types totally independently of each other:
[Figure: layout_table maps the name "T" to the object layout of type T, and icon_table maps &typeid(T) to the icon representation of type T.]
This is most important because the likelihood is zero that someone can come up with a single set of
information that satisfies every user.
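With a more recent standard library, the same idea can be sketched without a hand-written hash or equality function. The version below swaps in std::type_index from <typeindex> (a C++11 facility, so not the technique used in this section) as the key of an ordinary std::map. The classes, the table contents, and the functions register_icons() and icon_for() are all hypothetical:

```cpp
#include <map>
#include <string>
#include <typeindex>   // std::type_index (C++11)
#include <typeinfo>

// Minimal stand-ins for a polymorphic hierarchy.
class Shape {
public:
    virtual ~Shape() {}
};
class Circle : public Shape {};
class Square : public Shape {};

// Hypothetical table associating extra information (here, a file name) with a type.
std::map<std::type_index, std::string> icon_table;

void register_icons()
{
    icon_table[std::type_index(typeid(Circle))] = "circle.png";
    icon_table[std::type_index(typeid(Square))] = "square.png";
}

// Looks up the entry for the exact type of s; yields "" if none was registered.
std::string icon_for(const Shape& s)
{
    std::map<std::type_index, std::string>::const_iterator i =
        icon_table.find(std::type_index(typeid(s)));
    return i == icon_table.end() ? std::string() : i->second;
}
```

A second tool could keep its own table keyed in the same way, for example one holding layouts, without either table knowing about the other.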
15.4.5 Uses and Misuses of RTTI [hier.misuse]
One should use explicit run-time type information only when necessary. Static (compile-time)
checking is safer, implies less overhead, and – where applicable – leads to better-structured programs. For example, RTTI can be used to write thinly disguised switch-statements:
// misuse of run-time type information:
void rotate(const Shape& r)
{
    if (typeid(r) == typeid(Circle)) {
        // do nothing
    }
    else if (typeid(r) == typeid(Triangle)) {
        // rotate triangle
    }
    else if (typeid(r) == typeid(Square)) {
        // rotate square
    }
    // ...
}
Using dynamic_cast rather than typeid would improve this code only marginally.
Unfortunately, this is not a strawman example; such code really does get written. For many
people trained in languages such as C, Pascal, Modula-2, and Ada, there is an almost irresistible
urge to organize software as a set of switch-statements. This urge should usually be resisted. Use
virtual functions (§2.5.5, §12.2.6) rather than RTTI to handle most cases when run-time discrimination based on type is needed.
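For comparison, here is a sketch of the same discrimination expressed with a virtual function. The member names rotate() and angle(), and the representation of a Triangle by a single angle, are assumptions made to keep the example small:

```cpp
class Shape {
public:
    virtual ~Shape() {}
    virtual void rotate(int degrees) = 0;   // each shape knows how to rotate itself
    virtual int angle() const = 0;
};

class Circle : public Shape {
public:
    void rotate(int) {}                     // rotating a circle changes nothing
    int angle() const { return 0; }
};

class Triangle : public Shape {
public:
    Triangle() : a(0) {}
    void rotate(int degrees) { a += degrees; }
    int angle() const { return a; }
private:
    int a;                                  // current orientation in degrees
};

// No type tests: the call is dispatched on the exact type of s.
void rotate(Shape& s, int degrees)
{
    s.rotate(degrees);
}
```

Adding a new kind of Shape requires a new class with its own rotate(), but no change to the code that calls rotate().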
Many examples of proper use of RTTI arise when some service code is expressed in terms of
one class and a user wants to add functionality through derivation. The use of IIvvaall__bbooxx in §15.4 is
an example of this. If the user is willing and able to modify the definitions of the library classes,
say BBwindow, then the use of RTTI can be avoided; otherwise, it is needed. Even if the user is
willing to modify the base classes, such modification may cause its own problems. For example, it
may be necessary to introduce dummy implementations of virtual functions in classes for which
those functions are not needed or not meaningful. This problem is discussed in some detail in
§24.4.3. A use of RTTI to implement a simple object I/O system can be found in §25.4.1.
For people with a background in languages that rely heavily on dynamic type checking, such as
Smalltalk or Lisp, it is tempting to use RTTI in conjunction with overly general types. Consider:
// misuse of run-time type information:
class Object { /* ... */ }; // polymorphic
class Container : public Object {
public:
    void put(Object*);
    Object* get();
    // ...
};
class Ship : public Object { /* ... */ };
Ship* f(Ship* ps, Container* c)
{
    c->put(ps);
    // ...
    Object* p = c->get();
    if (Ship* q = dynamic_cast<Ship*>(p)) { // run-time check
        return q;
    }
    else {
        // do something else (typically, error handling)
    }
}
Here, class Object is an unnecessary implementation artifact. It is overly general because it does
not correspond to an abstraction in the application domain and forces the application programmer
to use an implementation-level abstraction. Problems of this kind are often better solved by using
container templates that hold only a single kind of pointer:
Ship* f(Ship* ps, list<Ship*>& c)
{
    c.push_front(ps);
    // ...
    Ship* q = c.front(); // pop_front() returns nothing, so read the element first
    c.pop_front();
    return q;
}
Combined with the use of virtual functions, this technique handles most cases.
15.5 Pointers to Members [hier.ptom]
Many classes provide simple, very general interfaces intended to be invoked in several different
ways. For example, many ‘‘object-oriented’’ user-interfaces define a set of requests to which every
object represented on the screen should be prepared to respond. In addition, such requests can be
presented directly or indirectly from programs. Consider a simple variant of this idea:
class Std_interface {
public:
    virtual void start() = 0;
    virtual void suspend() = 0;
    virtual void resume() = 0;
    virtual void quit() = 0;
    virtual void full_size() = 0;
    virtual void small() = 0;
    virtual ~Std_interface() {}
};
The exact meaning of each operation is defined by the object on which it is invoked. Often, there is
a layer of software between the person or program issuing the request and the object receiving it.
Ideally, such intermediate layers of software should not have to know anything about the individual
operations such as resume() and full_size(). If they did, the intermediate layers would have to
be updated each time the set of operations changed. Consequently, such intermediate layers simply
transmit some data representing the operation to be invoked from the source of the request to its
recipient.
One simple way of doing that is to send a string representing the operation to be invoked. For
example, to invoke suspend() we could send the string "suspend". However, someone has to create that string and someone has to decode it to determine to which operation it corresponds – if
any. Often, that seems indirect and tedious. Instead, we might simply send an integer representing
the operation. For example, 2 might be used to mean suspend(). However, while an integer may
be convenient for machines to deal with, it can get pretty obscure for people. We still have to write
code to determine that 2 means suspend() and to invoke suspend().
C++ offers a facility for indirectly referring to a member of a class. A pointer to a member is a
value that identifies a member of a class. You can think of it as the position of the member in an
object of the class, but of course an implementation takes into account the differences between data
members, virtual functions, non-virtual functions, etc.
Consider Std_interface. If I want to invoke suspend() for some object without mentioning
suspend() directly, I need a pointer to member referring to Std_interface::suspend(). I also
need a pointer or reference to the object I want to suspend. Consider a trivial example:
typedef void (Std_interface::* Pstd_mem)(); // pointer to member type
void f(Std_interface* p)
{
    Pstd_mem s = &Std_interface::suspend;
    p->suspend(); // direct call
    (p->*s)(); // call through pointer to member
}
A pointer to member can be obtained by applying the address-of operator & to a fully qualified
class member name, for example, &Std_interface::suspend. A variable of type ‘‘pointer to member of class X’’ is declared using a declarator of the form X::*.
The use of typedef to compensate for the lack of readability of the C declarator syntax is typical. However, please note how the X::* declarator matches the traditional * declarator exactly.
A pointer to member m can be used in combination with an object. The operators ->* and .*
allow the programmer to express such combinations. For example, p->*m binds m to the object
pointed to by p, and obj.*m binds m to the object obj. The result can be used in accordance with
m’s type. It is not possible to store the result of a ->* or a .* operation for later use.
Naturally, if we knew which member we wanted to call we would invoke it directly rather than
mess with pointers to members. Just like ordinary pointers to functions, pointers to member functions are used when we need to refer to a function without having to know its name. However, a
pointer to member isn’t a pointer to a piece of memory the way a pointer to a variable or a pointer
to a function is. It is more like an offset into a structure or an index into an array. When a pointer
to member is combined with a pointer to an object of the right type, it yields something that identifies a particular member of a particular object.
This can be represented graphically like this:
[Figure: p points to an object; s selects an entry in that object’s vtbl, which holds X::start and X::suspend.]
Because a pointer to a virtual member (s in this example) is a kind of offset, it does not depend on
an object’s location in memory. A pointer to a virtual member can therefore safely be passed
between different address spaces as long as the same object layout is used in both. Like pointers to
ordinary functions, pointers to non-virtual member functions cannot be exchanged between address
spaces.
Note that the function invoked through the pointer to member can be virtual. For example,
when we call suspend() through a pointer to member function, we get the right suspend() for the object to
which the pointer is applied. This is an essential aspect of pointers to member functions.
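The following sketch puts the pieces together. Std_interface is cut down to two operations, and the class Text with its member last, as well as the function invoke(), are hypothetical additions that make the effect of each call observable:

```cpp
#include <string>

class Std_interface {
public:
    virtual void start() = 0;
    virtual void suspend() = 0;
    virtual ~Std_interface() {}
};

typedef void (Std_interface::* Pstd_mem)();   // pointer to member type

// Hypothetical implementation class that records the last operation invoked.
class Text : public Std_interface {
public:
    void start() { last = "start"; }
    void suspend() { last = "suspend"; }
    std::string last;
};

// Invoke operation op on *p without naming the operation.
void invoke(Std_interface* p, Pstd_mem op)
{
    (p->*op)();   // virtual dispatch selects the override for the exact type of *p
}
```

Although op is declared in terms of Std_interface, a call such as invoke(&t, &Std_interface::suspend) for a Text object t reaches Text::suspend().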
An interpreter might use pointers to members to invoke functions presented as strings:
map<string,Std_interface*> variable;
map<string,Pstd_mem> operation;
void call_member(string var, string oper)
{
    (variable[var]->*operation[oper])(); // var.oper()
}
A critical use of pointers to member functions is found in mem_fun() (§3.8.5, §18.4).
A static member isn’t associated with a particular object, so a pointer to a static member is simply an ordinary pointer. For example:
class Task {
    // ...
    static void schedule();
};
void (*p)() = &Task::schedule; // ok
void (Task::* pm)() = &Task::schedule; // error: ordinary pointer assigned
                                       // to pointer to member
Pointers to data members are described in §C.12.
15.5.1 Base and Derived Classes [hier.contravariance]
A derived class has at least the members that it inherits from its base classes. Often it has more.
This implies that we can safely assign a pointer to a member of a base class to a pointer to a member of a derived class, but not the other way around. This property is often called contravariance.
For example:
class text : public Std_interface {
public:
    void start();
    void suspend();
    // ...
    virtual void print();
private:
    vector s;
};
void (Std_interface::* pmi)() = &text::print; // error
void (text::*pmt)() = &Std_interface::start; // ok
This contravariance rule appears to be the opposite of the rule that says we can assign a pointer to a
derived class to a pointer to its base class. In fact, both rules exist to preserve the fundamental
guarantee that a pointer may never point to an object that doesn’t at least have the properties that
the pointer promises. In this case, Std_interface::* can be applied to any Std_interface, and most
such objects presumably are not of type text. Consequently, they do not have the member
text::print with which we tried to initialize pmi. By refusing the initialization, the compiler saves
us from a run-time error.
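A compilable sketch of the rule follows. Std_interface is reduced to one operation; the class is spelled Text here, and the flags started and printed, the typedef Ptext_mem, and the function call() are hypothetical additions that let the effect be checked:

```cpp
class Std_interface {
public:
    virtual void start() = 0;
    virtual ~Std_interface() {}
};

class Text : public Std_interface {
public:
    Text() : started(false), printed(false) {}
    void start() { started = true; }
    virtual void print() { printed = true; }
    bool started;
    bool printed;
};

typedef void (Text::* Ptext_mem)();   // pointer to member of the derived class

// void (Std_interface::* pmi)() = &Text::print;   // error: not every Std_interface has print()
Ptext_mem pmt = &Std_interface::start;             // ok: every Text has start()

// Apply operation op to t.
void call(Text& t, Ptext_mem op)
{
    (t.*op)();
}
```

The conversion goes from base to derived because a pointer to member of Text is only ever applied to Text objects, each of which has every member of Std_interface.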
15.6 Free Store [hier.free]
It is possible to take over memory management for a class by defining operator new() and operator delete() (§6.2.6.2). However, replacing the global operator new() and operator delete() is
not for the fainthearted. After all, someone else might rely on some aspect of the default behavior
or might even have supplied other versions of these functions.
A more selective, and often better, approach is to supply these operations for a specific class.
This class might be the base for many derived classes. For example, we might like to have the
Employee class from §12.2.6 provide a specialized allocator and deallocator for itself and all of its
derived classes:
class Employee {
    // ...
public:
    // ...
    void* operator new(size_t);
    void operator delete(void*, size_t);
};
Member operator new()s and operator delete()s are implicitly static members. Consequently,
they don’t have a this pointer and do not modify an object. They provide storage that a constructor
can initialize and a destructor can clean up.
    void* Employee::operator new(size_t s)
    {
        // allocate ‘s’ bytes of memory and return a pointer to it
    }
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
    void Employee::operator delete(void* p, size_t s)
    {
        // assume ‘p’ points to ‘s’ bytes of memory allocated by Employee::operator new()
        // and free that memory for reuse
    }
The use of the hitherto mysterious size_t argument now becomes obvious. It is the size of the
object actually deleted. Deleting a ‘‘plain’’ Employee gives an argument value of
sizeof(Employee); deleting a Manager gives an argument value of sizeof(Manager). This
allows a class-specific allocator to avoid storing size information with each allocation. Naturally, a
class-specific allocator can store such information (as a general-purpose allocator must) and
ignore the size_t argument to operator delete(). However, that makes it harder to improve
significantly on the speed and memory consumption of a general-purpose allocator.
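To make the role of the size argument concrete, here is a minimal sketch of a class-specific allocator that recycles blocks through a free list and stores no per-allocation size. The Free_block type, the free_list member, the cached_blocks() helper, and the data members of Employee are all assumptions made for illustration; this is one possible scheme, not the allocator of any particular implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class Employee {
public:
    Employee() : id(0), salary(0) {}

    void* operator new(std::size_t s);
    void operator delete(void* p, std::size_t s);

    static int cached_blocks();          // hypothetical helper, for inspection only

private:
    struct Free_block { Free_block* next; };
    static Free_block* free_list;        // recycled blocks of exactly sizeof(Employee) bytes

    long id;
    double salary;
};

// A recycled block must be able to hold the link to the next free block.
static_assert(sizeof(Employee) >= sizeof(void*), "Employee too small to hold a free-list link");

Employee::Free_block* Employee::free_list = 0;

void* Employee::operator new(std::size_t s)
{
    // Reuse a cached block only when the request is for a plain Employee.
    if (s == sizeof(Employee) && free_list) {
        Free_block* b = free_list;
        free_list = b->next;
        return b;
    }
    return ::operator new(s);            // otherwise use the general-purpose allocator
}

void Employee::operator delete(void* p, std::size_t s)
{
    if (p == 0) return;
    if (s == sizeof(Employee)) {         // the size argument tells us the block fits the pool
        Free_block* b = static_cast<Free_block*>(p);
        b->next = free_list;
        free_list = b;
        return;
    }
    ::operator delete(p);                // blocks of other sizes go back to the general allocator
}

int Employee::cached_blocks()
{
    int n = 0;
    for (Free_block* b = free_list; b; b = b->next) ++n;
    return n;
}
```

After new Employee followed by delete, cached_blocks() reports one block, and the next new Employee takes it back out of the list. Because the decision is made from the size_t argument alone, no header needs to be stored in front of each object.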
How does a compiler know how to supply the right size to operator delete()? As long as the
type specified in the delete operation matches the actual type of the object, this is easy. However,
that is not always the case:
    class Manager : public Employee {
        int level;
        // ...
    };

    void f()
    {
        Employee* p = new Manager;
        delete p;    // trouble (the exact type is lost)
    }
In this case, the compiler will not get the size right. As when an array is deleted, the user must help.
This is done by adding a virtual destructor to the base class, Employee:
    class Employee {
    public:
        void* operator new(size_t);
        void operator delete(void*, size_t);
        virtual ~Employee();
        // ...
    };
Even an empty destructor will do:
    Employee::~Employee() { }
In principle, deallocation is then done from within the destructor (which knows the size).
Furthermore, the presence of a destructor in Employee ensures that every class derived from it
will be supplied with a destructor (thus getting the size right), even if the derived class doesn’t
have a user-defined destructor. For example:
    void f()
    {
        Employee* p = new Manager;
        delete p;    // now fine (Employee is polymorphic)
    }
Allocation is done by a (compiler-generated) call:
    Employee::operator new(sizeof(Manager))
and deallocation by a (compiler-generated) call:
    Employee::operator delete(p,sizeof(Manager))
In other words, if you want to supply an allocator/deallocator pair that works correctly for derived
classes, you must either supply a virtual destructor in the base class or refrain from using the size_t
argument in the deallocator. Naturally, the language could have been designed to save you from
such concerns. However, that can be done only by also ‘‘saving’’ you from the benefits of the optimizations possible in the less safe system.
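The effect of the virtual destructor can be observed directly. The following sketch records the sizes passed to the allocator and deallocator; the two bookkeeping variables, the data members, and the function size_passed_when_deleting_manager_via_base() are assumptions introduced only for the demonstration:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

static std::size_t last_new_size = 0;       // hypothetical bookkeeping, for observation only
static std::size_t last_delete_size = 0;

class Employee {
public:
    virtual ~Employee() {}                  // virtual: the dynamic type decides the size

    void* operator new(std::size_t s)
    {
        last_new_size = s;
        return ::operator new(s);
    }
    void operator delete(void* p, std::size_t s)
    {
        last_delete_size = s;
        ::operator delete(p);
    }
private:
    int department;
};

class Manager : public Employee {
    char title[64];                         // makes a Manager clearly larger than an Employee
    int level;
};

// Allocate a Manager, delete it through an Employee*, and report the size
// that was handed to Employee::operator delete().
std::size_t size_passed_when_deleting_manager_via_base()
{
    Employee* p = new Manager;
    delete p;                               // fine: Employee has a virtual destructor
    return last_delete_size;
}
```

With the virtual destructor in place, the function returns sizeof(Manager), matching the size seen by operator new(). Removing the word virtual would make the delete undefined, so the sketch does not attempt to show that case.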
15.6.1 Array Allocation [hier.array]
The operator new() and operator delete() functions allow a user to take over allocation and
deallocation of individual objects; operator new[]() and operator delete[]() serve exactly the
same role for the allocation and deallocation of arrays. For example:
    class Employee {
    public:
        void* operator new[](size_t);
        void operator delete[](void*, size_t);
        // ...
    };

    void f(int s)
    {
        Employee* p = new Employee[s];
        // ...
        delete[] p;
    }
Here, the memory needed will be obtained by a call,
    Employee::operator new[](sizeof(Employee)*s+delta)
where delta is some minimal implementation-defined overhead, and released by a call:
    Employee::operator delete[](p,s*sizeof(Employee)+delta)
The number of elements (s) is ‘‘remembered’’ by the system.
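The overhead delta can be measured rather than guessed. In this sketch, the bookkeeping variables, the id member, and the function overhead_for() are assumptions added for illustration; the value reported depends on the implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

static std::size_t array_new_request = 0;   // hypothetical bookkeeping, for observation only
static std::size_t array_delete_size = 0;

class Employee {
public:
    Employee() : id(0) {}
    ~Employee() {}                           // nontrivial destructor: the element count must be kept

    void* operator new[](std::size_t s)
    {
        array_new_request = s;
        return ::operator new[](s);
    }
    void operator delete[](void* p, std::size_t s)
    {
        array_delete_size = s;
        ::operator delete[](p);
    }
private:
    long id;
};

// Allocate and release an array of n Employees and return the extra bytes
// (the delta of the text) that the implementation asked for.
std::size_t overhead_for(int n)
{
    Employee* p = new Employee[n];
    delete[] p;
    return array_new_request - n * sizeof(Employee);
}
```

The request seen by operator new[]() is never smaller than n*sizeof(Employee), and the same total is handed back to operator delete[](), which is how the system ‘‘remembers’’ the number of elements.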
15.6.2 “Virtual Constructors” [hier.vctor]
After hearing about virtual destructors, the obvious question is, ‘‘Can constructors be virtual?’’
The short answer is no; a slightly longer one is, no, but you can easily get the effect you are looking
for.
To construct an object, a constructor needs the exact type of the object it is to create. Consequently, a constructor cannot be virtual. Furthermore, a constructor is not quite an ordinary function. In particular, it interacts with memory management routines in ways ordinary member functions don’t. Consequently, you cannot have a pointer to a constructor.
Both of these restrictions can be circumvented by defining a function that calls a constructor
and returns a constructed object. This is fortunate because creating a new object without knowing
its exact type is often useful. The Ival_box_maker (§12.4.4) is an example of a class designed
specifically to do that. Here, I present a different variant of that idea, where objects of a class can
provide users with a clone (copy) of themselves or a new object of their type. Consider:
    class Expr {
    public:
        Expr();                 // default constructor
        Expr(const Expr&);      // copy constructor

        virtual Expr* new_expr() { return new Expr(); }
        virtual Expr* clone() { return new Expr(*this); }
        // ...
    };
Because functions such as new_expr() and clone() are virtual and they (indirectly) construct
objects, they are often called ‘‘virtual constructors’’ – by a strange misuse of the English language.
Each simply uses a constructor to create a suitable object.
A derived class can override new_expr() and/or clone() to return an object of its own type:
    class Cond : public Expr {
    public:
        Cond();
        Cond(const Cond&);

        Cond* new_expr() { return new Cond(); }
        Cond* clone() { return new Cond(*this); }
        // ...
    };
This means that given an object of class Expr, a user can create a new object of ‘‘just the same
type.’’ For example:
    void user(Expr* p)
    {
        Expr* p2 = p->new_expr();
        // ...
    }
The pointer assigned to p2 is of an appropriate, but unknown, type.
The return type of Cond::new_expr() and Cond::clone() was Cond* rather than Expr*.
This allows a Cond to be cloned without loss of type information. For example:
    void user2(Cond* pc, Expr* pe)
    {
        Cond* p2 = pc->clone();
        Cond* p3 = pe->clone();    // error
        // ...
    }
The type of an overriding function must be the same as the type of the virtual function it overrides,
except that the return type may be relaxed. That is, if the original return type was B*, then the
return type of the overriding function may be D*, provided B is a public base of D. Similarly, a
return type of B& may be relaxed to D&.
Note that a similar relaxation of the rules for argument types would lead to type violations (see
§15.8 [12]).
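A complete, compilable version of the Expr and Cond classes shows both effects of the relaxed return type. The value member, the virtual destructor, and the function clone_keeps_dynamic_type() are assumptions added so that the result can be checked:

```cpp
#include <cassert>
#include <typeinfo>

class Expr {
public:
    Expr() {}
    Expr(const Expr&) {}
    virtual ~Expr() {}

    virtual Expr* new_expr() { return new Expr(); }
    virtual Expr* clone() { return new Expr(*this); }
};

class Cond : public Expr {
public:
    Cond() : value(0) {}
    Cond(const Cond& c) : Expr(c), value(c.value) {}

    Cond* new_expr() { return new Cond(); }       // return type relaxed from Expr* to Cond*
    Cond* clone() { return new Cond(*this); }

    int value;                                    // hypothetical state, to make copies visible
};

bool clone_keeps_dynamic_type()
{
    Cond c;
    c.value = 42;

    Expr* pe = &c;
    Expr* copy = pe->clone();                     // static type Expr*, dynamic type Cond
    bool same_type = typeid(*copy) == typeid(Cond);

    Cond* pc = c.clone();                         // no cast needed: the return type is Cond*
    bool ok = same_type && pc->value == 42;

    delete copy;
    delete pc;
    return ok;
}
```

Through an Expr*, clone() still yields an Expr*, but the object it points to is a Cond; through a Cond*, the caller gets a Cond* back directly.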
15.7 Advice [hier.advice]
[1] Use ordinary multiple inheritance to express a union of features; §15.2, §15.2.5.
[2] Use multiple inheritance to separate implementation details from an interface; §15.2.5.
[3] Use a virtual base to represent something common to some, but not all, classes in a hierarchy;
§15.2.5.
[4] Avoid explicit type conversion (casts); §15.4.5.
[5] Use dynamic_cast where class hierarchy navigation is unavoidable; §15.4.1.
[6] Prefer dynamic_cast over typeid; §15.4.4.
[7] Prefer private to protected; §15.3.1.1.
[8] Don’t declare data members protected; §15.3.1.1.
[9] If a class defines operator delete(), it should have a virtual destructor; §15.6.
[10] Don’t call virtual functions during construction or destruction; §15.4.3.
[11] Use explicit qualification for resolution of member names sparingly and preferably use it in
overriding functions; §15.2.1.
15.8 Exercises [hier.exercises]
1. (∗1) Write a template ptr_cast that works like dynamic_cast, except that it throws bad_cast
rather than returning 0.
2. (∗2) Write a program that illustrates the sequence of constructor calls and the state of an object
relative to RTTI during construction. Similarly illustrate destruction.
3. (∗3.5) Implement a version of a Reversi/Othello board game. Each player can be either a
human or the computer. Focus on getting the program correct and (then) getting the computer
player ‘‘smart’’ enough to be worth playing against.
4. (∗3) Improve the user interface of the game from §15.8[3].
5. (∗3) Define a graphical object class with a plausible set of operations to serve as a common base
class for a library of graphical objects; look at a graphics library to see what operations were
supplied there. Define a database object class with a plausible set of operations to serve as a
common base class for objects stored as sequences of fields in a database; look at a database
library to see what operations were supplied there. Define a graphical database object with and
without the use of multiple inheritance and discuss the relative merits of the two solutions.
6. (∗2) Write a version of the clone() operation from §15.6.2 that can place its cloned object in
an Arena (see §10.4.11) passed as an argument. Implement a simple Arena as a class derived
from Arena.
7. (∗2) Without looking in the book, write down as many C++ keywords as you can.
8. (∗2) Write a standards-conforming C++ program containing a sequence of at least ten consecutive keywords not separated by identifiers, operators, punctuation characters, etc.
9. (∗2.5) Draw a plausible memory layout for a Radio as defined in §15.2.3.1. Explain how a virtual function call could be implemented.
10. (∗2) Draw a plausible memory layout for a Radio as defined in §15.2.4. Explain how a virtual
function call could be implemented.
11. (∗3) Consider how dynamic_cast might be implemented. Define and implement a dcast template that behaves like dynamic_cast but relies on functions and data you define only. Make
sure that you can add new classes to the system without having to change the definitions of
dcast or previously-written classes.
12. (∗2) Assume that the type-checking rules for arguments were relaxed in a way similar to the
relaxation for return types so that a function taking a Derived* could override a function taking
a Base*. Then write a program that would corrupt an object of class Derived without using a
cast. Describe a safe relaxation of the overriding rules for argument types.
Part III
The Standard Library
This part describes the C++ standard library. It presents the design of the library and
key techniques used in its implementation. The aim is to provide understanding of
how to use the library, to demonstrate generally useful design and programming techniques, and to show how to extend the library in the ways in which it was intended to
be extended.
Chapters
16 Library Organization and Containers
17 Standard Containers
18 Algorithms and Function Objects
19 Iterators and Allocators
20 Strings
21 Streams
22 Numerics
16
Library Organization and Containers
It was new. It was singular.
It was simple. It must succeed!
– H. Nelson
Design criteria for the standard library — library organization — standard headers —
language support — container design — iterators — based containers — STL containers
— vector — iterators — element access — constructors — modifiers — list operations
— size and capacity — vector<bool> — advice — exercises.
16.1 Standard Library Design [org.intro]
What ought to be in the standard C++ library? One ideal is for a programmer to be able to find
every interesting, significant, and reasonably general class, function, template, etc., in a library.
However, the question here is not, ‘‘What ought to be in some library?’’ but ‘‘What ought to be in
the standard library?’’ The answer ‘‘Everything!’’ is a reasonable first approximation to an answer
to the former question but not to the latter. A standard library is something that every implementer
must supply so that every programmer can rely on it.
The C++ standard library:
[1] Provides support for language features, such as memory management (§6.2.6) and run-time
type information (§15.4).
[2] Supplies information about implementation-defined aspects of the language, such as the
largest float value (§22.2).
[3] Supplies functions that cannot be implemented optimally in the language itself for every
system, such as sqrt() (§22.3) and memmove() (§19.4.6).
[4] Supplies nonprimitive facilities that a programmer can rely on for portability, such as lists
(§17.2.2), maps (§17.4.1), sort functions (§18.7.1), and I/O streams (Chapter 21).
[5] Provides a framework for extending the facilities it provides, such as conventions and
support facilities that allow a user to provide I/O of a user-defined type in the style of I/O
for built-in types.
[6] Provides the common foundation for other libraries.
In addition, a few facilities – such as random-number generators (§22.7) – are provided by the
standard library simply because it is conventional and useful to do so.
The design of the library is primarily determined by the last three roles. These roles are closely
related. For example, portability is commonly an important design criterion for a specialized
library, and common container types such as lists and maps are essential for convenient communication between separately developed libraries.
The last role is especially important from a design perspective because it helps limit the scope
of the standard library and places constraints on its facilities. For example, string and list facilities
are provided in the standard library. If they were not, separately developed libraries could communicate only by using built-in types. However, pattern matching and graphics facilities are not provided. Such facilities are obviously widely useful, but they are rarely directly involved in communication between separately developed libraries.
Unless a facility is somehow needed to support these roles, it can be left to some library outside
the standard. For good and bad, leaving something out of the standard library opens the opportunity for different libraries to offer competing realizations of an idea.
16.1.1 Design Constraints [org.constraints]
The roles of a standard library impose several constraints on its design. The facilities offered by
the C++ standard library are designed to be:
[1] Invaluable and affordable to essentially every student and professional programmer,
including the builders of other libraries.
[2] Used directly or indirectly by every programmer for everything within the scope of the
library.
[3] Efficient enough to provide genuine alternatives to hand-coded functions, classes, and templates in the implementation of further libraries.
[4] Either policy-free or give the user the option to supply policies as arguments.
[5] Primitive in the mathematical sense. That is, a component that serves two weakly related
roles will almost certainly suffer overheads compared to individual components designed
to perform only a single role.
[6] Convenient, efficient, and reasonably safe for common uses.
[7] Complete at what they do. The standard library may leave major functions to other
libraries, but if it takes on a task, it must provide enough functionality so that individual
users or implementers need not replace it to get the basic job done.
[8] Blend well with and augment built-in types and operations.
[9] Type safe by default.
[10] Supportive of commonly accepted programming styles.
[11] Extensible to deal with user-defined types in ways similar to the way built-in types and
standard-library types are handled.
For example, building the comparison criteria into a sort function is unacceptable because the same
data can be sorted according to different criteria. This is why the C standard library qsort() takes
a comparison function as an argument rather than relying on something fixed, say, the < operator
(§7.7). On the other hand, the overhead imposed by a function call for each comparison
compromises qsort() as a building block for further library building. For almost every data type, it is
easy to do a comparison without imposing the overhead of a function call.
Is that overhead serious? In most cases, probably not. However, the function call overhead can
dominate the execution time for some algorithms and cause users to seek alternatives. The technique of supplying comparison criteria through a template argument described in §13.4 solves that
problem. The example illustrates the tension between efficiency and generality. A standard library
is not just required to perform its tasks. It must also perform them efficiently enough not to tempt
users to supply their own mechanisms. Otherwise, implementers of more advanced features are
forced to bypass the standard library in order to remain competitive. This would add a burden to
the library developer and seriously complicate the lives of users wanting to stay platform-independent or to use several separately developed libraries.
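The two ways of supplying a comparison criterion can be put side by side. In this sketch, compare_ints(), Greater_than, sort_c_style(), and sort_descending() are names invented for illustration; qsort() and sort() are the standard library functions:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Criterion for qsort(): passed as a pointer to function and called
// indirectly for every comparison.
int compare_ints(const void* a, const void* b)
{
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x < y) ? -1 : (x > y) ? 1 : 0;
}

// Criterion for sort(): a function object whose type is a template argument,
// so its operator()() is a candidate for inlining.
struct Greater_than {
    bool operator()(int a, int b) const { return a > b; }
};

void sort_c_style(int* v, std::size_t n)
{
    std::qsort(v, n, sizeof(int), compare_ints);          // ascending order
}

void sort_descending(std::vector<int>& v)
{
    std::sort(v.begin(), v.end(), Greater_than());        // descending order
}
```

Both calls keep the criterion out of the sorting function itself. The difference is that the type Greater_than is known to the compiler where sort() is instantiated, whereas qsort() sees only a pointer to function.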
The requirements of ‘‘primitiveness’’ and ‘‘convenience of common uses’’ appear to conflict.
The former requirement precludes exclusively optimizing the standard library for common cases.
However, components serving common, but nonprimitive, needs can be included in the standard
library in addition to the primitive facilities, rather than as replacements. The cult of orthogonality
must not prevent us from making life convenient for the novice and the casual user. Nor should it
cause us to leave the default behavior of a component obscure or dangerous.
16.1.2 Standard Library Organization [org.org]
The facilities of the standard library are defined in the std namespace and presented as a set of
headers. The headers identify the major parts of the library. Thus, listing them gives an overview
of the library and provides a guide to the description of the library in this and subsequent chapters.
The rest of this subsection is a list of headers grouped by function, accompanied by brief explanations and annotated by references to where they are discussed. The grouping is chosen to match
the organization of the standard. A reference to the standard (such as §s.18.1) means that the facility is not discussed here.
A standard header with a name starting with the letter c is equivalent to a header in the C standard
library. For every header <cX> defining names in the std namespace, there is a header <X.h>
defining the same names in the global namespace (see §9.2.2).
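As a small illustration of the <cX> and <X.h> pairing, the following sketch uses strlen() through both headers. The function names length_via_std() and length_via_global() are assumptions made for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>     // declares std::strlen()
#include <string.h>    // declares ::strlen() in the global namespace

std::size_t length_via_std(const char* s)
{
    return std::strlen(s);     // name from the std namespace
}

std::size_t length_via_global(const char* s)
{
    return ::strlen(s);        // the same function, named in the global namespace
}
```

Both functions return 3 for the string "abc"; only the namespace through which the name is reached differs.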
____________________________________________________
Containers
____________________________________________________
<vector>    one-dimensional array of T     §16.3
<list>      doubly-linked list of T        §17.2.2
<deque>     double-ended queue of T        §17.2.3
<queue>     queue of T                     §17.3.2
<stack>     stack of T                     §17.3.1
<map>       associative array of T         §17.4.1
<set>       set of T                       §17.4.3
<bitset>    array of booleans              §17.5.3
____________________________________________________
The associative containers multimap and multiset can be found in <map> and <set>, respectively.
The priority_queue is declared in <queue>.
_________________________________________________________________
General Utilities
_________________________________________________________________
<utility>       operators and pairs          §17.1.4, §17.4.1.2
<functional>    function objects             §18.4
<memory>        allocators for containers    §19.4.4
<ctime>         C-style date and time        §s.20.5
_________________________________________________________________
The <memory> header also contains the auto_ptr template that is primarily used to smooth the
interaction between pointers and exceptions (§14.4.2).
___________________________________________________________
Iterators
___________________________________________________________
<iterator>    iterators and iterator support    Chapter 19
___________________________________________________________
Iterators provide the mechanism to make standard algorithms generic over the standard containers
and similar types (§2.7.2, §19.2.1).
________________________________________________
Algorithms
________________________________________________
<algorithm>    general algorithms    Chapter 18
<cstdlib>      bsearch() qsort()     §18.11
________________________________________________
A typical general algorithm can be applied to any sequence (§3.8, §18.3) of any type of elements.
The C standard library functions bsearch() and qsort() apply to built-in arrays with elements of
types without user-defined copy constructors and destructors only (§7.7).
____________________________________________________
Diagnostics
____________________________________________________
<exception>    exception class          §14.10
<stdexcept>    standard exceptions      §14.10
<cassert>      assert macro             §24.3.7.2
<cerrno>       C-style error handling   §20.4.1
____________________________________________________
Assertions relying on exceptions are described in §24.3.7.1.
__________________________________________________________________
Strings
__________________________________________________________________
<string>     string of T                                Chapter 20
<cctype>     character classification                   §20.4.2
<cwctype>    wide-character classification              §20.4.2
<cstring>    C-style string functions                   §20.4.1
<cwchar>     C-style wide-character string functions    §20.4
<cstdlib>    C-style string functions                   §20.4.1
__________________________________________________________________
The <cstring> header declares the strlen(), strcpy(), etc., family of functions. The <cstdlib>
header declares atof() and atoi(), which convert C-style strings to numeric values.
____________________________________________________________________
Input/Output
____________________________________________________________________
<iosfwd>       forward declarations of I/O facilities      §21.1
<iostream>     standard iostream objects and operations    §21.2.1
<ios>          iostream bases                              §21.2.1
<streambuf>    stream buffers                              §21.6
<istream>      input stream template                       §21.3.1
<ostream>      output stream template                      §21.2.1
<iomanip>      manipulators                                §21.4.6.2
<sstream>      streams to/from strings                     §21.5.3
<cstdlib>      character classification functions          §20.4.2
<fstream>      streams to/from files                       §21.5.1
<cstdio>       printf() family of I/O                      §21.8
<cwchar>       printf()-style I/O of wide characters       §21.8
____________________________________________________________________
Manipulators are objects used to manipulate the state of a stream (e.g., changing the format of
floating-point output) by applying them to the stream (§21.4.6).
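For example, the format of floating-point output can be changed by applying manipulators to a stream. In this sketch, the function format_fixed() is a hypothetical helper; fixed and setprecision() are the standard manipulators:

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Format d with a fixed number of digits after the decimal point.
std::string format_fixed(double d, int digits)
{
    std::ostringstream os;
    os << std::fixed << std::setprecision(digits) << d;   // manipulators change the stream state
    return os.str();
}
```

A call such as format_fixed(3.14159, 2) yields the string "3.14". The manipulators affect only the stream they are applied to, here a local ostringstream.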
___________________________________________________________
Localization
___________________________________________________________
<locale>     represent cultural differences            §21.7
<clocale>    represent cultural differences C-style    §21.7
___________________________________________________________
A locale localizes differences such as the output format for dates, the symbol used to represent currency, and string collation criteria that vary among different natural languages and cultures.
_________________________________________________________________________
Language Support
_________________________________________________________________________
<limits>      numeric limits                                 §22.2
<climits>     C-style numeric scalar-limit macros            §22.2.1
<cfloat>      C-style numeric floating-point limit macros    §22.2.1
<new>         dynamic memory management                      §16.1.3
<typeinfo>    run-time type identification support           §15.4.1
<exception>   exception-handling support                     §14.10
<cstddef>     C library language support                     §6.2.1
<cstdarg>     variable-length function argument lists        §7.6
<csetjmp>     C-style stack unwinding                        §s.18.7
<cstdlib>     program termination                            §9.4.1.1
<ctime>       system clock                                   §s.18.7
<csignal>     C-style signal handling                        §s.18.7
_________________________________________________________________________
The <cstddef> header defines the type of values returned by sizeof(), size_t, the type of the result
of pointer subtraction, ptrdiff_t (§6.2.1), and the infamous NULL macro (§5.1.1).
_________________________________________________________
Numerics
_________________________________________________________
<complex>     complex numbers and operations     §22.5
<valarray>    numeric vectors and operations     §22.4
<numeric>     generalized numeric operations     §22.6
<cmath>       standard mathematical functions    §22.3
<cstdlib>     C-style random numbers             §22.7
_________________________________________________________
For historical reasons, abs() and div() are found in <cstdlib> rather than in <cmath>
with the rest of the mathematical functions (§22.3).
A user or a library implementer is not allowed to add or subtract declarations from the standard
headers. Nor is it acceptable to try to change the contents of headers by defining macros before
they are included or to try to change the meaning of the declarations in the headers by declarations
in their context (§9.2.3). Any program or implementation that plays such games does not conform
to the standard, and programs that rely on such tricks are not portable. Even if they work today, the
next release of any part of an implementation may break them. Avoid such trickery.
For a standard library facility to be used, its header must be included. Writing out the relevant
declarations yourself is not a standards-conforming alternative. The reason is that some implementations optimize compilation based on standard header inclusion and others provide optimized
implementations of standard library facilities triggered by the headers. In general, implementers
use standard headers in ways programmers cannot predict and shouldn’t have to know about.
A programmer can, however, specialize utility templates, such as swap() (§16.3.9), for
nonstandard-library, user-defined types.
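A minimal sketch of such a specialization follows. The Buffer class, its swap() member, and the swap_calls counter are assumptions invented for the example. Note also that later revisions of the standard restrict specializing function templates in std and favor a nonmember swap() found through argument-dependent lookup, so treat this as an illustration of the technique described here rather than as current practice:

```cpp
#include <algorithm>   // std::swap() and std::copy()
#include <cassert>
#include <cstddef>
#include <utility>     // std::swap() in later standards

class Buffer {         // hypothetical user-defined type owning an array
public:
    explicit Buffer(std::size_t n) : sz(n), data(new char[n]()) {}
    Buffer(const Buffer& b) : sz(b.sz), data(new char[b.sz])
    {
        std::copy(b.data, b.data + sz, data);
    }
    Buffer& operator=(const Buffer& b)
    {
        Buffer tmp(b);          // copy, then exchange representations
        swap(tmp);
        return *this;
    }
    ~Buffer() { delete[] data; }

    void swap(Buffer& other)    // exchanges two pointers and two sizes; no copying
    {
        std::swap(sz, other.sz);
        std::swap(data, other.data);
        ++swap_calls;
    }
    std::size_t size() const { return sz; }

    static int swap_calls;      // counts calls of the member swap(), for observation only
private:
    std::size_t sz;
    char* data;
};

int Buffer::swap_calls = 0;

namespace std {
    // Specialization of the standard swap() for the user-defined type Buffer.
    template<> void swap<Buffer>(Buffer& a, Buffer& b) { a.swap(b); }
}
```

With the specialization in place, std::swap(a,b) on two Buffers calls the member swap() once instead of copying the arrays three times.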
16.1.3 Language Support [org.lang]
A small part of the standard library is language support; that is, facilities that must be present for a
program to run because language features depend on them.
The library functions supporting operators new and delete are discussed in §6.2.6, §10.4.11,
§14.4.4, and §15.6; they are presented in <new>.
Run-time type identification relies on class type_info, which is described in §15.4.4 and presented in <typeinfo>.
The standard exception classes are discussed in §14.10 and presented in <new>, <typeinfo>,
<ios>, <exception>, and <stdexcept>.
Program start and termination are discussed in §3.2, §9.4, and §10.4.9.
16.2 Container Design [org.cont]
A container is an object that holds other objects. Examples are lists, vectors, and associative arrays.
In general, you can add objects to a container and remove objects from it.
Naturally, this idea can be presented to users in many different ways. The C++ standard library
containers were designed to meet two criteria: to provide the maximum freedom in the design of an
individual container, while at the same time allowing containers to present a common interface to
users. This allows optimal efficiency in the implementation of containers and enables users to
write code that is independent of the particular container used.
Container designs typically meet just one or the other of these two design criteria. The container and algorithms part of the standard library (often called the STL) can be seen as a solution to
the problem of simultaneously providing generality and efficiency. The following sections present
the strengths and weaknesses of two traditional styles of containers as a way of approaching the
design of the standard containers.
16.2.1 Specialized Containers and Iterators [org.specialized]
The obvious approach to providing a vector and a list is to define each in the way that makes the
most sense for its intended use:
    template<class T> class Vector {    // optimal
    public:
        explicit Vector(size_t n);    // initialize to hold n objects with value T()
        T& operator[](size_t);        // subscripting
        // ...
    };

    template<class T> class List {    // optimal
    public:
        class Link { /* ... */ };
        List();          // initially empty
        void put(T*);    // put before current element
        T* get();        // get current element
        // ...
    };
Each class provides operations that are close to ideal for their use, and for each class we can choose
a suitable representation without worrying about other kinds of containers. This allows the implementations of operations to be close to optimal. In particular, the most common operations such as
put() for a List and operator[]() for a Vector are small and easily inlined.
A common use of most kinds of containers is to iterate through the container looking at the elements one after the other. This is typically done by defining an iterator class appropriate to the
kind of container (see §11.5 and §11.14[7]).
However, a user iterating over a container often doesn’t care whether data is stored in a List or a
Vector. In that case, the code iterating should not depend on whether a List or a Vector was used.
Ideally, the same piece of code should work in both cases.
A solution is to define an iterator class that provides a get-next-element operation that can be
implemented for any container. For example:
    template<class T> class Itor {    // common interface (abstract class §2.5.4, §12.3)
    public:
        // return 0 to indicate no-more-elements
        virtual T* first() = 0;    // pointer to first element
        virtual T* next() = 0;     // pointer to next element
    };
We can now provide implementations for Vectors and Lists:
    template<class T> class Vector_itor : public Itor<T> {    // Vector implementation
        Vector<T>& v;
        size_t index;    // index of current element
    public:
        Vector_itor(Vector<T>& vv) :v(vv), index(0) { }
        T* first() { return (v.size()) ? &v[index=0] : 0; }
        T* next() { return (++index<v.size()) ? &v[index] : 0; }
    };

    template<class T> class List_itor : public Itor<T> {    // List implementation
        List<T>& lst;
        List<T>::Link p;    // points to current element
    public:
        List_itor(List<T>&);
        T* first();
        T* next();
    };
Or graphically, using dashed lines to represent ‘‘implemented using:’’
[Figure: Vector_itor and List_itor are each derived from Itor; a dashed line connects Vector_itor to Vector and List_itor to List.]
The internal structure of the two iterators is quite different, but that doesn’t matter to users. We can
now write code that iterates over anything for which we can implement an Itor. For example:
    int count(Itor<char>& ii, char term)
    {
        int c = 0;
        for (char* p = ii.first(); p; p=ii.next()) if (*p==term) c++;
        return c;
    }
There is a snag, however. The operations on an Itor iterator are simple, yet they incur the overhead
of a (virtual) function call. In many situations, this overhead is minor compared to what else is
being done. However, iterating through a simple container is the critical operation in many high-performance systems and a function call is many times more expensive than the integer addition or
pointer dereferencing that implements next() for a vector and a list. Consequently, this model is
unsuitable, or at least not ideal, for a standard library.
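To make the model concrete, here is a minimal, self-contained sketch of the Itor interface with one implementation. It assumes std::vector as a stand-in for the Vector class, which is only outlined in the text, and it adds a virtual destructor to Itor. The helper count_in_vector() is hypothetical and exists only to exercise count().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Abstract iterator interface, as in the text: 0 means no-more-elements.
template<class T> class Itor {
public:
    virtual T* first() = 0;    // pointer to first element
    virtual T* next() = 0;     // pointer to next element
    virtual ~Itor() { }        // assumption: added so derived iterators clean up properly
};

// Assumption: std::vector stands in for the Vector outlined in the text.
template<class T> class Vector_itor : public Itor<T> {
    std::vector<T>& v;
    std::size_t index;    // index of current element
public:
    Vector_itor(std::vector<T>& vv) :v(vv), index(0) { }
    T* first() { return v.size() ? &v[index=0] : 0; }
    T* next() { return (++index<v.size()) ? &v[index] : 0; }
};

// Works for anything that provides an Itor<char>;
// every first() and next() is a virtual call.
int count(Itor<char>& ii, char term)
{
    int c = 0;
    for (char* p = ii.first(); p; p=ii.next()) if (*p==term) c++;
    return c;
}

// Hypothetical helper: count occurrences of term in a C-style string.
int count_in_vector(const char* s, char term)
{
    assert(s!=0);
    std::vector<char> v;
    while (*s) v.push_back(*s++);
    Vector_itor<char> it(v);
    return count(it,term);
}
```

Note that count() is compiled once and never sees the container type. That is the source of both the flexibility and the per-element call overhead discussed above.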
However, this container-and-iterator model has been successfully used in many systems. For
years, it was my favorite for most applications. Its strengths and weaknesses can be summarized
like this:
+ Individual containers are simple and efficient.
+ Little commonality is required of containers. Iterators and wrapper classes (§25.7.1) can be
used to fit independently developed containers into a common framework.
+ Commonality of use is provided through iterators (rather than through a general container
type; §16.2.2).
+ Different iterators can be defined to serve different needs for the same container.
+ Containers are by default type safe and homogeneous (that is, all elements in a container are
of the same type). A heterogeneous container can be provided as a homogeneous container
of pointers to a common base.
+ The containers are non-intrusive (that is, an object need not have a special base class or link
field to be a member of a container). Non-intrusive containers work well with built-in types
and with structs with externally-imposed layouts.
– Each iterator access incurs the overhead of a virtual function call. The time overhead can be
serious compared to simple inlined access functions.
– A hierarchy of iterator classes tends to get complicated.
– There is nothing in common for every container and nothing in common for every object in
every container. This complicates the provision of universal services such as persistence
and object I/O.
A + indicates an advantage and a - indicates a disadvantage.
I consider the flexibility provided by iterators especially important. A common interface, such
as Itor, can be provided long after the design and implementation of containers (here, Vector and
List). When we design, we typically first invent something fairly concrete. For example, we
design an array and invent a list. Only later do we discover an abstraction that covers both arrays
and lists in a given context.
As a matter of fact, we can do this ‘‘late abstraction’’ several times. Suppose we want to represent a set. A set is a very different abstraction from Itor, yet we can provide a Set interface to
Vector and List in much the same way that I provided Itor as an interface to Vector and List:
[Figure: Vector_set and List_set are derived from Set, and Vector_itor and List_itor are derived from Itor; dashed lines connect Vector_set and Vector_itor to Vector, and List_set and List_itor to List.]
Thus, late abstraction using abstract classes allows us to provide different implementations of a
concept even when there is no significant similarity between the implementations. For example,
lists and vectors have some obvious commonality, but we could easily implement an Itor for an
istream.
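As an illustration of such late abstraction, the following sketch shows one way an Itor might be provided for an istream. The class name Istream_itor, the functions sum() and sum_text(), and the choice to buffer a single value are assumptions made for this example, not part of the text.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

template<class T> class Itor {    // interface from the text
public:
    virtual T* first() = 0;
    virtual T* next() = 0;
    virtual ~Itor() { }
};

// Hypothetical: presents the values read from an istream as a sequence.
// A stream can be read only once, so first() simply reads the next value.
template<class T> class Istream_itor : public Itor<T> {
    std::istream& is;
    T val;    // holds the most recently read element
public:
    Istream_itor(std::istream& s) :is(s), val() { }
    T* first() { return next(); }
    T* next() { return (is>>val) ? &val : 0; }    // 0 on end of input or bad input
};

// Sum the integers reachable through any Itor<int>.
int sum(Itor<int>& ii)
{
    int s = 0;
    for (int* p = ii.first(); p; p=ii.next()) s += *p;
    return s;
}

// Hypothetical helper: sum the whitespace-separated integers in text.
int sum_text(const std::string& text)
{
    std::istringstream in(text);
    Istream_itor<int> it(in);
    return sum(it);
}

void demo()
{
    assert(sum_text("1 2 3")==6);
}
```

The function sum() knows nothing about streams. An Itor over a Vector or a List could be passed to it unchanged.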
Logically, the last two points on the list are the main weaknesses of the approach. That is, even
if the function call overhead for iterators and similar interfaces to containers were eliminated (as is
possible in some contexts), this approach would not be ideal for a standard library.
Non-intrusive containers incur a small overhead in time and space for some containers compared with intrusive containers. I have not found this a problem. Should it become a problem, an
iterator such as Itor can be provided for an intrusive container (§16.5[11]).
16.2.2 Based Containers [org.based]
One can define an intrusive container without relying on templates or any other way of parameterizing a type declaration. For example:
    struct Link {
        Link* pre;
        Link* suc;
        // ...
    };

    class List {
        Link* head;
        Link* curr;    // current element
    public:
        Link* get();        // remove and return current element
        void put(Link*);    // insert before current element
        // ...
    };
A List is now a list of Links, and it can hold objects of any type derived from Link. For example:
    class Ship : public Link { /* ... */ };

    void f(List* lst)
    {
        while (Link* po = lst->get()) {
            if (Ship* ps = dynamic_cast<Ship*>(po)) {    // Ship must be polymorphic (§15.4.1)
                // use ship
            }
            else {
                // Oops, do something else
            }
        }
    }
Simula defined its standard containers in this style, so this approach can be considered the original
for languages supporting object-oriented programming. These days, a common class for all objects
is usually called Object or something similar. An Object class typically provides other common
services in addition to serving as a link for containers.
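The intrusive style can be tried out with the following sketch. It is a simplification under stated assumptions: put() appends at the end and get() removes from the front (the List in the text works relative to a current element), Link is given a virtual destructor so that dynamic_cast can be applied, and Buoy, count_ships(), and demo() are hypothetical names introduced only for the example.

```cpp
#include <cassert>

struct Link {
    Link* pre;
    Link* suc;
    Link() :pre(0), suc(0) { }
    virtual ~Link() { }    // makes Link polymorphic so that dynamic_cast works
};

// Simplified intrusive list: the links live inside the elements themselves.
class List {
    Link* head;
    Link* tail;
public:
    List() :head(0), tail(0) { }
    void put(Link* p)    // append p at the end
    {
        assert(p!=0);
        p->pre = tail;
        p->suc = 0;
        if (tail) tail->suc = p; else head = p;
        tail = p;
    }
    Link* get()    // remove and return the first element; 0 if the list is empty
    {
        Link* p = head;
        if (p) {
            head = p->suc;
            if (head) head->pre = 0; else tail = 0;
            p->pre = p->suc = 0;
        }
        return p;
    }
};

class Ship : public Link { };
class Buoy : public Link { };    // hypothetical second element type

// Count the Ships in a list; the element type is recovered at run time.
int count_ships(List* lst)
{
    int n = 0;
    while (Link* po = lst->get())
        if (dynamic_cast<Ship*>(po)) ++n;
    return n;
}

int demo()
{
    Ship s1, s2;
    Buoy b;
    List lst;
    lst.put(&s1);
    lst.put(&b);
    lst.put(&s2);
    return count_ships(&lst);    // two of the three elements are Ships
}
```

Because the list stores Link* only, nothing prevents a Buoy from ending up among the Ships; the mismatch is detected only when the element is retrieved.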
Often, but not necessarily, this approach is extended to provide a common container type:
    class Container : public Object {
    public:
        virtual Object* get();                  // remove and return current element
        virtual void put(Object*);              // insert before current element
        virtual Object*& operator[](size_t);    // subscripting
        // ...
    };
Note that the operations provided by Container are virtual so that individual containers can override them appropriately:
    class List : public Container {
    public:
        Object* get();
        void put(Object*);
        // ...
    };

    class Vector : public Container {
    public:
        Object*& operator[](size_t);
        // ...
    };
One problem arises immediately. What operations do we want Container to provide? We could
provide only the operations that every container can support. However, the intersection of the sets
of operations on all containers is a ridiculously narrow interface. In fact, in many interesting cases
that intersection is empty. So, realistically, we must provide the union of essential operations on
the variety of containers we intend to support. Such a union of interfaces to a set of concepts is
called a fat interface (§24.4.3).
We can either provide default implementations of the functions in the fat interface or force
every derived class to implement every function by making them pure virtual functions. In either
case, we end up with a lot of functions that simply report a run-time error. For example:
    class Container : public Object {
    public:
        struct Bad_op {    // exception class
            const char* p;
            Bad_op(const char* pp) :p(pp) { }
        };

        virtual void put(Object*) { throw Bad_op("put"); }
        virtual Object* get() { throw Bad_op("get"); }
        virtual Object*& operator[](int) { throw Bad_op("[]"); }
        // ...
    };
If we want to protect against the possibility of a container that does not support get(), we must
catch Container::Bad_op somewhere. We could now write the Ship example like this:
    class Ship : public Object { /* ... */ };

    void f1(Container* pc)
    {
        try {
            while (Object* po = pc->get()) {
                if (Ship* ps = dynamic_cast<Ship*>(po)) {
                    // use ship
                }
                else {
                    // Oops, do something else
                }
            }
        }
        catch (Container::Bad_op& bad) {
            // Oops, do something else
        }
    }
This is tedious, so the checking for Bad_op will typically be elsewhere. By relying on exceptions
caught elsewhere, we can reduce the example to:
    void f2(Container* pc)
    {
        while (Object* po = pc->get()) {
            Ship& s = dynamic_cast<Ship&>(*po);
            // use ship
        }
    }
However, I find unnecessary reliance on run-time checking distasteful and inefficient. In this kind
of case, I prefer the statically-checked alternative:
    void f3(Itor<Ship>* i)
    {
        while (Ship* ps = i->next()) {
            // use ship
        }
    }
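The run-time failure mode of a fat interface can be observed with a small sketch. The classes below follow the outline in the text; the Vector shown here, built on std::vector<Object*>, and the functions try_get() and demo() are assumptions introduced only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Object {
public:
    virtual ~Object() { }
};

class Container : public Object {    // fat interface: every operation has a failing default
public:
    struct Bad_op {    // exception class
        const char* p;
        Bad_op(const char* pp) :p(pp) { }
    };
    virtual void put(Object*) { throw Bad_op("put"); }
    virtual Object* get() { throw Bad_op("get"); }
    virtual Object*& operator[](std::size_t) { throw Bad_op("[]"); }
};

// Hypothetical container that supports subscripting only;
// put() and get() are inherited and fail at run time.
class Vector : public Container {
    std::vector<Object*> elem;
public:
    explicit Vector(std::size_t n) :elem(n) { }    // n null pointers
    Object*& operator[](std::size_t i) { return elem[i]; }
};

// Returns the name of the unsupported operation, or 0 if get() worked.
const char* try_get(Container& c)
{
    try {
        c.get();
    }
    catch (Container::Bad_op& bad) {
        return bad.p;
    }
    return 0;
}

void demo()
{
    Vector v(3);
    assert(v[0]==0);          // subscripting is supported
    assert(try_get(v)!=0);    // get() is not: Bad_op was thrown and caught
}
```

The call c.get() compiles for every Container, so the error surfaces only when the program runs. With the Itor<Ship> interface of f3(), an unsupported operation would not compile in the first place.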
The strengths and weaknesses of the ‘‘based object’’ approach to container design can be summarized
like this (see also §16.5[10]):
– Operations on individual containers incur virtual function overhead.
– All containers must be derived from Container. This implies the use of fat interfaces,
requires a large degree of foresight, and relies on run-time type checking. Fitting an independently developed container into the common framework is awkward at best (see
§16.5[12]).
+ The common base Container makes it easy to use containers that supply similar sets of
operations interchangeably.
– Containers are heterogeneous and not type safe by default (all we can rely on is that elements are of type Object*). When desired, type-safe and homogeneous containers can be
defined using templates.
– The containers are intrusive (that is, every element must be of a type derived from Object).
Objects of built-in types and structs with externally imposed layouts cannot be placed
directly in containers.
– An element retrieved from a container must be given a proper type using explicit type conversion before it can be used.
+ Class Container and class Object are handles for implementing services for every object or
every container. This greatly eases the provision of universal services such as persistence
and object I/O.
As before (§16.2.1), + indicates an advantage and - indicates a disadvantage.
Compared to the approach using unrelated containers and iterators, the based-object approach
unnecessarily pushes complexity onto the user, imposes significant run-time overheads, and
restricts the kinds of objects that can be placed in a container. In addition, for many classes, to
derive from Object is to expose an implementation detail. Thus, this approach is far from ideal for
a standard library.
However, the generality and flexibility of this approach should not be underestimated. Like its
alternatives, it has been used successfully in many applications. Its strengths lie in areas in which
efficiency is less important than the simplicity afforded by a single Container interface and services such as object I/O.
16.2.3 STL Containers [org.stl]
The standard library containers and iterators (often called the STL framework, §3.10) can be understood as an approach to gain the best of the two traditional models described previously. That
wasn’t the way the STL was designed, though. The STL was the result of a single-minded search
for uncompromisingly efficient and generic algorithms.
The aim of efficiency rules out hard-to-inline virtual functions for small, frequently-used access
functions. Therefore, we cannot present a standard interface to containers or a standard iterator
interface as an abstract class. Instead, each kind of container supports a standard set of basic operations. To avoid the problems of fat interfaces (§16.2.2, §24.4.3), operations that cannot be efficiently implemented for all containers are not included in the set of common operations. For example, subscripting is provided for vector but not for list. In addition, each kind of container provides
its own iterators that support a standard set of iterator operations.
The standard containers are not derived from a common base. Instead, every container implements all of the standard container interface. Similarly, there is no common iterator base class. No
explicit or implicit run-time type checking is involved in using the standard containers and iterators.
The important and difficult issue of providing common services for all containers is handled
through ‘‘allocators’’ passed as template arguments (§19.4.3) rather than through a common base.
Before I go into details and code examples, the strengths and weaknesses of the STL approach
can be summarized:
+ Individual containers are simple and efficient (not quite as simple as truly independent containers can be, but just as efficient).
+ Each container provides a set of standard operations with standard names and semantics.
Additional operations are provided for a particular container type as needed. Furthermore,
wrapper classes (§25.7.1) can be used to fit independently developed containers into a common framework (§16.5[14]).
+ Additional commonality of use is provided through standard iterators. Each container provides iterators that support a set of standard operations with standard names and semantics.
An iterator type is defined for each particular container type so that these iterators are as
simple and efficient as possible.
+ To serve different needs for containers, different iterators and other generalized interfaces
can be defined in addition to the standard iterators.
+ Containers are by default type-safe and homogeneous (that is, all elements in a container are
of the same type). A heterogeneous container can be provided as a homogeneous container
of pointers to a common base.
+ The containers are non-intrusive (that is, an object need not have a special base class or link
field to be a member of a container). Non-intrusive containers work well with built-in types
and with structs with externally imposed layouts.
+ Intrusive containers can be fitted into the general framework. Naturally, an intrusive container will impose constraints on its element types.
+ Each container takes an argument, called an allocator, which can be used as a handle for
implementing services for every container. This greatly eases the provision of universal services such as persistence and object I/O (§19.4.3).
– There is no standard run-time representation of containers or iterators that can be passed as a
function argument (although it is easy to define such representations for the standard containers and iterators where needed for a particular application; §19.3).
As before (§16.2.1), + indicates an advantage and - indicates a disadvantage.
In other words, containers and iterators do not have fixed standard representations. Instead,
each container provides a standard interface in the form of a set of operations so that containers can
be used interchangeably. Iterators are handled similarly. This implies minimal overheads in time
and space while allowing users to exploit commonality both at the level of containers (as with the
based-object approach) and at the level of iterators (as with the specialized container approach).
The STL approach relies heavily on templates. To avoid excessive code replication, partial specialization to provide shared implementations for containers of pointers is usually required (§13.5).
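The following sketch illustrates the point about interchangeable use without a common base. The function name count_value() is hypothetical; the only things it assumes of a container are the member types value_type and const_iterator and the members begin() and end(), all of which the standard containers provide.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Works for any container providing const_iterator, begin(), and end();
// no virtual function or common base class is involved.
template<class C>
int count_value(const C& c, const typename C::value_type& x)
{
    int n = 0;
    for (typename C::const_iterator p = c.begin(); p!=c.end(); ++p)
        if (*p==x) ++n;
    return n;
}

void demo()
{
    std::vector<int> v;
    std::list<int> l;
    for (int i = 0; i<6; ++i) {
        v.push_back(i%3);    // 0 1 2 0 1 2
        l.push_back(i%2);    // 0 1 0 1 0 1
    }
    assert(count_value(v,2)==2);
    assert(count_value(l,1)==3);
}
```

A separate version of count_value() is generated for each container type used, so the iterator operations can be inlined. The price is that there is no single compiled function that accepts both a vector and a list at run time.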
16.3 Vector [org.vector]
Here, vector is described as an example of a complete standard container. Unless otherwise stated,
what is said about vector holds for every standard container. Chapter 17 describes features peculiar
to lists, sets, maps, etc. The facilities offered by vector – and similar containers – are described in
some detail. The aim is to give an understanding both of the possible uses of vector and of its role
in the overall design of the standard library.
An overview of the standard containers and the facilities they offer can be found in §17.1.
Below, vector is introduced in stages: member types, iterators, element access, constructors, stack
operations, list operations, size and capacity, helper functions, and vector<bool>.
16.3.1 Types [org.types]
The standard vector is a template defined in namespace std and presented in <vector>. It first
defines a set of standard names of types:
    template <class T, class A = allocator<T> > class std::vector {
    public:
        // types:

        typedef T value_type;                                // type of element
        typedef A allocator_type;                            // type of memory manager
        typedef typename A::size_type size_type;
        typedef typename A::difference_type difference_type;

        typedef implementation_dependent1 iterator;          // T*
        typedef implementation_dependent2 const_iterator;    // const T*
        typedef std::reverse_iterator<iterator> reverse_iterator;
        typedef std::reverse_iterator<const_iterator> const_reverse_iterator;

        typedef typename A::pointer pointer;                 // pointer to element
        typedef typename A::const_pointer const_pointer;
        typedef typename A::reference reference;             // reference to element
        typedef typename A::const_reference const_reference;

        // ...
    };
Every standard container defines these typenames as members. Each defines them in the way most
appropriate to its implementation.
The type of the container’s elements is passed as the first template argument and is known as its
value_type. The allocator_type, which is optionally supplied as the second template argument,
defines how the value_type interacts with various memory management mechanisms. In particular,
an allocator supplies the functions that a container uses to allocate and deallocate memory for its
elements. Allocators are discussed in §19.4. In general, size_type specifies the type used for
indexing into the container, and difference_type is the type of the result of subtracting two iterators
for a container. For most containers, they correspond to size_t and ptrdiff_t (§6.2.1).
Iterators were introduced in §2.7.2 and are described in detail in Chapter 19. They can be
thought of as pointers to elements of the container. Every container provides a type called iterator
for pointing to elements. It also provides a const_iterator type for use when elements don’t need
to be modified. As with pointers, we use the safer const version unless there is a reason to do otherwise. The actual types of vector’s iterators are implementation-defined. The obvious definitions
for a conventionally-defined vector would be T* and const T*, respectively.
The reverse iterator types for vector are constructed from the standard reverse_iterator templates (§19.2.5). They present a sequence in the reverse order.
As shown in §3.8.1, these member typenames allow a user to write code using a container without having to know about the actual types involved. In particular, they allow a user to write code
that will work for any standard container. For example:
    template<class C> typename C::value_type sum(const C& c)
    {
        typename C::value_type s = 0;
        typename C::const_iterator p = c.begin();    // start at the beginning
        while (p!=c.end()) {                         // continue until the end
            s += *p;                                 // get value of element
            ++p;                                     // make p point to next element
        }
        return s;
    }
Having to add typename before the names of member types of a template parameter is a nuisance.
However, the compiler isn’t psychic. There is no general way for it to know whether a member of
a template argument type is a typename (§C.13.5).
As for pointers, prefix * means dereference the iterator (§2.7.2, §19.2.1) and ++ means increment the iterator.
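As a usage sketch, the sum() function from this section can be applied to containers of different kinds and element types. The function demo() and the particular values are assumptions chosen for the example; sum() itself is repeated so that the fragment is self-contained.

```cpp
#include <cassert>
#include <list>
#include <vector>

template<class C> typename C::value_type sum(const C& c)
{
    typename C::value_type s = 0;
    typename C::const_iterator p = c.begin();    // start at the beginning
    while (p!=c.end()) {                         // continue until the end
        s += *p;                                 // get value of element
        ++p;                                     // make p point to next element
    }
    return s;
}

void demo()
{
    std::vector<int> vi;
    vi.push_back(1); vi.push_back(2); vi.push_back(3);
    std::list<double> ld;
    ld.push_back(0.5); ld.push_back(0.25);

    int si = sum(vi);       // C::value_type is int
    double sd = sum(ld);    // C::value_type is double
    assert(si==6);
    assert(sd==0.75);       // exact: 0.5 and 0.25 are representable in binary
}
```

The caller never names an iterator type. The member typenames let sum() pick the right ones for each container.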
16.3.2 Iterators [org.begin]
As shown in the previous subsection, iterators can be used to navigate containers without the programmers having to know the actual type used to identify elements. A few key member functions
allow the programmer to get hold of the ends of the sequence of elements:
    template <class T, class A = allocator<T> > class vector {
    public:
        // ...
        // iterators:

        iterator begin();                         // points to first element
        const_iterator begin() const;
        iterator end();                           // points to one-past-last element
        const_iterator end() const;

        reverse_iterator rbegin();                // points to first element of reverse sequence
        const_reverse_iterator rbegin() const;
        reverse_iterator rend();                  // points to one-past-last element of reverse sequence
        const_reverse_iterator rend() const;

        // ...
    };
The begin()/end() pair gives the elements of the container in the ordinary element order. That
is, element 0 is followed by element 1, element 2, etc. The rbegin()/rend() pair gives the elements in the reverse order. That is, element n-1 is followed by element n-2, element n-3, etc.
For example, a sequence seen like this using an iterator:
    begin() -> [A] [B] [C] [one-past-last] <- end()
can be viewed like this using a reverse_iterator (§19.2.5):
    rbegin() -> [C] [B] [A] [one-past-last] <- rend()
This allows us to use algorithms in a way that views a sequence in the reverse order. For example:
    template<class C>
    typename C::iterator find_last(C& c, typename C::value_type v)
    {
        typename C::reverse_iterator ri = find(c.rbegin(),c.rend(),v);
        if (ri==c.rend()) return c.end();      // use c.end() to indicate "not found"
        typename C::iterator i = ri.base();    // one beyond the element ri refers to
        return --i;
    }
The base() function returns an iterator corresponding to the reverse_iterator (§19.2.5); that iterator points one element beyond the position the reverse_iterator refers to, hence the decrement. Without
reverse iterators, we would have had to write something like:
    template<class C>
    typename C::iterator find_last(C& c, typename C::value_type v)
    {
        typename C::iterator p = c.end();    // search backwards from end
        while (p!=c.begin()) {
            --p;
            if (*p==v) return p;
        }
        return c.end();    // use c.end() to indicate "not found"
    }
A reverse iterator is a perfectly ordinary iterator, so we could have written:
    template<class C>
    typename C::reverse_iterator find_last(C& c, typename C::value_type v)
    {
        typename C::reverse_iterator p = c.rbegin();    // view sequence in reverse order
        while (p!=c.rend()) {
            if (*p==v) return p;
            ++p;    // note: not decrement (--)
        }
        return p;
    }
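The relation between a reverse_iterator and the iterator returned by its base() can be checked with a short sketch. The function last_index() is hypothetical; it returns an index rather than an iterator only to make the result easy to inspect, and it is written for std::vector<int> to keep the example small.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Position of the last occurrence of x in v, or v.size() if x is absent.
std::vector<int>::size_type last_index(std::vector<int>& v, int x)
{
    std::vector<int>::reverse_iterator ri = std::find(v.rbegin(),v.rend(),x);
    if (ri==v.rend()) return v.size();           // not found
    std::vector<int>::iterator i = ri.base();    // one beyond the element ri refers to
    --i;                                         // now i refers to the same element as ri
    return i-v.begin();
}

void demo()
{
    int a[] = { 1, 2, 3, 2, 1 };
    std::vector<int> v(a,a+5);
    assert(*v.rbegin()==1);                 // the reverse sequence starts at the last element
    assert(v.rbegin().base()==v.end());     // base() of rbegin() is end()
    assert(last_index(v,2)==3);
    assert(last_index(v,7)==v.size());
}
```

The one-element offset of base() is what allows rend() to correspond to begin(), even though there is no element before the first.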
16.3.3 Element Access [org.element]
One important aspect of a vector compared with other containers is that one can easily and efficiently access individual elements in any order:
    template <class T, class A = allocator<T> > class vector {
    public:
        // ...
        // element access:

        reference operator[](size_type n);                // unchecked access
        const_reference operator[](size_type n) const;

        reference at(size_type n);                        // checked access
        const_reference at(size_type n) const;

        reference front();                                // first element
        const_reference front() const;
        reference back();                                 // last element
        const_reference back() const;

        // ...
    };
Indexing is done by operator[]() and at(); operator[]() provides unchecked access, whereas
at() does a range check and throws out_of_range if an index is out of range. For example:
    void f(vector<int>& v, int i1, int i2)
    try {
        for(int i = 0; i < v.size(); i++) {
            // range already checked: use unchecked v[i] here
        }

        v.at(i1) = v.at(i2);    // check range on access
        // ...
    }
    catch(out_of_range) {
        // oops: out-of-range error
    }
This illustrates one idea for use. That is, if the range has already been checked, the unchecked subscripting operator can be used safely; otherwise, it is wise to use the range-checked at() function.
This distinction is important when efficiency is at a premium. When that is not the case or when it
is not perfectly obvious whether a range has been correctly checked, it is safer to use a vector with a
checked [] operator (such as Vec from §3.7.1) or a checked iterator (§19.3).
The default access is unchecked to match arrays. Also, you can build a safe (checked) facility
on top of a fast one but not a faster facility on top of a slower one.
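The idea of building a checked facility on top of an unchecked one can be sketched as follows. The function checked() is a hypothetical helper, not a library facility; at_throws() and demo() exist only to show the behavior of the standard at().

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical helper: checked access built on top of the unchecked operator[].
template<class T>
T& checked(std::vector<T>& v, typename std::vector<T>::size_type i)
{
    if (i>=v.size()) throw std::out_of_range("checked");
    return v[i];    // unchecked access is safe here: the range has just been tested
}

// True if v.at(i) reports an out-of-range index.
bool at_throws(std::vector<int>& v, std::vector<int>::size_type i)
{
    try {
        (void)v.at(i);
    }
    catch (std::out_of_range&) {
        return true;
    }
    return false;
}

void demo()
{
    std::vector<int> v(3);    // three elements, each 0
    checked(v,1) = 7;
    assert(v[1]==7);
    assert(!at_throws(v,2));
    assert(at_throws(v,3));   // index 3 is one past the last element
}
```

The reverse construction is not possible: given only a checked operation, there is no way to remove the test it performs.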
The access operations return values of type reference or const_reference depending on
whether or not they are applied to a const object. A reference is some suitable type for accessing
elements. For the simple and obvious implementation of vector<X>, reference is simply X& and
const_reference is simply const X&. The effect of trying to create an out-of-range reference is
undefined. For example:
    void f(vector<double>& v)
    {
        double d = v[v.size()];    // undefined: bad index

        list<char> lst;
        char c = lst.front();      // undefined: list is empty
    }
Of the standard sequences, only vector and deque (§17.2.3) support subscripting. The reason is the
desire not to confuse users by providing fundamentally inefficient operations. For example,
subscripting could have been provided for list (§17.2.2), but doing that would have been dangerously
inefficient (that is, O(n)).
The members front() and back() return references to the first and last element, respectively.
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
They are most useful where these elements are known to exist and in code where these elements are
of particular interest. A vector used as a stack (§16.3.5) is an obvious example. Note that front()
returns a reference to the element to which begin() returns an iterator. I often think of front() as
the first element and begin() as a pointer to the first element. The correspondence between
back() and end() is less simple: back() is the last element and end() points to the last-plus-one
element position.
16.3.4 Constructors [org.ctor]
Naturally, vector provides a complete set (§11.7) of constructors, destructor, and copy operations:

    template <class T, class A = allocator<T> > class vector {
    public:
        // ...
        // constructors, etc.:

        explicit vector(const A& = A());
        explicit vector(size_type n, const T& val = T(), const A& = A());   // n copies of val
        template <class In>                 // In must be an input iterator (§19.2.1)
            vector(In first, In last, const A& = A());      // copy from [first:last[
        vector(const vector& x);

        ~vector();

        vector& operator=(const vector& x);

        template <class In>                 // In must be an input iterator (§19.2.1)
            void assign(In first, In last);                 // copy from [first:last[
        void assign(size_type n, const T& val);             // n copies of val

        // ...
    };
A vector provides fast access to arbitrary elements, but changing its size is relatively expensive.
Consequently, we typically give an initial size when we create a vector. For example:

    vector<Record> vr(10000);

    void f(int s1, int s2)
    {
        vector<int> vi(s1);
        vector<double>* p = new vector<double>(s2);
    }
Elements of a vector allocated this way are initialized by the default constructor for the element
type. That is, each of vr's 10000 elements is initialized by Record() and each of vi's s1 elements
is initialized by int(). Note that the default constructor for a built-in type performs initialization to
0 of the appropriate type (§4.9.5, §10.4.2).
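A small sketch can make the initialization rule concrete. The function count_non_default() is a made-up helper for this illustration; it assumes only that the element type can be default constructed and compared with ==.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: count the elements that differ from a default-constructed T.
// For a vector created as vector<T> v(n), the expected answer is 0.
template <class T>
std::size_t count_non_default(const std::vector<T>& v)
{
    std::size_t n = 0;
    for (typename std::vector<T>::size_type i = 0; i < v.size(); ++i)
        if (!(v[i] == T())) ++n;
    return n;
}
```

For example, after vector<int> vi(5), every element compares equal to int(), that is, to 0; the same holds for double and for class types such as string.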
If a type does not have a default constructor, it is not possible to create a vector with elements
of that type without explicitly providing the value of each element. For example:
    class Num {     // infinite precision
    public:
        Num(long);
        // no default constructor
        // ...
    };

    vector<Num> v1(1000);           // error: no default Num
    vector<Num> v2(1000,Num(0));    // ok
Since a vector cannot have a negative number of elements, its size must be non-negative. This is
reflected in the requirement that vector's size_type must be an unsigned type. This allows a
greater range of vector sizes on some architectures. However, it can also lead to surprises:

    void f(int i)
    {
        vector<char> vc0(-1);   // fairly easy for compiler to warn against
        vector<char> vc1(i);
    }

    void g()
    {
        f(-1);                  // trick f() into accepting a large positive number!
    }
In the call f(-1), -1 is converted into a (rather large) positive integer (§C.6.3). If we are lucky,
the compiler will find a way of complaining.
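The conversion can be observed without actually trying to allocate a huge vector. The following is a sketch only; requested_size() and is_sensible_size() are hypothetical helpers, and the guard shown is just one possible policy.

```cpp
#include <vector>

typedef std::vector<char>::size_type size_type;

// Hypothetical helper: the size that vector<char>(i) would ask for.
// The int is converted to the unsigned size_type.
size_type requested_size(int i)
{
    return static_cast<size_type>(i);
}

// Hypothetical guard: reject negative or absurdly large sizes
// before they reach the constructor.
bool is_sensible_size(int i)
{
    return i >= 0 && static_cast<size_type>(i) <= std::vector<char>().max_size();
}
```

Here, requested_size(-1) yields the largest value that size_type can hold, so a caller that checks is_sensible_size(i) first avoids the surprise.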
The size of a vector can also be provided implicitly by giving the initial set of elements. This is
done by supplying the constructor with a sequence of values from which to construct the vector.
For example:

    void f(const list<X>& lst)
    {
        vector<X> v1(lst.begin(),lst.end());    // copy elements from list

        char p[] = "despair";
        vector<char> v2(p,&p[sizeof(p)-1]);     // copy characters from C-style string
    }
In each case, the vector constructor adjusts the size of the vector as it copies elements from its
input sequence.
The vector constructors that can be invoked with a single argument are declared explicit to
prevent accidental conversions (§11.7.1). For example:

    vector<int> v1(10);                 // ok: vector of 10 ints
    vector<int> v2 = vector<int>(10);   // ok: vector of 10 ints
    vector<int> v3 = v2;                // ok: v3 is a copy of v2
    vector<int> v4 = 10;                // error: attempted implicit conversion of 10 to vector<int>
The copy constructor and the copy-assignment operators copy the elements of a vector. For a
vector with many elements, that can be an expensive operation, so vectors are typically passed by
reference. For example:
    void f1(vector<int>&);          // common style
    void f2(const vector<int>&);    // common style
    void f3(vector<int>);           // rare style

    void h()
    {
        vector<int> v(10000);
        // ...
        f1(v);      // pass a reference
        f2(v);      // pass a reference
        f3(v);      // copy the 10000 elements into a new vector for f3() to use
    }
The assign functions exist to provide counterparts to the multi-argument constructors. They are
needed because = takes a single right-hand operand, so assign() is used where a default argument
value or a range of values is needed. For example:

    class Book {
        // ...
    };

    void f(vector<Num>& vn, vector<char>& vc, vector<Book>& vb, list<Book>& lb)
    {
        vn.assign(10,Num(0));               // assign vector of 10 copies of Num(0) to vn

        char s[] = "literal";
        vc.assign(s,&s[sizeof(s)-1]);       // assign "literal" to vc

        vb.assign(lb.begin(),lb.end());     // assign list elements
        // ...
    }
Thus, we can initialize a vector with any sequence of its element type and similarly assign any such
sequence. Importantly, this is done without explicitly introducing a multitude of constructors and
conversion functions. Note that assignment completely changes the elements of a vector.
Conceptually, all old elements are erased and the new ones are inserted. After assignment, the size
of a vector is the number of elements assigned. For example:

    void f()
    {
        vector<char> v(10,'x');     // v.size()==10, each element has the value 'x'
        v.assign(5,'a');            // v.size()==5, each element has the value 'a'
        // ...
    }
Naturally, what assign() does could be done indirectly by first creating a suitable vector and then
assigning that. For example:
    void f2(vector<Book>& vh, list<Book>& lb)
    {
        vector<Book> vt(lb.begin(),lb.end());
        vh = vt;
        // ...
    }
However, this can be both ugly and inefficient.
Constructing a vector with two arguments of the same type can lead to an apparent ambiguity:

    vector<int> v(10,50);   // vector(size,value) or vector(iterator1,iterator2)? vector(size,value)!

However, an int isn't an iterator and the implementation must ensure that this actually invokes

    vector(vector<int>::size_type, const int&, const vector<int>::allocator_type&);

rather than

    vector(vector<int>::iterator, vector<int>::iterator, const vector<int>::allocator_type&);

The library achieves this by suitable overloading of the constructors and handles the equivalent
ambiguities for assign() and insert() (§16.3.6) similarly.
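The resolution can be checked directly. In this sketch, ten_fifties() and copy_of() are names invented for illustration; the first passes two ints, the second passes two iterators.

```cpp
#include <list>
#include <vector>

// Two int arguments: treated as (size,value), not as an iterator pair.
std::vector<int> ten_fifties()
{
    return std::vector<int>(10, 50);
}

// Two iterator arguments: copy the sequence [first:last[.
std::vector<int> copy_of(const std::list<int>& lst)
{
    return std::vector<int>(lst.begin(), lst.end());
}
```

The result of ten_fifties() has 10 elements, each with the value 50; the result of copy_of() has as many elements as the list it was given.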
16.3.5 Stack Operations [org.stack]
Most often, we think of a vector as a compact data structure that we can index to access elements.
However, we can ignore this concrete notion and view vector as an example of the more abstract
notion of a sequence. Looking at a vector this way, and observing common uses of arrays and
vectors, it becomes obvious that stack operations make sense for a vector:

    template <class T, class A = allocator<T> > class vector {
    public:
        // ...
        // stack operations:

        void push_back(const T& x);     // add to end
        void pop_back();                // remove last element
        // ...
    };
These functions treat a vector as a stack by manipulating its end. For example:

    void f(vector<char>& s)
    {
        s.push_back('a');
        s.push_back('b');
        s.push_back('c');
        s.pop_back();
        if (s[s.size()-1] != 'b') error("impossible!");
        s.pop_back();
        if (s.back() != 'a') error("should never happen!");
    }
Each time push_back() is called, the vector s grows by one element and that element is added at
the end. So s[s.size()-1], also known as s.back() (§16.3.3), is the element most recently
pushed onto the vector.
Except for the word vector instead of stack, there is nothing unusual in this. The suffix _back
is used to emphasize that elements are added to the end of the vector rather than to the beginning.
Adding an element to the end of a vector could be an expensive operation because extra memory
needs to be allocated to hold it. However, an implementation must ensure that repeated stack
operations incur growth-related overhead only infrequently.
Note that pop_back() does not return a value. It just pops, and if we want to know what was
on the top of the stack before the pop, we must look. This happens not to be my favorite style of
stack (§2.5.3, §2.5.4), but it's arguably more efficient and it's the standard.
Why would one do stack-like operations on a vector? An obvious reason is to implement a
stack (§17.3.1), but a more common reason is to construct a vector incrementally. For example,
we might want to read a vector of points from input. However, we don't know how many points
will be read, so we can't allocate a vector of the right size and then read into it. Instead, we might
write:
    vector<Point> cities;

    void add_points(Point sentinel)
    {
        Point buf;
        while (cin >> buf) {
            if (buf == sentinel) return;
            // check new point
            cities.push_back(buf);
        }
    }
This ensures that the vector expands as needed. If all we needed to do with a new point were to put
it into the vector, we might have initialized cities directly from input in a constructor (§16.3.4).
However, it is common to do a bit of processing on input and expand a data structure gradually as a
program progresses; push_back() supports that.
In C programs, this is one of the most common uses of the C standard library function
realloc(). Thus, vector (and, in general, any standard container) provides a more general, more
elegant, and no less efficient alternative to realloc().
The size() of a vector is implicitly increased by push_back() so the vector cannot overflow
(as long as there is memory available to acquire; see §19.4.1). However, a vector can underflow:

    void f()
    {
        vector<int> v;
        v.pop_back();       // undefined effect: the state of v becomes undefined
        v.push_back(7);     // undefined effect (the state of v is undefined), probably bad
    }
The effect of underflow is undefined, but the obvious implementation of pop_back() causes
memory not owned by the vector to be overwritten. Like overflow, underflow must be avoided.
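One way of avoiding underflow, and of getting the popped value at the same time, is to wrap the two steps in a helper. This is a sketch, not a library facility: pop_value() is a hypothetical name, and throwing logic_error on an empty vector is merely one reasonable policy.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical helper: remove the last element and return a copy of it.
// Throws std::logic_error rather than popping an empty vector.
template <class T>
T pop_value(std::vector<T>& s)
{
    if (s.empty()) throw std::logic_error("pop_value: empty vector");
    T top = s.back();   // look first...
    s.pop_back();       // ...then pop
    return top;
}
```

The price is an extra copy of the element, which is one reason why the standard pop_back() does not return a value.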
16.3.6 List Operations [org.list]
The push_back(), pop_back(), and back() operations (§16.3.5) allow a vector to be used
effectively as a stack. However, it is sometimes also useful to add elements in the middle of a vector
and to remove elements from a vector:

    template <class T, class A = allocator<T> > class vector {
    public:
        // ...
        // list operations:

        iterator insert(iterator pos, const T& x);          // add x before 'pos'
        void insert(iterator pos, size_type n, const T& x);
        template <class In>                                 // In must be an input iterator (§19.2.1)
            void insert(iterator pos, In first, In last);   // insert elements from sequence

        iterator erase(iterator pos);                       // remove element at pos
        iterator erase(iterator first, iterator last);      // erase sequence
        void clear();                                       // erase all elements

        // ...
    };
To see how these operations work, let's do some (nonsensical) manipulation of a vector of names
of fruit. First, we define the vector and populate it with some names:

    vector<string> fruit;

    fruit.push_back("peach");
    fruit.push_back("apple");
    fruit.push_back("kiwifruit");
    fruit.push_back("pear");
    fruit.push_back("starfruit");
    fruit.push_back("grape");
If I take a dislike to fruits whose names start with the letter p, I can remove those names like this:

    sort(fruit.begin(),fruit.end());
    vector<string>::iterator p1 = find_if(fruit.begin(),fruit.end(),initial('p'));
    vector<string>::iterator p2 = find_if(p1,fruit.end(),initial_not('p'));
    fruit.erase(p1,p2);
In other words, sort the vector, find the first and the last fruit with a name that starts with the letter
p, and erase those elements from fruit. How to write predicate functions such as initial(x) (is the
initial letter x?) and initial_not() (is the initial letter different from p?) is explained in §18.4.2.
The erase(p1,p2) operation removes elements starting from p1 up to and not including p2.
This can be illustrated graphically:

    fruit[]:
                                 p1           p2
                                 |            |
                                 v            v
    apple  grape  kiwifruit  peach  pear  starfruit
The erase(p1,p2) removes peach and pear, yielding:

    fruit[]:
    apple  grape  kiwifruit  starfruit
As usual, the sequence specified by the user is from the beginning to one-past-the-end of the
sequence affected by the operation.
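For experimentation, the erase-a-range idiom can be packaged as a complete function. The predicates Initial and Initial_not below are stand-ins written for this sketch; they are assumptions of mine, not the definitions given in §18.4.2, and erase_initial() is likewise a made-up name.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical predicate: does the string start with the letter c?
struct Initial {
    char c;
    explicit Initial(char cc) : c(cc) {}
    bool operator()(const std::string& s) const { return !s.empty() && s[0] == c; }
};

// Hypothetical predicate: does the string start with something other than c?
struct Initial_not {
    char c;
    explicit Initial_not(char cc) : c(cc) {}
    bool operator()(const std::string& s) const { return s.empty() || s[0] != c; }
};

// Sort the names, then erase the contiguous run of names starting with c.
void erase_initial(std::vector<std::string>& names, char c)
{
    std::sort(names.begin(), names.end());
    std::vector<std::string>::iterator p1 = std::find_if(names.begin(), names.end(), Initial(c));
    std::vector<std::string>::iterator p2 = std::find_if(p1, names.end(), Initial_not(c));
    names.erase(p1, p2);    // removes [p1:p2[; a no-op if p1==p2
}
```

Sorting first is what makes the unwanted names adjacent, so that a single half-open range [p1:p2[ covers them all.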
It would be tempting to write:
    vector<string>::iterator p1 = find_if(fruit.begin(),fruit.end(),initial('p'));
    vector<string>::reverse_iterator p2 = find_if(fruit.rbegin(),fruit.rend(),initial('p'));
    fruit.erase(p1,p2+1);   // oops!: type error

However, vector<string>::iterator and vector<string>::reverse_iterator need not be the same
type, so we couldn't rely on the call of erase() to compile. To be used with an iterator, a
reverse_iterator must be explicitly converted:

    fruit.erase(p1,p2.base());  // extract iterator from reverse_iterator (§19.2.5)
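The role of base() can be seen in a small self-contained sketch. The function erase_span() is a hypothetical helper; it erases everything from the first to the last occurrence of a value, inclusive.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: erase from the first to the last occurrence of x (inclusive).
void erase_span(std::vector<std::string>& v, const std::string& x)
{
    typedef std::vector<std::string>::iterator Iter;
    typedef std::vector<std::string>::reverse_iterator Riter;

    Iter first = std::find(v.begin(), v.end(), x);
    if (first == v.end()) return;       // x does not occur

    Riter last = std::find(v.rbegin(), v.rend(), x);
    // last.base() is an iterator to the element after the one last refers to,
    // which is exactly the one-past-the-end position that erase() needs
    v.erase(first, last.base());
}
```

Note that no +1 is needed: the off-by-one relationship between a reverse_iterator and its base() supplies the one-past-the-end position.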
Erasing an element from a vector changes the size of the vector, and the elements after the erased
elements are copied into the freed positions. In this example, fruit.size() becomes 4 and the
starfruit that used to be fruit[5] is now fruit[3].
Naturally, it is also possible to erase() a single element. In that case, only an iterator for that
element is needed (rather than a pair of iterators). For example,

    fruit.erase(find(fruit.begin(),fruit.end(),"starfruit"));
    fruit.erase(fruit.begin()+1);

gets rid of the starfruit and the grape, thus leaving fruit with two elements:

    fruit[]:
    apple  kiwifruit
It is also possible to insert elements into a vector. For example:
    fruit.insert(fruit.begin()+1,"cherry");
    fruit.insert(fruit.end(),"cranberry");

The new element is inserted before the position mentioned, and the elements from there to the end
are moved to make space. We get:

    fruit[]:
    apple  cherry  kiwifruit  cranberry

Note that f.insert(f.end(),x) is equivalent to f.push_back(x).
We can also insert whole sequences:

    fruit.insert(fruit.begin()+2,citrus.begin(),citrus.end());
If citrus is a container

    citrus[]:
    lemon  grapefruit  orange  lime

we get:

    fruit[]:
    apple  cherry  lemon  grapefruit  orange  lime  kiwifruit  cranberry
The elements of citrus are copied into fruit by insert(). The value of citrus is unchanged.
Clearly, insert() and erase() are more general than are operations that affect only the tail end
of a vector (§16.3.5). They can also be more expensive. For example, to make room for a new
element, insert() may have to reallocate every element to a new part of memory. If insertions into
and deletions from a container are common, maybe that container should be a list rather than a
vector. A list is optimized for insert() and erase() rather than for subscripting (§16.3.3).
Insertion into and erasure from a vector (but not a list or an associative container such as map)
potentially move elements around. Consequently, an iterator pointing to an element of a vector
may after an insert() or erase() point to another element or to no element at all. Never access an
element through an invalid iterator; the effect is undefined and quite likely disastrous. In particular,
beware of using the iterator that was used to indicate where an insertion took place; insert()
makes its first argument invalid. For example:

    void duplicate_elements(vector<string>& f)
    {
        for(vector<string>::iterator p = f.begin(); p!=f.end(); ++p) f.insert(p,*p);    // No!
    }
Just think of it (§16.5[15]). A vector implementation would move all elements, or at least all
elements after p, to make room for the new element.
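One possible repair, offered as a sketch rather than as the intended answer to the exercise, is to continue from the iterator that insert() returns instead of from the invalidated argument:

```cpp
#include <string>
#include <vector>

// Duplicate every element in place: {a,b} becomes {a,a,b,b}.
void duplicate_elements(std::vector<std::string>& f)
{
    f.reserve(2 * f.size());        // optional: avoid repeated reallocation
    std::vector<std::string>::iterator p = f.begin();
    while (p != f.end()) {
        std::string copy = *p;      // copy before the elements are shifted
        p = f.insert(p, copy);      // insert() returns a valid iterator to the new element
        p += 2;                     // skip the new element and the original
    }
}
```

Each insertion still moves the elements after p, so for long vectors it may be preferable to build the result in a second vector and swap it in.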
The operation clear() erases all elements of a container. Thus, c.clear() is a shorthand for
c.erase(c.begin(),c.end()). After c.clear(), c.size() is 0.
16.3.7 Addressing Elements [org.addressing]
Most often, the target of an erase() or insert() is a well-known place (such as begin() or
end()), the result of a search operation (such as find()), or a location found during an iteration.
In such cases, we have an iterator pointing to the relevant element. However, we often refer to
elements of a vector by subscripting. How do we get an iterator suitable as an argument for erase()
or insert() for the element with index 7 of a container c? Since that element is the 7th element
after the beginning, c.begin()+7 is a good answer. Other alternatives that may seem plausible by
analogy to arrays should be avoided. Consider:
    template<class C> void f(C& c)
    {
        c.erase(c.begin()+7);               // ok
        c.erase(&c[7]);                     // not general
        c.erase(c+7);                       // error: adding 7 to a container makes no sense
        c.erase(c.back());                  // error: c.back() is a reference, not an iterator
        c.erase(c.end()-2);                 // ok (second to last element)
        c.erase(c.rbegin()+2);              // error: vector::reverse_iterator and vector::iterator
                                            // are different types
        c.erase((c.rbegin()+2).base());     // obscure, but ok (see §19.2.5)
    }
The most tempting alternative, &c[7], actually happens to work with the obvious implementation
of vector, where c[7] refers directly to the element and its address is a valid iterator. However,
this is not true for other containers. For example, a list or map iterator is almost certainly not a
simple pointer to an element. Consequently, their iterators do not support []. Therefore, &c[7]
would be an error that the compiler catches.
The alternatives c+7 and c.back() are simple type errors. A container is not a numeric variable
to which we can add 7, and c.back() is an element with a value like "pear" that does not
identify the pear's location in the container c.
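If the same code must work for sequences whose iterators do not support +, the standard advance() function can take the place of c.begin()+7. The helper erase_nth() below is hypothetical, a sketch of one way of writing that.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Hypothetical helper: erase the element with index n from a standard sequence.
template <class C>
void erase_nth(C& c, typename C::size_type n)
{
    if (n >= c.size()) return;      // no such element: do nothing
    typename C::iterator p = c.begin();
    // p+n for random-access iterators; n single steps for others
    std::advance(p, static_cast<typename C::difference_type>(n));
    c.erase(p);
}
```

For a vector this costs no more than c.erase(c.begin()+n); for a list, finding the element takes time proportional to n, which is why list offers no subscripting in the first place.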
16.3.8 Size and Capacity [org.size]
So far, vector has been described with minimal reference to memory management. A vector grows
as needed. Usually, that is all that matters. However, it is possible to ask directly about the way a
vector uses memory, and occasionally it is worthwhile to affect it directly. The operations are:

    template <class T, class A = allocator<T> > class vector {
    public:
        // ...
        // capacity:

        size_type size() const;                     // number of elements
        bool empty() const { return size()==0; }
        size_type max_size() const;                 // size of the largest possible vector
        void resize(size_type sz, T val = T());     // added elements initialized by val

        size_type capacity() const;     // size of the memory (in number of elements) allocated
        void reserve(size_type n);      // make room for a total of n elements; don't initialize
                                        // throw a length_error if n>max_size()
        // ...
    };
At any given time, a vector holds a number of elements. This number can be obtained by calling
size() and can be changed using resize(). Thus, a user can determine the size of a vector and
change it if it seems insufficient or excessive. For example:

    class Histogram {
        vector<int> count;
    public:
        Histogram(int h) : count(max(h,8)) {}
        void record(int i);
        // ...
    };
    void Histogram::record(int i)
    {
        if (i<0) i = 0;
        if (count.size()<=i) count.resize(i+i);     // make lots of room
        count[i]++;
    }
Using resize() on a vector is very similar to using the C standard library function realloc() on a
C array allocated on the free store.
When a vector is resized to accommodate more (or fewer) elements, all of its elements may be
moved to new locations. Consequently, it is a bad idea to keep pointers to elements in a vector that
might be resized; after resize(), such pointers could point to deallocated memory. Instead, we can
keep indices. Note that push_back(), insert(), and erase() implicitly resize a vector.
In addition to the elements held, an application may keep some space for potential expansion.
A programmer who knows that expansion is likely can tell the vector implementation to reserve()
space for future expansion. For example:
    struct Link {
        Link* next;
        Link(Link* n =0) : next(n) {}
        // ...
    };

    vector<Link> v;

    void chain(size_t n)    // fill v with n Links so that each Link points to its predecessor
    {
        v.reserve(n);
        v.push_back(Link(0));
        for (int i = 1; i<n; i++) v.push_back(Link(&v[i-1]));
        // ...
    }
A call v.reserve(n) ensures that no allocation will be needed when the size of v is increased until
v.size() exceeds n.
Reserving space in advance has two advantages. First, even a simple-minded implementation
can then allocate sufficient space in one operation rather than slowly acquiring enough memory
along the way. However, in many cases there is a logical advantage that outweighs the potential
efficiency gain. The elements of a container are potentially relocated when a vector grows. Thus,
the links built between the elements of v in the previous example are guaranteed only because the
call of reserve() ensures that there are no allocations while the vector is being built. That is, in
some cases reserve() provides a guarantee of correctness in addition to whatever efficiency
advantages it gives.
That same guarantee can be used to ensure that potential memory exhaustion and potentially
expensive reallocation of elements take place at predictable times. For programs with stringent
real-time constraints, this can be of great importance.
Note that reserve() doesn't change the size of a vector. Thus, it does not have to initialize any
new elements. In both respects, it differs from resize().
In the same way as size() gives the current number of elements, capacity() gives the current
number of reserved memory slots; c.capacity()-c.size() is the number of elements that can be
inserted without causing reallocation.
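The relationship between reserve(), capacity(), and size() can be observed with a short sketch. The function name grew_without_reallocation() is invented for this illustration; it treats an unchanged capacity() as evidence that no reallocation took place.

```cpp
#include <vector>

// Hypothetical helper: append n ints to v after reserving room for them.
// Returns true if the capacity did not change while the elements were added.
bool grew_without_reallocation(std::vector<int>& v, std::vector<int>::size_type n)
{
    v.reserve(v.size() + n);        // room for the current elements plus n more
    std::vector<int>::size_type cap = v.capacity();
    for (std::vector<int>::size_type i = 0; i < n; ++i) v.push_back(int(i));
    return v.capacity() == cap;     // unchanged capacity: no reallocation
}
```

After such a call, shrinking v with resize() reduces size() but leaves capacity() where it was.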
Decreasing the size of a vector doesn't decrease its capacity. It simply leaves room for the
vector to grow into later. If you want to give memory back to the system, assign a new value to the
vector. For example:

    v = vector<int>(4,99);

A vector gets the memory it needs for its elements by calling member functions of its allocator
(supplied as a template parameter). The default allocator, called allocator (§19.4.1), uses new to
obtain storage so that it will throw bad_alloc if no more storage is obtainable. Other allocators can
use different strategies (see §19.4.2).
The reserve() and capacity() functions are unique to vector and similar compact containers.
Containers such as list do not provide equivalents.
16.3.9 Other Member Functions [org.etc]
Many algorithms, including important sort algorithms, involve swapping elements. The obvious
way of swapping (§13.5.2) simply copies elements. However, a vector is typically implemented
with a structure that acts as a handle (§13.5, §17.1.3) to the elements. Thus, two vectors can be
swapped much more efficiently by interchanging the handles; vector::swap() does that. The
time difference between this and the default swap() is orders of magnitude in important cases:

    template <class T, class A = allocator<T> > class vector {
    public:
        // ...
        void swap(vector&);

        allocator_type get_allocator() const;
    };
The get_allocator() function gives the programmer a chance to get hold of a vector's allocator
(§16.3.1, §16.3.4). Typically, the reason for this is to ensure that data from an application that is
related to a vector is allocated similarly to the vector itself (§19.4.1).
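That swap() exchanges handles rather than elements can be made visible by watching the address of an element. This is a sketch assuming the default allocator; swapped_by_handle() is a hypothetical name.

```cpp
#include <vector>

// Hypothetical helper: swap a and b, and report whether a's element storage
// was handed over to b rather than copied.
bool swapped_by_handle(std::vector<int>& a, std::vector<int>& b)
{
    if (a.empty()) { a.swap(b); return true; }  // nothing to observe
    const int* before = &a[0];      // address of a's first element
    a.swap(b);
    return &b[0] == before;         // same storage, now owned by b
}
```

No element is copied or assigned in the process, which is where the large difference in run time compared to an element-by-element swap comes from.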
16.3.10 Helper Functions [org.algo]
Two vectors can be compared using == and <:

    template <class T, class A>
    bool std::operator==(const vector<T,A>& x, const vector<T,A>& y);

    template <class T, class A>
    bool std::operator<(const vector<T,A>& x, const vector<T,A>& y);

Two vectors v1 and v2 compare equal if v1.size()==v2.size() and v1[n]==v2[n] for every
valid index n. Similarly, < is a lexicographical ordering. In other words, < for vectors could be
defined like this:
    template <class T, class A>
    inline bool std::operator<(const vector<T,A>& x, const vector<T,A>& y)
    {
        return lexicographical_compare(x.begin(),x.end(),y.begin(),y.end());    // see §18.9
    }
This means that x is less than y if the first element x[i] that is not equal to the corresponding
element y[i] is less than y[i], or x.size()<y.size() with every x[i] equal to its corresponding
y[i].
The standard library also provides !=, <=, >, and >=, with definitions that correspond to those
of == and <.
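The rule can also be spelled out by hand and checked against the library's <. The function less_than() below is a sketch written for that purpose, restricted to vector<int> for simplicity; it is not how the library is required to implement the operator.

```cpp
#include <vector>

// Hand-written lexicographical "less than" for vector<int>,
// following the description in the text.
bool less_than(const std::vector<int>& x, const std::vector<int>& y)
{
    std::vector<int>::size_type n = x.size() < y.size() ? x.size() : y.size();
    for (std::vector<int>::size_type i = 0; i < n; ++i) {
        if (x[i] < y[i]) return true;   // the first difference decides
        if (y[i] < x[i]) return false;
    }
    return x.size() < y.size();         // equal so far: the shorter one is less
}
```

For example, the sequence 1,2,3 is less than 1,2,4, and the sequence 1,2 is less than 1,2,3.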
Because swap() is a member, it is called using the v1.swap(v2) syntax. However, not every
type has a swap() member, so generic algorithms use the conventional swap(a,b) syntax. To
make that work for vectors also, the standard library provides the specialization:

    template <class T, class A> void std::swap(vector<T,A>& x, vector<T,A>& y)
    {
        x.swap(y);
    }
16.3.11 Vector<bool> [org.vector.bool]
The specialization (§13.5) vector<bool> is provided as a compact vector of bool. A bool variable
is addressable, so it takes up at least one byte. However, it is easy to implement vector<bool> so
that each element takes up only a bit.
The usual vector operations work for vector<bool> and retain their usual meanings. In particular,
subscripting and iteration work as expected. For example:
    void f(vector<bool>& v)
    {
        // iterate using subscripting
        for (vector<bool>::size_type i = 0; i<v.size(); ++i) {
            bool b;
            if (cin >> b) v[i] = b;   // v[i] is not a bool lvalue, so read into a bool and assign
        }

        // iterate using iterators
        typedef vector<bool>::const_iterator VI;
        for (VI p = v.begin(); p!=v.end(); ++p) cout<<*p;
    }
To achieve this, an implementation must simulate addressing of a single bit. Since a pointer cannot
address a unit of memory smaller than a byte, vector<bool>::iterator cannot be a pointer. In
particular, one cannot rely on bool* as an iterator for a vector<bool>:
    void f(vector<bool>& v)
    {
        bool* p = v.begin();    // error: type mismatch
        // ...
    }
A technique for addressing a single bit is outlined in §17.5.3.
The library also provides bitset as a set of Boolean values with Boolean set operations
(§17.5.3).
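As a small sketch of what still works despite the simulated addressing, the two helpers below
(count_true and flip_all, both hypothetical names) use only iterators and subscripting and never
form a bool*.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts the true elements using const iterators.
std::size_t count_true(const std::vector<bool>& v)
{
    std::size_t n = 0;
    for (std::vector<bool>::const_iterator p = v.begin(); p != v.end(); ++p)
        if (*p) ++n;                  // *p converts to bool
    assert(n <= v.size());
    return n;
}

// Hypothetical helper: inverts every element using subscripting.
void flip_all(std::vector<bool>& v)
{
    for (std::vector<bool>::size_type i = 0; i < v.size(); ++i)
        v[i] = !v[i];                 // v[i] yields an object that can be read and assigned to
}
```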
16.4 Advice [org.advice]
[1]  Use standard library facilities to maintain portability; §16.1.
[2]  Don’t try to redefine standard library facilities; §16.1.2.
[3]  Don’t believe that the standard library is best for everything.
[4]  When building a new facility, consider whether it can be presented within the framework
     offered by the standard library; §16.3.
[5]  Remember that standard library facilities are defined in namespace std; §16.1.2.
[6]  Declare standard library facilities by including its header, not by explicit declaration; §16.1.2.
[7]  Take advantage of late abstraction; §16.2.1.
[8]  Avoid fat interfaces; §16.2.2.
[9]  Prefer algorithms with reverse iterators over explicit loops dealing with reverse order; §16.3.2.
[10] Use base() to extract an iterator from a reverse_iterator; §16.3.2.
[11] Pass containers by reference; §16.3.4.
[12] Use iterator types, such as list<char>::iterator, rather than pointers to refer to elements of a
     container; §16.3.1.
[13] Use const iterators where you don’t need to modify the elements of a container; §16.3.1.
[14] Use at(), directly or indirectly, if you want range checking; §16.3.3.
[15] Use push_back() or resize() on a container rather than realloc() on an array; §16.3.5.
[16] Don’t use iterators into a resized vector; §16.3.8.
[17] Use reserve() to avoid invalidating iterators; §16.3.8.
[18] When necessary, use reserve() to make performance predictable; §16.3.8.
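The advice on reverse iterators and base() can be illustrated with a short sketch. The function
find_last is a hypothetical helper written for this example: it searches backwards with find()
and then converts the reverse_iterator to an ordinary iterator.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: iterator to the last element equal to x, or v.end() if there is none.
std::vector<int>::iterator find_last(std::vector<int>& v, int x)
{
    std::vector<int>::reverse_iterator ri = std::find(v.rbegin(), v.rend(), x);
    if (ri == v.rend()) return v.end();
    std::vector<int>::iterator i = ri.base();   // base() refers one beyond the element ri refers to
    --i;
    assert(*i == x);
    return i;
}
```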
16.5 Exercises [org.exercises]
The solutions to several exercises for this chapter can be found by looking at the source text of an
implementation of the standard library. Do yourself a favor: try to find your own solutions before
looking to see how your library implementer approached the problems.
1. (∗1.5) Create a vector<char> containing the letters of the alphabet in order. Print the elements
   of that vector in order and in reverse order.
2. (∗1.5) Create a vector<string> and read a list of names of fruits from cin into it. Sort the list
   and print it.
3. (∗1.5) Using the vector from §16.5[2], write a loop to print the names of all fruits with the
   initial letter a.
4. (∗1) Using the vector from §16.5[2], write a loop to delete all fruits with the initial letter a.
5. (∗1) Using the vector from §16.5[2], write a loop to delete all citrus fruits.
6. (∗1.5) Using the vector from §16.5[2], write a loop to delete all fruits that you don’t like.
7. (∗2) Complete the Vector, List, and Itor classes from §16.2.1.
8. (∗2.5) Given an Itor class, consider how to provide iterators for forwards iteration, backwards
   iteration, iteration over a container that might change during an iteration, and iteration over an
   immutable container. Organize this set of containers so that a user can interchangeably use
   iterators that provide sufficient functionality for an algorithm. Minimize replication of effort in
   the implementation of the containers. What other kinds of iterators might a user need? List the
   strengths and weaknesses of your approach.
9. (∗2) Complete the Container, Vector, and List classes from §16.2.2.
10. (∗2.5) Generate 10,000 uniformly distributed random numbers in the range 0 to 1,023 and store
   them in (a) a standard library vector, (b) a Vector from §16.5[7], and (c) a Vector from
   §16.5[9]. In each case, calculate the arithmetic mean of the elements of the vector (as if you
   didn’t know it already). Time the resulting loops. Estimate, measure, and compare the memory
   consumption for the three styles of vectors.
11. (∗1.5) Write an iterator to allow Vector from §16.2.2 to be used as a container in the style of
   §16.2.1.
12. (∗1.5) Write a class derived from Container to allow Vector from §16.2.1 to be used as a
   container in the style of §16.2.2.
13. (∗2) Write classes to allow Vector from §16.2.1 and Vector from §16.2.2 to be used as standard
   containers.
14. (∗2) Write a template that implements a container with the same member functions and member
   types as the standard vector for an existing (nonstandard, non-student-exercise) container type.
   Do not modify the (pre)existing container type. How would you deal with functionality offered
   by the nonstandard vector but not by the standard vector?
15. (∗1.5) Outline the possible behavior of duplicate_elements() from §16.3.6 for a
   vector<string> with the three elements don't do this.
17
Standard Containers
Now is a good time to put your work
on a firm theoretical foundation.
– Sam Morgan
Standard containers – container and operation summaries – efficiency – representation –
element requirements – sequences – vector – list – deque – adapters – stack – queue –
priority_queue – associative containers – map – comparisons – multimap – set – multiset –
‘‘almost containers’’ – bitset – arrays – hash tables – implementing a hash_map – advice –
exercises.
17.1 Standard Containers [cont.intro]
The standard library defines two kinds of containers: sequences and associative containers. The
sequences are all much like vector (§16.3). Except where otherwise stated, the member types and
functions mentioned for vector can also be used for any other container and produce the same
effect. In addition, associative containers provide element access based on keys (§3.7.4).
Built-in arrays (§5.2), strings (Chapter 20), valarrays (§22.4), and bitsets (§17.5.3) hold
elements and can therefore be considered containers. However, these types are not fully-developed
standard containers. If they were, that would interfere with their primary purpose. For example, a
built-in array cannot both hold its own size and remain layout-compatible with C arrays.
A key idea for the standard containers is that they should be logically interchangeable wherever
reasonable. The user can then choose between them based on efficiency concerns and the need for
specialized operations. For example, if lookup based on a key is common, a map (§17.4.1) can be
used. On the other hand, if general list operations dominate, a list (§17.2.2) can be used. If many
additions and removals of elements occur at the ends of the container, a deque (double-ended
queue, §17.2.3), a stack (§17.3.1), or a queue (§17.3.2) should be considered. In addition, a user
can design additional containers to fit into the framework provided by the standard containers
(§17.6). By default, a vector (§16.3) should be used; it will be implemented to perform well over a
wide range of uses.
The idea of treating different kinds of containers – and more generally all kinds of information
sources – in uniform ways leads to the notion of generic programming (§2.7.2, §3.8). The standard
library provides many generic algorithms to support this idea (Chapter 18). Such algorithms can
save the programmer from having to deal directly with details of individual containers.
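A minimal sketch of this uniform treatment follows. The names sum and demo are hypothetical; sum
assumes only that the container holds ints and provides begin(), end(), and const_iterator.

```cpp
#include <cassert>
#include <list>
#include <numeric>
#include <set>
#include <vector>

// Hypothetical helper: adds up the elements of any container of int.
template <class C>
int sum(const C& c)
{
    int total = 0;
    for (typename C::const_iterator p = c.begin(); p != c.end(); ++p)
        total += *p;
    assert(total == std::accumulate(c.begin(), c.end(), 0));   // agrees with the generic algorithm
    return total;
}

// Hypothetical demonstration: the same function serves a vector, a list, and a set.
int demo()
{
    int a[] = { 1, 2, 3 };
    std::vector<int> v(a, a + 3);
    std::list<int>   l(a, a + 3);
    std::set<int>    s(a, a + 3);
    return sum(v) + sum(l) + sum(s);
}
```

The hand-written loop is shown only to make the container requirements visible; in real code the
call of accumulate() alone would do.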
17.1.1 Operations Summary [cont.operations]
This section lists the common and almost common members of the standard containers. For more
details, read your standard headers (<vector>, <list>, <map>, etc.; §16.1.2).
____________________________________________________________________________________
Member Types (§16.3.1)
____________________________________________________________________________________
value_type                Type of element.
allocator_type            Type of memory manager.
size_type                 Type of subscripts, element counts, etc.
difference_type           Type of difference between iterators.
iterator                  Behaves like value_type*.
const_iterator            Behaves like const value_type*.
reverse_iterator          View container in reverse order; like value_type*.
const_reverse_iterator    View container in reverse order; like const value_type*.
reference                 Behaves like value_type&.
const_reference           Behaves like const value_type&.
key_type                  Type of key (for associative containers only).
mapped_type               Type of mapped_value (for associative containers only).
key_compare               Type of comparison criterion (for associative containers only).
____________________________________________________________________________________
A container can be viewed as a sequence either in the order defined by the container’s iterator or
in reverse order. For an associative container, the order is based on the container’s comparison
criterion (by default <):
____________________________________________________________________
Iterators (§16.3.2)
____________________________________________________________________
begin()     Points to first element.
end()       Points to one-past-last element.
rbegin()    Points to first element of reverse sequence.
rend()      Points to one-past-last element of reverse sequence.
____________________________________________________________________
Some elements can be accessed directly:
____________________________________________________________
Element Access (§16.3.3)
____________________________________________________________
front()     First element.
back()      Last element.
[]          Subscripting, unchecked access (not for list).
at()        Subscripting, checked access (not for list).
____________________________________________________________
Most containers provide efficient operations at the end (back) of their sequence of elements. In
addition, lists and deques provide the equivalent operations on the start (front) of their sequences:
____________________________________________________________________
Stack and Queue Operations (§16.3.5, §17.2.2.2)
____________________________________________________________________
push_back()     Add to end.
pop_back()      Remove last element.
push_front()    Add new first element (for list and deque only).
pop_front()     Remove first element (for list and deque only).
____________________________________________________________________
Containers provide list operations:
____________________________________________________________________
List Operations (§16.3.6)
____________________________________________________________________
insert(p,x)             Add x before p.
insert(p,n,x)           Add n copies of x before p.
insert(p,first,last)    Add elements from [first:last[ before p.
erase(p)                Remove element at p.
erase(first,last)       Erase [first:last[.
clear()                 Erase all elements.
____________________________________________________________________
All containers provide operations related to the number of elements and a few other operations:
______________________________________________________________________________
Other Operations (§16.3.8, §16.3.9, §16.3.10)
______________________________________________________________________________
size()             Number of elements.
empty()            Is the container empty?
max_size()         Size of the largest possible container.
capacity()         Space allocated for vector (for vector only).
reserve()          Reserve space for future expansion (for vector only).
resize()           Change size of container (for vector, list, and deque only).
swap()             Swap elements of two containers.
get_allocator()    Get a copy of the container’s allocator.
==                 Is the content of two containers the same?
!=                 Is the content of two containers different?
<                  Is one container lexicographically before another?
______________________________________________________________________________
Containers provide a variety of constructors and assignment operations:
________________________________________________________________________________
Constructors, etc. (§16.3.4)
________________________________________________________________________________
container()              Empty container.
container(n)             n elements default value (not for associative containers).
container(n,x)           n copies of x (not for associative containers).
container(first,last)    Initial elements from [first:last[.
container(x)             Copy constructor; initial elements from container x.
~container()             Destroy the container and all of its elements.
________________________________________________________________________________
__________________________________________________________________________
Assignments (§16.3.4)
__________________________________________________________________________
operator=(x)          Copy assignment; elements from container x.
assign(n,x)           Assign n copies of x (not for associative containers).
assign(first,last)    Assign from [first:last[.
__________________________________________________________________________
Associative containers provide lookup based on keys:
____________________________________________________________________________________
Associative Operations (§17.4.1)
____________________________________________________________________________________
operator[](k)     Access the element with key k (for containers with unique keys).
find(k)           Find the element with key k.
lower_bound(k)    Find the first element with key k.
upper_bound(k)    Find the first element with key greater than k.
equal_range(k)    Find the lower_bound and upper_bound of elements with key k.
key_comp()        Copy of the key comparison object.
value_comp()      Copy of the mapped_value comparison object.
____________________________________________________________________________________
In addition to these common operations, most containers provide a few specialized operations.
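The relation between lower_bound(), upper_bound(), and equal_range() can be seen in a short
sketch. The typedef Mmap and the function count_key are names assumed for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

typedef std::multimap<std::string, int> Mmap;

// Hypothetical helper: number of entries with key k, found by walking equal_range(k).
std::size_t count_key(const Mmap& m, const std::string& k)
{
    std::pair<Mmap::const_iterator, Mmap::const_iterator> r = m.equal_range(k);
    assert(r.first == m.lower_bound(k) && r.second == m.upper_bound(k));
    std::size_t n = 0;
    for (Mmap::const_iterator p = r.first; p != r.second; ++p) ++n;
    return n;
}
```

For a key that is not present, both iterators of the pair refer to the same position, so the loop
body is never executed.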
17.1.2 Container Summary [cont.summary]
The standard containers can be summarized like this:
_____________________________________________________________________________________
Standard Container Operations
_____________________________________________________________________________________
                  []           List          Front         Back (Stack)   Iterators
                               Operations    Operations    Operations
                  §16.3.3      §16.3.6       §17.2.2.2     §16.3.5        §19.2.1
                  §17.4.1.3    §20.3.9       §20.3.9       §20.3.12
_____________________________________________________________________________________
vector            const        O(n)+                       const+         Ran
list                           const         const         const          Bi
deque             const        O(n)          const         const          Ran
_____________________________________________________________________________________
stack                                                      const+
queue                                        const         const+
priority_queue                               O(log(n))     O(log(n))
_____________________________________________________________________________________
map               O(log(n))    O(log(n))+                                 Bi
multimap                       O(log(n))+                                 Bi
set                            O(log(n))+                                 Bi
multiset                       O(log(n))+                                 Bi
_____________________________________________________________________________________
string            const        O(n)+         O(n)+         const+         Ran
array             const                                                   Ran
valarray          const                                                   Ran
bitset            const
_____________________________________________________________________________________
In the iterators column, Ran means random-access iterator and Bi means bidirectional iterator; the
operations for a bidirectional iterator are a subset of those of a random-access iterator (§19.2.1).
Other entries are measures of the efficiency of the operations. A const entry means the operation
takes an amount of time that does not depend on the number of elements in the container. Another
conventional notation for constant time is O(1). An O(n) entry means the operation takes time
proportional to the number of elements involved. A + suffix indicates that occasionally a significant
extra cost is incurred. For example, inserting an element into a list has a fixed cost (so it is listed as
const), whereas the same operation on a vector involves moving the elements following the
insertion point (so it is listed as O(n)). Occasionally, all elements must be relocated (so I added a +).
The ‘‘big O’’ notation is conventional. I added the + for the benefit of programmers who care
about predictability in addition to average performance. A conventional term for O(n)+ is
amortized linear time.
Naturally, if a constant is large it can dwarf a small cost proportional to the number of elements.
However, for large data structures const tends to mean ‘‘cheap,’’ O(n) to mean ‘‘expensive,’’ and
O(log(n)) to mean ‘‘fairly cheap.’’ For even moderately large values of n, O(log(n)) is closer
to constant time than to O(n). People who care about cost must take a closer look. In particular,
they must understand what elements are counted to get the n. No basic operation is ‘‘very
expensive,’’ that is, O(n*n) or worse.
Except for string, the measures of costs listed here reflect requirements in the standard. The
string estimates are my assumptions.
These measures of complexity and cost are upper bounds. The measures exist to give users
some guidance as to what they can expect from implementations. Naturally, implementers will try
to do better in important cases.
17.1.3 Representation [cont.rep]
The standard doesn’t prescribe a particular representation for each standard container. Instead, the
standard specifies the container interfaces and some complexity requirements. Implementers will
choose appropriate and often cleverly optimized implementations to meet the general requirements.
A container will almost certainly be represented by a data structure holding the elements accessed
through a handle holding size and capacity information. For a vector, the element data structure is
most likely an array:
[Figure: vector: a handle holding size and rep; rep refers to the elements, which are followed by
extra space.]
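The distinction between the elements and the extra space of a vector can be observed through
size() and capacity(). The sketch below uses a hypothetical helper, has_room_for, and makes no
assumption about how a particular implementation lays out its handle.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: true if n more elements can be appended without a reallocation.
bool has_room_for(const std::vector<int>& v, std::vector<int>::size_type n)
{
    assert(v.size() <= v.capacity());     // the elements always fit in the allocated space
    return v.capacity() - v.size() >= n;  // what is left over is the extra space
}
```

After v.reserve(10) on an empty vector, has_room_for(v,10) holds, and ten calls of push_back()
leave the elements where they are.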
Similarly, a list is most likely represented by a set of links pointing to the elements:
[Figure: list: a handle (rep) referring to a chain of links, each pointing to an element.]
A map is most likely implemented as a (balanced) tree of nodes pointing to (key,value) pairs:
[Figure: map: a handle (rep) referring to a tree of nodes, each pointing to a (key,value) pair.]
A string might be implemented as outlined in §11.12 or maybe as a sequence of arrays holding a
few characters each:
[Figure: string: a handle (rep) referring to segment descriptors, which refer to the string
segments.]
17.1.4 Element Requirements [cont.elem]
Elements in a container are copies of the objects inserted. Thus, to be an element of a container, an
object must be of a type that allows the container implementation to copy it. The container may
copy elements using a copy constructor or an assignment; in either case, the result of the copy must
be an equivalent object. This roughly means that any test for equality that you can devise on the
value of the objects must deem the copy equal to the original. In other words, copying an element
must work much like an ordinary copy of built-in types (including pointers). For example,
    X& X::operator=(const X& a)     // proper assignment operator
    {
        // copy all of a's members to *this
        return *this;
    }
makes X acceptable as an element type for a standard container, but
    void Y::operator=(const Y& a)   // improper assignment operator
    {
        // zero out all of a's members
    }
renders Y unsuitable because Y’s assignment has neither the conventional return type nor the
conventional semantics.
Some violations of the rules for standard containers can be detected by a compiler, but others
cannot and might then cause unexpected behavior. For example, a copy operation that throws an
exception might leave a partially copied element behind. It could even leave the container itself in
a state that could cause trouble later. Such copy operations are themselves bad design (§14.4.6.1).
When copying elements isn’t right, the alternative is to put pointers to objects into containers
instead of the objects themselves. The most obvious example is polymorphic types (§2.5.4,
§12.2.6). For example, we use vector<Shape*> rather than vector<Shape> to preserve polymorphic
behavior.
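A minimal sketch of a container of pointers follows. Shape, Circle, and Square are simplified
stand-ins invented for this example (they are not the classes from §2.5.4 or §12.2.6), and
total_corners is a hypothetical function.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in classes, for illustration only.
struct Shape {
    virtual ~Shape() {}
    virtual int corners() const = 0;
};
struct Circle : Shape { int corners() const { return 0; } };
struct Square : Shape { int corners() const { return 4; } };

// Sums corners() over all shapes; each call is resolved at run time through the pointer.
int total_corners(const std::vector<Shape*>& v)
{
    int n = 0;
    for (std::vector<Shape*>::size_type i = 0; i < v.size(); ++i) {
        assert(v[i] != 0);      // the container copies pointers; it does not own the objects
        n += v[i]->corners();
    }
    return n;
}
```

Note that copying the vector copies only the pointers, so the user remains responsible for the
lifetime of the objects pointed to.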
17.1.4.1 Comparisons [cont.comp]
Associative containers require that their elements can be ordered. So do many operations that can
be applied to containers (for example sort()). By default, the < operator is used to define the
order. If < is not suitable, the programmer must provide an alternative (§17.4.1.5, §18.4.2). The
ordering criterion must define a strict weak ordering. Informally, this means that both less-than
and equality must be transitive. That is, for an ordering criterion cmp:
[1] cmp(x,x) is false.
[2] If cmp(x,y) and cmp(y,z), then cmp(x,z).
[3] Define equiv(x,y) to be !(cmp(x,y)||cmp(y,x)). If equiv(x,y) and equiv(y,z),
    then equiv(x,z).
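The three rules can be checked mechanically for a given set of sample values. The sketch below is
a brute-force test, not a proof; is_strict_weak is a hypothetical name, and equiv follows the
definition in rule [3].

```cpp
#include <cstddef>
#include <vector>

// equiv(x,y) as defined in rule [3]: neither value is less than the other.
template <class T, class Cmp>
bool equiv(const T& x, const T& y, Cmp cmp) { return !(cmp(x, y) || cmp(y, x)); }

// Hypothetical checker: tests rules [1]-[3] for every combination of the sample values in s.
template <class T, class Cmp>
bool is_strict_weak(const std::vector<T>& s, Cmp cmp)
{
    const std::size_t n = s.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (cmp(s[i], s[i])) return false;                              // rule [1]
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k) {
                if (cmp(s[i], s[j]) && cmp(s[j], s[k]) && !cmp(s[i], s[k]))
                    return false;                                       // rule [2]
                if (equiv(s[i], s[j], cmp) && equiv(s[j], s[k], cmp) && !equiv(s[i], s[k], cmp))
                    return false;                                       // rule [3]
            }
    }
    return true;
}
```

With the samples 0, 1, 2, 3, the criterion a<b passes, a<=b fails rule [1], and a+1<b fails rule
[3] because 0 and 1 are equivalent, 1 and 2 are equivalent, but 0 and 2 are not.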
Consider:
    template<class Ran> void sort(Ran first, Ran last);                       // use < for comparison
    template<class Ran, class Cmp> void sort(Ran first, Ran last, Cmp cmp);   // use cmp
The first version uses < and the second uses a user-supplied comparison cmp. For example, we
might decide to sort fruit using a comparison that isn’t case-sensitive. We do that by defining a
function object (§11.9, §18.4) that does the comparison when invoked for a pair of strings:
    class Nocase {      // case-insensitive string compare
    public:
        bool operator()(const string&, const string&) const;
    };

    bool Nocase::operator()(const string& x, const string& y) const
        // return true if x is lexicographically less than y, not taking case into account
    {
        string::const_iterator p = x.begin();
        string::const_iterator q = y.begin();

        while (p!=x.end() && q!=y.end() && toupper(*p)==toupper(*q)) {
            ++p;
            ++q;
        }
        if (p == x.end()) return q != y.end();
        if (q == y.end()) return false;     // y is a proper prefix of x; q must not be dereferenced
        return toupper(*p) < toupper(*q);
    }
We can call sort() using that comparison criterion. For example, given:
    fruit:    apple    pear    Apple    Pear    lemon
Sorting using sort(fruit.begin(),fruit.end(),Nocase()) would yield:
    fruit:    Apple    apple    lemon    Pear    pear
whereas plain sort(fruit.begin(),fruit.end()) would give:
    fruit:    Apple    Pear    apple    lemon    pear
assuming a character set in which uppercase letters precede lowercase letters.
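The fruit example can be run as the following sketch. Two things here are assumptions rather than
the text's own code: the helper sorted_nocase is a hypothetical name, and stable_sort() is used
in place of sort() so that the relative order of equivalent strings, such as apple and Apple, is
predictable. Plain sort() may place equivalent strings in either order.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Case-insensitive less-than, following the Nocase class in the text.
struct Nocase {
    // toupper() requires a value representable as unsigned char
    static int up(char c) { return std::toupper(static_cast<unsigned char>(c)); }

    bool operator()(const std::string& x, const std::string& y) const
    {
        std::string::const_iterator p = x.begin();
        std::string::const_iterator q = y.begin();
        while (p != x.end() && q != y.end() && up(*p) == up(*q)) { ++p; ++q; }
        if (p == x.end()) return q != y.end();
        if (q == y.end()) return false;
        return up(*p) < up(*q);
    }
};

// Hypothetical helper: returns a copy of v sorted without regard to case.
std::vector<std::string> sorted_nocase(std::vector<std::string> v)
{
    std::stable_sort(v.begin(), v.end(), Nocase());   // equivalent strings keep their input order
    return v;
}
```

Given apple, pear, Apple, Pear, lemon, this yields apple, Apple, lemon, pear, Pear, which differs
from the result shown for sort() only in the order of equivalent strings.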
Beware that < on C-style strings (that is, char*) does not define lexicographical order
(§13.5.2). Thus, associative containers will not work as most people would expect them to when
C-style strings are used as keys. To make them work properly, a less-than operation that compares
based on lexicographical order must be used. For example:
    struct Cstring_less {
        bool operator()(const char* p, const char* q) const { return strcmp(p,q)<0; }
    };

    map<char*,int,Cstring_less> m;   // map that uses strcmp() to compare const char* keys
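The effect of Cstring_less can be seen by looking up a key through a different pointer to the
same characters. In this sketch the typedef Cmap and the function lookup are hypothetical, and the
key type is written as const char* so that string literals can be used as keys.

```cpp
#include <cassert>
#include <cstring>
#include <map>

struct Cstring_less {
    bool operator()(const char* p, const char* q) const { return std::strcmp(p, q) < 0; }
};

typedef std::map<const char*, int, Cstring_less> Cmap;

// Hypothetical helper: finds key by its characters, not by its address; -1 means not found.
int lookup(const Cmap& m, const char* key)
{
    assert(key != 0);
    Cmap::const_iterator p = m.find(key);
    return p == m.end() ? -1 : p->second;
}
```

With the default comparison, the same lookup would compare addresses and would be unlikely to
find an entry inserted through a different pointer.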
17.1.4.2 Other Relational Operators [cont.relops]
By default, containers and algorithms use < when they need to do a less-than comparison. When
the default isn’t right, a programmer can supply a comparison criterion. However, no mechanism is
provided for also passing an equality test. Instead, when a programmer supplies a comparison cmp,
equality is tested using two comparisons. For example:
    if (x == y)                     // not done where the user supplied a comparison
    if (!cmp(x,y) && !cmp(y,x))     // done where the user supplied a comparison cmp
This saves us from having to add an equality parameter to every associative container and most
algorithms. It may look expensive, but the library doesn’t check for equality very often, and in
50% of the cases, only a single call of cmp() is needed.
Using an equivalence relationship defined by less-than (by default <) rather than equality (by
default ==) also has practical uses. For example, associative containers (§17.4) compare keys
using an equivalence test !(cmp(x,y)||cmp(y,x)). This implies that equivalent keys need not
be equal. For example, a multimap (§17.4.2) that uses case-insensitive comparison as its
comparison criterion will consider the strings Last, last, lAst, laSt, and lasT equivalent, even
though == for strings deems them different. This allows us to ignore differences we consider
insignificant when sorting.
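The difference between equality and equivalence can be demonstrated with a set that uses a
case-insensitive criterion. In this sketch, Nocase is a compact stand-in for the class shown
earlier, and char_less_nocase and equiv are hypothetical helper names.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Compares two characters without regard to case.
inline bool char_less_nocase(char a, char b)
{
    return std::toupper(static_cast<unsigned char>(a)) < std::toupper(static_cast<unsigned char>(b));
}

// Case-insensitive less-than for strings (a compact stand-in for Nocase in the text).
struct Nocase {
    bool operator()(const std::string& x, const std::string& y) const
    {
        return std::lexicographical_compare(x.begin(), x.end(), y.begin(), y.end(), char_less_nocase);
    }
};

// Equivalence as tested when a comparison cmp is supplied: neither value is less than the other.
template <class T, class Cmp>
bool equiv(const T& x, const T& y, Cmp cmp)
{
    return !(cmp(x, y) || cmp(y, x));
}
```

A set<string,Nocase> that already holds Last refuses to insert laST, because the two keys are
equivalent even though == deems them different.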
Given < and ==, we can easily construct the rest of the usual comparisons. The standard library
defines them in the namespace std::rel_ops and presents them in <utility>:
    template<class T> bool rel_ops::operator!=(const T& x, const T& y) { return !(x==y); }
    template<class T> bool rel_ops::operator>(const T& x, const T& y) { return y<x; }
    template<class T> bool rel_ops::operator<=(const T& x, const T& y) { return !(y<x); }
    template<class T> bool rel_ops::operator>=(const T& x, const T& y) { return !(x<y); }
Placing these operations in rel_ops ensures that they are easy to use when needed, yet they don’t
get created implicitly unless extracted from that namespace:
    void f()
    {
        using namespace std;
        // !=, >, etc., not generated by default
    }

    void g()
    {
        using namespace std;
        using namespace std::rel_ops;
        // !=, >, etc., generated by default
    }
The !=, etc., operations are not defined directly in std because they are not always needed and
sometimes their definition would interfere with user code. For example, if I were writing a generalized math library, I would want my relational operators and not the standard library versions.
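A short sketch of rel_ops in use follows. Money is a hypothetical type that defines only == and
<, and at_least and differ are hypothetical functions. Note that std::rel_ops has been
deprecated since C++20, so a newer compiler may issue a warning for this code.

```cpp
#include <utility>

// Hypothetical type that supplies only == and <.
struct Money {
    long cents;
    explicit Money(long c) : cents(c) {}
};
inline bool operator==(const Money& a, const Money& b) { return a.cents == b.cents; }
inline bool operator<(const Money& a, const Money& b)  { return a.cents < b.cents; }

// True if a is at least b, written with >= from rel_ops.
bool at_least(const Money& a, const Money& b)
{
    using namespace std::rel_ops;   // makes !=, >, <=, and >= available in this scope
    return a >= b;                  // means !(a<b)
}

// True if a and b differ, written with !=.
bool differ(const Money& a, const Money& b)
{
    using namespace std::rel_ops;
    return a != b;                  // means !(a==b)
}
```

Because the using-directives are local to the two functions, code elsewhere is unaffected by the
generated operators.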
17.2 Sequences [cont.seq]
Sequences follow the pattern described for vector (§16.3). The fundamental sequences provided by
the standard library are:
    vector    list    deque
From these,
    stack    queue    priority_queue
are created by providing suitable interfaces. These sequences are called container adapters,
sequence adapters, or simply adapters (§17.3).
17.2.1 Vector [cont.vector]
The standard vector is described in detail in §16.3. The facilities for reserving space (§16.3.8) are
unique to vector. By default, subscripting using [] is not range checked. If a check is needed, use
at() (§16.3.3), a checked vector (§3.7.1), or a checked iterator (§19.3). A vector provides
random-access iterators (§19.2.1).
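The difference between [] and at() can be sketched as follows. The function get_or is a
hypothetical helper; it relies only on at() throwing out_of_range for an invalid index.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns v[i], or fallback if i is not a valid index.
int get_or(const std::vector<int>& v, std::vector<int>::size_type i, int fallback)
{
    try {
        return v.at(i);                 // checked: throws std::out_of_range for a bad index
    }
    catch (const std::out_of_range&) {
        return fallback;
    }
}
```

Writing v[i] with the same bad index would not be checked, and the result would be undefined.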
17.2.2 List [cont.list]
A list is a sequence optimized for insertion and deletion of elements. Compared to vector (and
deque; §17.2.3), subscripting would be painfully slow, so subscripting is not provided for list.
Consequently, list provides bidirectional iterators (§19.2.1) rather than random-access iterators.
This implies that a list will typically be implemented using some form of a doubly-linked list (see
§17.8[16]).
A list provides all of the member types and operations offered by vector (§16.3), with the
exceptions of subscripting, capacity(), and reserve():
    template <class T, class A = allocator<T> > class std::list {
    public:
        // types and operations like vector's, except [], at(), capacity(), and reserve()
        // ...
    };
17.2.2.1 Splice, Sort, and Merge [cont.splice]
In addition to the general sequence operations, list provides several operations specially suited for
list manipulation:
    template <class T, class A = allocator<T> > class list {
    public:
        // ...
        // list-specific operations:
        void splice(iterator pos, list& x);                // move all elements from x to before
                                                           // pos in this list without copying.
        void splice(iterator pos, list& x, iterator p);    // move *p from x to before
                                                           // pos in this list without copying.
        void splice(iterator pos, list& x, iterator first, iterator last);
        void merge(list&);                                 // merge sorted lists
        template <class Cmp> void merge(list&, Cmp);
        void sort();
        template <class Cmp> void sort(Cmp);
        // ...
    };
These list operations are all stable; that is, they preserve the relative order of elements that have
equivalent values.
The fruit examples from §16.3.6 work with fruit defined to be a list. In addition, we can
extract elements from one list and insert them into another by a single "splice" operation. Given:
    fruit:
        apple pear
    citrus:
        orange grapefruit lemon
we can splice the orange from citrus into fruit like this:
    list<string>::iterator p = find_if(fruit.begin(),fruit.end(),initial('p'));
    fruit.splice(p,citrus,citrus.begin());
The effect is to remove the first element from citrus (citrus.begin()) and place it just before the
first element of fruit with the initial letter p, thereby giving:
    fruit:
        apple orange pear
    citrus:
        grapefruit lemon
Note that splice() doesn't copy elements the way insert() does (§16.3.6). It simply modifies the
list data structures that refer to the element.
In addition to splicing individual elements and ranges, we can splice() all elements of a list:
    fruit.splice(fruit.begin(),citrus);
This yields:
    fruit:
        grapefruit lemon apple orange pear
    citrus:
        <empty>
Each version of splice() takes as its second argument the list from which elements are taken. This
allows elements to be removed from their original list. An iterator alone wouldn't allow that
because there is no general way to determine the container holding an element given only an iterator to that element (§18.6).
Naturally, an iterator argument must be a valid iterator for the list into which it is supposed to
point. That is, it must point to an element of that list or be the list's end(). If not, the result is
undefined and possibly disastrous. For example:
    list<string>::iterator p = find_if(fruit.begin(),fruit.end(),initial('p'));
    fruit.splice(p,citrus,citrus.begin());    // ok
    fruit.splice(p,citrus,fruit.begin());     // error: fruit.begin() doesn't point into citrus
    citrus.splice(p,fruit,fruit.begin());     // error: p doesn't point into citrus
The first splice() is ok even though citrus is empty.
A merge() combines two sorted lists by removing the elements from one list and entering
them into the other while preserving order. For example,
    f1:
        apple quince pear
    f2:
        lemon grapefruit orange lime
can be sorted and merged like this:
    f1.sort();
    f2.sort();
    f1.merge(f2);
This yields:
    f1:
        apple grapefruit lemon lime orange pear quince
    f2:
        <empty>
If one of the lists being merged is not sorted, merge() will still produce a list containing the union
of elements of the two lists. However, there are no guarantees made about the order of the result.
Like splice(), merge() refrains from copying elements. Instead, it removes elements from
the source list and splices them into the target list. After an x.merge(y), the y list is empty.
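The splice and merge operations can be tried out with a small self-checking sketch. The names Fruit, make_list(), and splice_merge_example() are assumptions made for this illustration only, and the iterator is positioned by hand rather than with the initial() predicate used in the text.

```cpp
#include <cassert>
#include <list>
#include <string>

typedef std::list<std::string> Fruit;   // illustrative alias, not a library name

// Build a list from a null-terminated array of C strings (hypothetical helper).
Fruit make_list(const char* const* names)
{
    Fruit result;
    for (; *names; ++names) result.push_back(*names);
    return result;
}

void splice_merge_example()
{
    const char* a[] = { "apple", "pear", 0 };
    const char* b[] = { "orange", "grapefruit", "lemon", 0 };
    Fruit fruit = make_list(a);
    Fruit citrus = make_list(b);

    // Move "orange" from citrus to just before "pear" in fruit.
    Fruit::iterator p = fruit.begin();
    ++p;                                        // p now refers to "pear"
    fruit.splice(p, citrus, citrus.begin());
    const char* e1[] = { "apple", "orange", "pear", 0 };
    assert(fruit == make_list(e1));
    assert(citrus.size() == 2);                 // "grapefruit" and "lemon" remain

    // Sort both lists, then merge citrus into fruit; citrus ends up empty.
    fruit.sort();
    citrus.sort();
    fruit.merge(citrus);
    const char* e2[] = { "apple", "grapefruit", "lemon", "orange", "pear", 0 };
    assert(fruit == make_list(e2));
    assert(citrus.empty());
}
```

No string is copied by splice() or merge() here; only the links of the two lists change.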
17.2.2.2 Front Operations [cont.front]
Operations that refer to the first element of a list are provided to complement the operations referring to the last element provided by every sequence (§16.3.6):
    template <class T, class A = allocator<T> > class list {
    public:
        // ...
        // element access:
        reference front();                // reference to first element
        const_reference front() const;
        void push_front(const T&);        // add new first element
        void pop_front();                 // remove first element
        // ...
    };
The first element of a container is called its front. For a list, front operations are as efficient and
convenient as back operations (§16.3.5). When there is a choice, back operations should be preferred over front operations. Code written using back operations can be used for a vector as well as
for a list. So if there is a chance that the code written using a list will ever evolve into a generic
algorithm applicable to a variety of containers, it is best to prefer the more widely available back
operations. This is a special case of the rule that to achieve maximal flexibility, it is usually wise to
use the minimal set of operations to do a task (§17.1.4.1).
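The advice to prefer back operations can be illustrated with a small template. The names fill_back() and back_example() are hypothetical; the point is only that code restricted to push_back() accepts a vector as readily as a list.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Append the values 1..n using only back operations.
// Because only push_back() is used, any standard sequence will do.
template <class Seq> void fill_back(Seq& s, int n)
{
    for (int i = 1; i <= n; ++i) s.push_back(i);
}

void back_example()
{
    std::vector<int> v;
    std::list<int> l;
    fill_back(v, 3);           // works: vector has push_back()
    fill_back(l, 3);           // works: list has push_back()
    assert(v.back() == 3 && l.back() == 3);

    l.push_front(0);           // fine for list
    // v.push_front(0);        // would not compile: vector has no push_front()
    assert(l.front() == 0 && l.size() == 4);
}
```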
17.2.2.3 Other Operations [cont.list.etc]
Insertion and removal of elements are particularly efficient for lists. This, of course, leads people
to prefer lists when these operations are frequent. That, in turn, makes it worthwhile to support
common ways of removing elements directly:
    template <class T, class A = allocator<T> > class list {
    public:
        // ...
        void remove(const T& val);
        template <class Pred> void remove_if(Pred p);
        void unique();                                      // remove duplicates using ==
        template <class BinPred> void unique(BinPred b);    // remove duplicates using b
        void reverse();                                     // reverse order of elements
    };
For example, given
    fruit:
        apple orange grapefruit lemon orange lime pear quince
we can remove all elements with the value "orange" like this:
    fruit.remove("orange");
yielding:
    fruit:
        apple grapefruit lemon lime pear quince
Often, it is more interesting to remove all elements that meet some criterion rather than simply all
elements with a given value. The remove_if() operation does that. For example,
    fruit.remove_if(initial('l'));
removes every element with the initial 'l' from fruit giving:
    fruit:
        apple grapefruit pear quince
A common reason for removing elements is to eliminate duplicates. The unique() operation is
provided for that. For example:
    fruit.sort();
    fruit.unique();
The reason for sorting is that unique removes only duplicates that appear consecutively. For example, had fruit contained:
    apple pear apple apple pear
a simple fruit.unique() would have produced
    apple pear apple pear
whereas sorting first gives:
    apple pear
If only certain duplicates should be eliminated, we can provide a predicate to specify which
duplicates we want to remove. For example, we might define a binary predicate (§18.4.2)
initial2(x) to compare strings that have the initial x but yield false for every string that doesn't.
Given:
    pear pear apple apple
we can remove consecutive duplicates of every fruit with the initial p by a call
    fruit.unique(initial2('p'));
This would give
    pear apple apple
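The removal operations can be combined in one self-checking sketch. The function objects Initial and Initial2 below are stand-ins written for this illustration; they are assumed to behave like the initial() and initial2() predicates mentioned in the text, whose actual definitions are not shown here.

```cpp
#include <cassert>
#include <list>
#include <string>

// Unary predicate: does s start with the letter c? (stand-in for initial())
struct Initial {
    char c;
    explicit Initial(char cc) : c(cc) { }
    bool operator()(const std::string& s) const { return !s.empty() && s[0] == c; }
};

// Binary predicate: are x and y equal strings starting with c? (stand-in for initial2())
struct Initial2 {
    char c;
    explicit Initial2(char cc) : c(cc) { }
    bool operator()(const std::string& x, const std::string& y) const
    {
        return !x.empty() && x[0] == c && x == y;
    }
};

void remove_example()
{
    const char* names[] = { "pear", "pear", "apple", "apple", "lime", "orange" };
    std::list<std::string> fruit(names, names + 6);

    fruit.unique(Initial2('p'));    // pear apple apple lime orange
    assert(fruit.size() == 5);

    fruit.remove("orange");         // pear apple apple lime
    fruit.remove_if(Initial('l'));  // pear apple apple
    assert(fruit.size() == 3);

    fruit.sort();                   // apple apple pear
    fruit.unique();                 // apple pear
    assert(fruit.size() == 2 && fruit.front() == "apple" && fruit.back() == "pear");
}
```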
As noted in §16.3.2, we sometimes want to view a container in reverse order. For a list, it is possible to reverse the elements so that the first becomes the last, etc., without copying the elements.
The reverse() operation is provided to do that. Given:
    fruit:
        banana cherry lime strawberry
fruit.reverse() produces:
    fruit:
        strawberry lime cherry banana
An element that is removed from a list is destroyed. However, note that destroying a pointer does
not imply that the object it points to is deleted. If you want a container of pointers that deletes elements pointed to when the pointer is removed from the container or the container is destroyed, you
must write one yourself (§17.8[13]).
17.2.3 Deque [cont.deque]
A deque (it rhymes with check) is a double-ended queue. That is, a deque is a sequence optimized
so that operations at both ends are about as efficient as for a list, whereas subscripting approaches
the efficiency of a vector:
    template <class T, class A = allocator<T> > class std::deque {
        // types and operations like vector (§16.3.3, §16.3.5, §16.3.6)
        // plus front operations (§17.2.2.2) like list
    };
Insertion and deletion of elements "in the middle" have vector-like (in)efficiencies rather than
list-like efficiencies. Consequently, a deque is used where additions and deletions take place "at
the ends." For example, we might use a deque to model a section of a railroad or to represent a
deck of cards in a game:
    deque<car> siding_no_3;
    deque<Card> bonus;
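A minimal sketch of the combination that distinguishes a deque, operations at both ends plus subscripting, might look like this. The function name deque_example() and the int element type are arbitrary choices for the illustration.

```cpp
#include <cassert>
#include <deque>

void deque_example()
{
    std::deque<int> d;
    d.push_back(2);                    // add at the back
    d.push_back(3);
    d.push_front(1);                   // add at the front, which vector does not offer
    assert(d.size() == 3);
    assert(d[0] == 1 && d[2] == 3);    // subscripting, as for vector

    d.pop_front();                     // remove from the front
    d.pop_back();                      // remove from the back
    assert(d.size() == 1 && d.front() == 2);
}
```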
17.3 Sequence Adapters [cont.adapters]
The vector, list, and deque sequences cannot be built from each other without loss of efficiency.
On the other hand, stacks and queues can be elegantly and efficiently implemented using those
three basic sequences. Therefore, stack and queue are defined not as separate containers, but as
adapters of basic containers.
A container adapter provides a restricted interface to a container. In particular, adapters do not
provide iterators; they are intended to be used only through their specialized interfaces.
The techniques used to create a container adapter from a container are generally useful for nonintrusively adapting the interface of a class to the needs of its users.
17.3.1 Stack [cont.stack]
The stack container adapter is defined in <stack>. It is so simple that the best way to describe it is
to present an implementation:
    template <class T, class C = deque<T> > class std::stack {
    protected:
        C c;
    public:
        typedef typename C::value_type value_type;
        typedef typename C::size_type size_type;
        typedef C container_type;
        explicit stack(const C& a = C()) : c(a) { }
        bool empty() const { return c.empty(); }
        size_type size() const { return c.size(); }
        value_type& top() { return c.back(); }
        const value_type& top() const { return c.back(); }
        void push(const value_type& x) { c.push_back(x); }
        void pop() { c.pop_back(); }
    };
That is, a stack is simply an interface to a container of the type passed to it as a template argument.
All stack does is to eliminate the non-stack operations on its container from the interface and give
back(), push_back(), and pop_back() their conventional names: top(), push(), and pop().
By default, a stack makes a deque to hold its elements, but any sequence that provides back(),
push_back(), and pop_back() can be used. For example:
    stack<char> s1;                 // uses a deque<char> to store elements of type char
    stack< int,vector<int> > s2;    // uses a vector<int> to store elements of type int
It is possible to supply an existing container to initialize a stack, provided it is of the stack's
container type. For example:
    void print_backwards(vector<int>& v)
    {
        stack< int,vector<int> > state(v);    // initialize state from v
        while (state.size()) {
            cout << state.top();
            state.pop();
        }
    }
However, the elements of a container argument are copied, so supplying an existing container can
be expensive.
Elements are added to a stack using push_back() on the container that is used to store the elements. Consequently, a stack cannot overflow as long as there is memory available on the machine
for the container to acquire (using its allocator; see §19.4).
On the other hand, a stack can underflow:
    void f()
    {
        stack<int> s;
        s.push(2);
        if (s.empty()) {    // underflow is preventable
            // don't pop
        }
        else {              // but not impossible
            s.pop();        // fine: s.size() becomes 0
            s.pop();        // undefined effect, probably bad
        }
    }
Note that one does not pop() an element to use it. Instead, the top() is accessed and then
pop()'d when it is no longer needed. This is not too inconvenient, and it is more efficient when
the pop() isn't necessary:
    void f(stack<char>& s)
    {
        if (s.top()=='c') s.pop();    // remove optional initial 'c'
        // ...
    }
Unlike fully developed containers, stack (like other container adapters) doesn't have an allocator
template parameter. Instead, the stack and its users rely on the allocator from the container used to
implement the stack.
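The top()-then-pop() idiom and the empty() test that prevents underflow can be put together in a short sketch. The function reversed() is a hypothetical example, and the choice of vector<char> as the underlying container is arbitrary; the default deque would serve equally well.

```cpp
#include <cassert>
#include <stack>
#include <string>
#include <vector>

// Reverse a string by pushing every character and then popping them all.
std::string reversed(const std::string& s)
{
    std::stack<char, std::vector<char> > st;   // a vector<char> holds the elements
    for (std::string::size_type i = 0; i < s.size(); ++i) st.push(s[i]);

    std::string result;
    while (!st.empty()) {       // test before popping: prevents underflow
        result += st.top();     // read the top element ...
        st.pop();               // ... then discard it; pop() returns nothing
    }
    return result;
}

void stack_example()
{
    assert(reversed("abc") == "cba");
    assert(reversed("") == "");
}
```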
17.3.2 Queue [cont.queue]
Defined in <queue>, a queue is an interface to a container that allows the insertion of elements at
the back() and the extraction of elements at the front():
    template <class T, class C = deque<T> > class std::queue {
    protected:
        C c;
    public:
        typedef typename C::value_type value_type;
        typedef typename C::size_type size_type;
        typedef C container_type;
        explicit queue(const C& a = C()) : c(a) { }
        bool empty() const { return c.empty(); }
        size_type size() const { return c.size(); }
        value_type& front() { return c.front(); }
        const value_type& front() const { return c.front(); }
        value_type& back() { return c.back(); }
        const value_type& back() const { return c.back(); }
        void push(const value_type& x) { c.push_back(x); }
        void pop() { c.pop_front(); }
    };
By default, a queue makes a deque to hold its elements, but any sequence that provides front(),
back(), push_back(), and pop_front() can be used. Because a vector does not provide
pop_front(), a vector cannot be used as the underlying container for a queue.
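A short sketch shows a queue built on a list, which does provide pop_front(). The function name queue_example() and the int element type are illustrative assumptions.

```cpp
#include <cassert>
#include <list>
#include <queue>

void queue_example()
{
    std::queue<int, std::list<int> > q;   // list provides pop_front(), so it qualifies
    // With std::vector<int> as the container, q.pop() would not compile:
    // vector has no pop_front().
    q.push(1);
    q.push(2);
    q.push(3);
    assert(q.front() == 1 && q.back() == 3);   // first in is at the front

    q.pop();                                   // removes the oldest element, 1
    assert(q.front() == 2 && q.size() == 2);
}
```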
Queues seem to pop up somewhere in every system. One might define a server for a simple
message-based system like this:
    struct Message {
        // ...
    };
    void server(queue<Message>& q)
    {
        while(!q.empty()) {
            Message& m = q.front();    // get hold of message
            m.service();               // call function to serve request
            q.pop();                   // destroy message
        }
    }
Messages would be put on the queue using push().
If the requester and the server are running in different processes or threads, some form of synchronization of the queue access would be necessary. For example:
    void server2(queue<Message>& q, Lock& lck)
    {
        while(!q.empty()) {
            Message m;
            {
                LockPtr h(lck);           // hold lock only while extracting message (see §14.4.7)
                if (q.empty()) return;    // somebody else got the message
                m = q.front();
                q.pop();
            }
            m.service();                  // call function to serve request
        }
    }
There is no standard definition of concurrency or locking in C++ or in the world in general. Have a
look to see what your system has to offer and how to access it from C++ (§17.8[8]).
17.3.3 Priority Queue [cont.pqueue]
A priority_queue is a queue in which each element is given a priority that controls the order in
which the elements get to be top():
    template <class T, class C = vector<T>, class Cmp = less<typename C::value_type> >
    class std::priority_queue {
    protected:
        C c;
        Cmp cmp;
    public:
        typedef typename C::value_type value_type;
        typedef typename C::size_type size_type;
        typedef C container_type;
        explicit priority_queue(const Cmp& a1 = Cmp(), const C& a2 = C())
            : c(a2), cmp(a1) { }
        template <class In>
            priority_queue(In first, In last, const Cmp& = Cmp(), const C& = C());
        bool empty() const { return c.empty(); }
        size_type size() const { return c.size(); }
        const value_type& top() const { return c.front(); }
        void push(const value_type&);
        void pop();
    };
The declaration of priority_queue is found in <queue>.
By default, the priority_queue simply compares elements using the < operator, so top()
yields the largest element and pop() removes it:
    struct Message {
        int priority;
        bool operator<(const Message& x) const { return priority < x.priority; }
        // ...
    };
    void server(priority_queue<Message>& q, Lock& lck)
    {
        while(!q.empty()) {
            Message m;
            {
                LockPtr h(lck);           // hold lock only while extracting message (see §14.4.7)
                if (q.empty()) return;    // somebody else got the message
                m = q.top();
                q.pop();
            }
            m.service();                  // call function to serve request
        }
    }
This example differs from the queue example (§17.3.2) in that messages with higher priority will
get served first. The order in which elements with equal priority come to the head of the queue is
not defined. Two elements are considered of equal priority if neither has higher priority than the
other (§17.4.1.5).
An alternative to < for comparison can be provided as a template argument. Because the comparison
is the third template parameter, the container type must be given as well. For example, we
could sort strings in a case-insensitive manner by placing them in
    priority_queue<string,vector<string>,Nocase> pq;    // use Nocase::operator()() for comparisons (§17.1.4.1)
using pq.push() and then retrieving them using pq.top() and pq.pop().
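A case-insensitive priority queue can be sketched as follows. The Nocase class defined here is only an assumed stand-in for the one referred to in §17.1.4.1, and nocase_example() is a hypothetical name; the comparison converts each character with std::toupper() before comparing.

```cpp
#include <cassert>
#include <cctype>
#include <queue>
#include <string>
#include <vector>

// Hypothetical case-insensitive "less than" in the spirit of Nocase.
struct Nocase {
    bool operator()(const std::string& x, const std::string& y) const
    {
        std::string::size_type n = x.size() < y.size() ? x.size() : y.size();
        for (std::string::size_type i = 0; i < n; ++i) {
            int a = std::toupper(static_cast<unsigned char>(x[i]));
            int b = std::toupper(static_cast<unsigned char>(y[i]));
            if (a != b) return a < b;
        }
        return x.size() < y.size();   // a proper prefix compares less
    }
};

void nocase_example()
{
    std::priority_queue<std::string, std::vector<std::string>, Nocase> pq;
    pq.push("banana");
    pq.push("Cherry");
    pq.push("apple");

    assert(pq.top() == "Cherry");   // largest when case is ignored
    pq.pop();
    assert(pq.top() == "banana");
    pq.pop();
    assert(pq.top() == "apple");
}
```

With the default less<string>, an uppercase 'C' would compare less than any lowercase letter in the usual ASCII-based character sets, so "banana" would come out first instead.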
Objects defined by templates given different template arguments are of different types
(§13.6.3.1). For example:
    void f(priority_queue<string>& pq1)
    {
        pq = pq1;    // error: type mismatch
    }
We can supply a comparison criterion without affecting the type of a priority_queue by providing
a comparison object of the appropriate type as a constructor argument. For example:
    struct String_cmp {    // type used to express comparison criteria at run time
        String_cmp(int n = 0);    // use comparison criteria n
        // ...
    };
    void g(priority_queue<string,vector<string>,String_cmp>& pq)
    {
        // the extra parentheses keep pq2 from being parsed as a function declaration
        priority_queue<string,vector<string>,String_cmp> pq2((String_cmp(nocase)));
        pq = pq2;    // ok: pq and pq2 are of the same type, pq now also uses String_cmp(nocase)
    }
Keeping elements in order isn't free, but it needn't be expensive either. One useful way of implementing a priority_queue is to use a tree structure to keep track of the relative positions of elements. This gives an O(log(n)) cost of both push() and pop().
By default, a priority_queue makes a vector to hold its elements, but any sequence that provides front(), push_back(), pop_back(), and random-access iterators can be used. A priority_queue
is most likely implemented using a heap (§18.8).
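How a heap yields logarithmic push() and pop() can be sketched with the standard heap algorithms from <algorithm>. The class Int_pq below is a deliberately simplified stand-in written for this illustration; it is not the library's priority_queue, and an actual implementation may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Simplified stand-in for priority_queue<int>, built on the standard heap algorithms.
class Int_pq {
    std::vector<int> c;
public:
    bool empty() const { return c.empty(); }
    const int& top() const { return c.front(); }   // the largest element is at the front
    void push(int x)
    {
        c.push_back(x);
        std::push_heap(c.begin(), c.end());        // O(log(n)): sift the new element up
    }
    void pop()
    {
        std::pop_heap(c.begin(), c.end());         // O(log(n)): move the largest to the back
        c.pop_back();
    }
};

void heap_example()
{
    Int_pq pq;
    pq.push(3);
    pq.push(7);
    pq.push(5);
    assert(pq.top() == 7);
    pq.pop();
    assert(pq.top() == 5);
    pq.pop();
    assert(pq.top() == 3);
    pq.pop();
    assert(pq.empty());
}
```

The sketch also shows why front(), push_back(), pop_back(), and random-access iterators are exactly the operations required of the underlying sequence.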
17.4 Associative Containers [cont.assoc]
An associative array is one of the most useful general, user-defined types. In fact, it is often a
built-in type in languages primarily concerned with text processing and symbolic processing. An
associative array, often called a map and sometimes called a dictionary, keeps pairs of values.
Given one value, called the key, we can access the other, called the mapped value. An associative
array can be thought of as an array for which the index need not be an integer:
    template<class K, class V> class Assoc {
    public:
        V& operator[](const K&);    // return a reference to the V corresponding to K
        // ...
    };
Thus, a key of type K names a mapped value of type V.
Associative containers are a generalization of the notion of an associative array. The map is a
traditional associative array, where a single value is associated with each unique key. A multimap
is an associative array that allows duplicate elements for a given key, and set and multiset can be
seen as degenerate associative arrays in which no value is associated with a key.
17.4.1 Map [cont.map]
A map is a sequence of (key,value) pairs that provides for fast retrieval based on the key. At most
one value is held for each key; in other words, each key in a map is unique. A map provides bidirectional iterators (§19.2.1).
The map requires that a less-than operation exist for its key types (§17.1.4.1) and keeps its elements sorted so that iteration over a map occurs in order. For elements for which there is no obvious order or when there is no need to keep the container sorted, we might consider using a
hash_map (§17.6).
17.4.1.1 Types [cont.map.types]
A map has the usual container member types (§16.3.1) plus a few relating to its specific function:
    template <class Key, class T, class Cmp = less<Key>,
              class A = allocator< pair<const Key,T> > >
    class std::map {
    public:
        // types:
        typedef Key key_type;
        typedef T mapped_type;
        typedef pair<const Key, T> value_type;
        typedef Cmp key_compare;
        typedef A allocator_type;
        typedef typename A::reference reference;
        typedef typename A::const_reference const_reference;
        typedef implementation_defined1 iterator;
        typedef implementation_defined2 const_iterator;
        typedef typename A::size_type size_type;
        typedef typename A::difference_type difference_type;
        typedef std::reverse_iterator<iterator> reverse_iterator;
        typedef std::reverse_iterator<const_iterator> const_reverse_iterator;
        // ...
    };
Note that the value_type of a map is a (key,value) pair. The type of the mapped values is referred
to as the mapped_type. Thus, a map is a sequence of pair<const Key,mapped_type> elements.
As usual, the actual iterator types are implementation-defined. Since a map most likely is
implemented using some form of a tree, these iterators usually provide some form of tree traversal.
The reverse iterators are constructed from the standard reverse_iterator templates (§19.2.5).
17.4.1.2 Iterators and Pairs [cont.map.iter]
A map provides the usual set of functions that return iterators (§16.3.2):
    template <class Key, class T, class Cmp = less<Key>,
              class A = allocator< pair<const Key,T> > > class map {
    public:
        // ...
        // iterators:
        iterator begin();
        const_iterator begin() const;
        iterator end();
        const_iterator end() const;
        reverse_iterator rbegin();
        const_reverse_iterator rbegin() const;
        reverse_iterator rend();
        const_reverse_iterator rend() const;
        // ...
    };
Iteration over a map is simply an iteration over a sequence of pair<const Key,mapped_type> elements. For example, we might print out the entries of a phone book like this:
    void f(map<string,number>& phone_book)
    {
        typedef map<string,number>::const_iterator CI;
        for (CI p = phone_book.begin(); p!=phone_book.end(); ++p)
            cout << p->first << '\t' << p->second << '\n';
    }
A map iterator presents the elements in ascending order of its keys (§17.4.1.5). Therefore, the
phone_book entries will be output in lexicographical order.
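That iteration follows key order rather than insertion order can be checked with a small sketch. The helper keys_in_order(), the function order_example(), and the names and numbers stored in the map are all made up for this illustration, and int stands in for the number type used above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Concatenate the keys of m in iteration order, separated by spaces.
std::string keys_in_order(const std::map<std::string, int>& m)
{
    typedef std::map<std::string, int>::const_iterator CI;
    std::string result;
    for (CI p = m.begin(); p != m.end(); ++p) {
        if (!result.empty()) result += ' ';
        result += p->first;       // p->first is the key, p->second the mapped value
    }
    return result;
}

void order_example()
{
    std::map<std::string, int> phone_book;   // entries are invented sample data
    phone_book["carol"] = 3333;              // inserted first ...
    phone_book["alice"] = 1111;
    phone_book["bob"] = 2222;
    assert(keys_in_order(phone_book) == "alice bob carol");   // ... but visited last
}
```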
We refer to the first element of any pair as first and the second as second independently of
what types they actually are:
    template <class T1, class T2> struct std::pair {
        typedef T1 first_type;
        typedef T2 second_type;
        T1 first;
        T2 second;
        pair() :first(T1()), second(T2()) { }
        pair(const T1& x, const T2& y) :first(x), second(y) { }
        template<class U, class V>
            pair(const pair<U, V>& p) :first(p.first), second(p.second) { }
    };
The last constructor exists to allow conversions in the initializer (§13.6.2). For example:
    pair<int,double> f(char c, int i)
    {
        return pair<int,double>(c,i);    // conversions required
    }
In a map, the key is the first element of the pair and the mapped value is the second.
The usefulness of pair is not limited to the implementation of map, so it is a standard library
class in its own right. The definition of pair is found in <utility>. A function to make it convenient to create pairs is also provided:
    template <class T1, class T2> pair<T1,T2> std::make_pair(T1 t1, T2 t2)
    {
        return pair<T1,T2>(t1,t2);
    }
A pair is by default initialized to the default values of its element types. In particular, this implies
that elements of built-in types are initialized to 0 (§5.1.1) and strings are initialized to the empty
string (§20.3.4). A type without a default constructor can be an element of a pair only provided the
pair is explicitly initialized.
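Default initialization and the converting constructor of pair can both be observed in a few lines. The function name pair_example() and the sample values are arbitrary choices for this sketch.

```cpp
#include <cassert>
#include <string>
#include <utility>

void pair_example()
{
    std::pair<int, std::string> d;        // default: first is 0, second is the empty string
    assert(d.first == 0 && d.second.empty());

    // make_pair('a',2) yields a pair<char,int>; the templated
    // constructor converts it to pair<int,double>.
    std::pair<int, double> p = std::make_pair('a', 2);
    assert(p.first == 'a' && p.second == 2.0);
}
```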
17.4.1.3 Subscripting [cont.map.element]
The characteristic map operation is the associative lookup provided by the subscript operator:
    template <class Key, class T, class Cmp = less<Key>,
              class A = allocator< pair<const Key,T> > >
    class map {
    public:
        // ...
        mapped_type& operator[](const key_type& k);    // access element with key k
        // ...
    };
The subscript operator performs a lookup on the key given as an index and returns the corresponding value. If the key isn't found, an element with the key and the default value of the mapped_type
is inserted into the map. For example:
    void f()
    {
        map<string,int> m;     // map starting out empty
        int x = m["Henry"];    // create new entry for "Henry", initialize to 0, return 0
        m["Harry"] = 7;        // create new entry for "Harry", initialize to 0, and assign 7
        int y = m["Henry"];    // return the value from "Henry"'s entry
        m["Harry"] = 9;        // change the value from "Harry"'s entry to 9
    }
As a slightly more realistic example, consider a program that reads input in the form of
(item-name,value) pairs such as
    nail 100 hammer 2 saw 3 saw 4 hammer 7 nail 1000 nail 250
and calculates both the sum for each item and the total of all values. The main work can be done while reading the (item-name,value) pairs into a map:
    void readitems(map<string,int>& m)
    {
        string word;
        int val = 0;
        while (cin >> word >> val) m[word] += val;
    }
The subscript operation m[word] identifies the appropriate (string,int) pair and returns a reference to its int part. This code takes advantage of the fact that a new element gets its int value set to
0 by default.
A map constructed by readitems() can then be output using a conventional loop:
    int main()
    {
        map<string,int> tbl;
        readitems(tbl);
        int total = 0;
        typedef map<string,int>::const_iterator CI;
        for (CI p = tbl.begin(); p!=tbl.end(); ++p) {
            total += p->second;
            cout << p->first << '\t' << p->second << '\n';
        }
        cout << "----------------\ntotal\t" << total << '\n';
        return !cin;
    }
Given the input above, the output is:
    hammer  9
    nail    1350
    saw     7
    ----------------
    total   1366
Note that the items are printed in lexical order (§17.4.1, §17.4.1.5).
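The same tally can be tried without typing at a terminal. The following is a minimal sketch, not part of the original text: it assumes the input arrives through any istream, and the helper names tally(), total(), and tally_string() are invented for illustration.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Accumulate (item-name,value) pairs from an input stream into a map.
// operator[] creates a missing entry with value 0, so += needs no special case.
void tally(std::istream& in, std::map<std::string,int>& m)
{
    std::string word;
    int val = 0;
    while (in >> word >> val) m[word] += val;
}

// Sum of all mapped values (hypothetical helper).
int total(const std::map<std::string,int>& m)
{
    int sum = 0;
    for (std::map<std::string,int>::const_iterator p = m.begin(); p != m.end(); ++p)
        sum += p->second;
    return sum;
}

// Convenience wrapper for running the tally on a string (hypothetical helper).
std::map<std::string,int> tally_string(const std::string& text)
{
    std::istringstream in(text);
    std::map<std::string,int> m;
    tally(in, m);
    return m;
}
```

Calling tally_string() on the sample input should yield three entries whose iteration order is hammer, nail, saw, because the map keeps its keys sorted.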
A subscripting operation must find the key in the map. This, of course, is not as cheap as subscripting an array with an integer. The cost is O(log(size_of_map)), which is acceptable for
many applications. For applications for which this is too expensive, a hashed container is often the
answer (§17.6).
Subscripting a map adds a default element when the key is not found. Therefore, there is no
version of operator[]() for const maps. Furthermore, subscripting can be used only if the
mapped_type (value type) has a default value. If the programmer simply wants to see if a key is
present, the find() operation (§17.4.1.6) can be used to locate a key without modifying the map.
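The difference between find() and subscripting can be sketched as follows. This is an illustration only; the function names value_or() and size_after_subscript() and the fallback convention are assumptions, not part of the library.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Look up a key in a const map without inserting anything.
// Returns fallback when the key is absent (an assumed convention).
int value_or(const std::map<std::string,int>& m, const std::string& key, int fallback)
{
    std::map<std::string,int>::const_iterator p = m.find(key);
    return p == m.end() ? fallback : p->second;
}

// Contrast: subscripting a (copied, non-const) map inserts a missing key.
std::size_t size_after_subscript(std::map<std::string,int> m, const std::string& key)
{
    m[key];            // inserts (key,0) if key was absent
    return m.size();
}
```

Because value_or() takes its map by const reference, an accidental m[key] inside it would not compile, which is exactly the protection the absence of a const operator[]() provides.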
17.4.1.4 Constructors [cont.map.ctor]
A map provides the usual complement of constructors, etc. (§16.3.4):
    template <class Key, class T, class Cmp = less<Key>,
              class A = allocator< pair<const Key,T> > >
    class map {
    public:
        // ...
        // construct/copy/destroy:
        explicit map(const Cmp& = Cmp(), const A& = A());
        template <class In> map(In first, In last, const Cmp& = Cmp(), const A& = A());
        map(const map&);
        ~map();
        map& operator=(const map&);
        // ...
    };
Copying a container implies allocating space for its elements and making copies of each element
(§16.3.4). This can be very expensive and should be done only when necessary. Consequently,
containers such as maps tend to be passed by reference.
The member template constructor takes a sequence of pair<const Key,T>s described by a pair
of input iterators of type In. It insert()s (§17.4.1.7) the elements from the sequence into the map.
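As a small sketch of the member template constructor, the sequence can come from a vector of pairs. The function name from_pairs() is invented for this illustration, and the element type pair<string,int> relies on the implicit conversion to the map's pair<const string,int>.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Build a map from a sequence of (key,value) pairs held in a vector.
// The result is sorted by key regardless of the order in v.
std::map<std::string,int> from_pairs(const std::vector< std::pair<std::string,int> >& v)
{
    return std::map<std::string,int>(v.begin(), v.end());
}
```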
17.4.1.5 Comparisons [cont.map.comp]
To find an element in a map given a key, the map operations must compare keys. Also, iterators
traverse a map in order of increasing key values, so insertion will typically also compare keys (to
place an element into a tree structure representing the map).
By default, the comparison used for keys is < (less than), but an alternative can be provided as a
template parameter or as a constructor argument (see §17.3.3). The comparison given is a comparison of keys, but the value_type of a map is a (key,value) pair. Consequently, value_comp() is
provided to compare such pairs using the key comparison function:
    template <class Key, class T, class Cmp = less<Key>,
              class A = allocator< pair<const Key,T> > >
    class map {
    public:
        // ...
        typedef Cmp key_compare;
        class value_compare : public binary_function<value_type,value_type,bool> {
        friend class map;
        protected:
            Cmp cmp;
            value_compare(Cmp c) : cmp(c) {}
        public:
            // the arguments are (key,value) pairs, so only their keys are compared
            bool operator()(const value_type& x, const value_type& y) const
                { return cmp(x.first, y.first); }
        };
        key_compare key_comp() const;
        value_compare value_comp() const;
        // ...
    };
For example:
    map<string,int> m1;
    map<string,int,Nocase> m2;                             // specify comparison type (§17.1.4.1)
    map<string,int,String_cmp> m3;                         // specify comparison type (§17.1.4.1)
    map<string,int,String_cmp> m4(String_cmp(literary));   // pass comparison object
The key_comp() and value_comp() member functions make it possible to query a map for the
kind of comparisons used for keys and values. This is usually done to supply the same comparison
criterion to some other container or algorithm. For example:
    void f(map<string,int>& m)
    {
        map<string,int> mm;                   // compare using < by default
        map<string,int> mmm(m.key_comp());    // compare the way m does
        // ...
    }
See §17.1.4.1 for an example of how to define a particular comparison and §18.4 for an explanation
of function objects in general.
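To make the comparison machinery concrete, here is a sketch with a case-insensitive comparison. The Nocase shown here is a stand-in written for this example (the definition in §17.1.4.1 may differ), and the helper names empty_like() and ordered_before() are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Case-insensitive "less than" for strings; an assumed stand-in for Nocase.
struct Nocase {
    bool operator()(const std::string& a, const std::string& b) const
    {
        std::size_t n = a.size() < b.size() ? a.size() : b.size();
        for (std::size_t i = 0; i < n; ++i) {
            int ca = std::tolower(static_cast<unsigned char>(a[i]));
            int cb = std::tolower(static_cast<unsigned char>(b[i]));
            if (ca != cb) return ca < cb;
        }
        return a.size() < b.size();
    }
};

typedef std::map<std::string,int,Nocase> Table;

// Build an empty map that orders keys the way m does.
Table empty_like(const Table& m)
{
    return Table(m.key_comp());
}

// Compare two (key,value) pairs the way m orders them: by key only.
bool ordered_before(const Table& m, const Table::value_type& a, const Table::value_type& b)
{
    return m.value_comp()(a, b);
}
```

With this comparison, "Apple" and "apple" name the same entry, and value_comp() ignores the mapped values entirely.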
17.4.1.6 Map Operations [cont.map.map]
The crucial idea for maps and indeed for all associative containers is to gain information based on a
key. Several specialized operations are provided for that:
    template <class Key, class T, class Cmp = less<Key>,
              class A = allocator< pair<const Key,T> > >
    class map {
    public:
        // ...
        // map operations:
        iterator find(const key_type& k);                 // find element with key k
        const_iterator find(const key_type& k) const;
        size_type count(const key_type& k) const;         // find number of elements with key k
        iterator lower_bound(const key_type& k);          // find first element with key k
        const_iterator lower_bound(const key_type& k) const;
        iterator upper_bound(const key_type& k);          // find first element with key greater than k
        const_iterator upper_bound(const key_type& k) const;
        pair<iterator,iterator> equal_range(const key_type& k);
        pair<const_iterator,const_iterator> equal_range(const key_type& k) const;
        // ...
    };
A m.find(k) operation simply yields an iterator to an element with the key k. If there is no such
element, the iterator returned is m.end(). For a container with unique keys, such as map and set,
the resulting iterator will point to the unique element with the key k. For a container with nonunique keys, such as multimap and multiset, the resulting iterator will point to the first element that
has that key. For example:
    void f(map<string,int>& m)
    {
        map<string,int>::iterator p = m.find("Gold");
        if (p!=m.end()) {                         // if "Gold" was found
            // ...
        }
        else if (m.find("Silver")!=m.end()) {     // look for "Silver"
            // ...
        }
        // ...
    }
For a multimap (§17.4.2), finding the first match is rarely as useful as finding all matches;
m.lower_bound(k) and m.upper_bound(k) give the beginning and the end of the subsequence
of elements of m with the key k. As usual, the end of a sequence is an iterator to the one-past-the-last element of the sequence. For example:
    void f(multimap<string,int>& m)
    {
        multimap<string,int>::iterator lb = m.lower_bound("Gold");
        multimap<string,int>::iterator ub = m.upper_bound("Gold");
        for (multimap<string,int>::iterator p = lb; p!=ub; ++p) {
            // ...
        }
    }
Finding the upper bound and lower bound by two separate operations is neither elegant nor efficient. Consequently, the operation equal_range() is provided to deliver both. For example:
    void f(multimap<string,int>& m)
    {
        typedef multimap<string,int>::iterator MI;
        pair<MI,MI> g = m.equal_range("Gold");
        for (MI p = g.first; p!=g.second; ++p) {
            // ...
        }
    }
If lower_bound(k) doesn’t find k, it returns an iterator to the first element that has a key greater
than k, or end() if no such greater element exists. This way of reporting failure is also used by
upper_bound() and equal_range().
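The behavior for present and absent keys can be checked with a short sketch. The helper names matches() and first_key_not_before() are invented here; returning an empty string for "no such element" is just a convention chosen for the example.

```cpp
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <utility>

typedef std::multimap<std::string,int> MM;

// Number of entries with key k, found via equal_range().
std::size_t matches(const MM& m, const std::string& k)
{
    std::pair<MM::const_iterator,MM::const_iterator> r = m.equal_range(k);
    return static_cast<std::size_t>(std::distance(r.first, r.second));
}

// Key of the first element whose key is not less than k, or "" if there is none.
std::string first_key_not_before(const MM& m, const std::string& k)
{
    MM::const_iterator p = m.lower_bound(k);
    return p == m.end() ? std::string() : p->first;
}
```

For a key that is absent, lower_bound() and upper_bound() return the same iterator, so the range they delimit is empty and a loop over it simply does nothing.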
17.4.1.7 List Operations [cont.map.modifier]
The conventional way of entering a value into an associative array is simply to assign to it using
subscripting. For example:
    phone_book["Order department"] = 8226339;
This will make sure that the Order department has the desired entry in the phone_book independently of whether it had a prior entry. It is also possible to insert() entries directly and to remove
entries using erase():
    template <class Key, class T, class Cmp = less<Key>,
              class A = allocator< pair<const Key,T> > >
    class map {
    public:
        // ...
        // list operations:
        pair<iterator, bool> insert(const value_type& val);     // insert (key,value) pair
        iterator insert(iterator pos, const value_type& val);   // pos is just a hint
        template <class In> void insert(In first, In last);     // insert elements from sequence
        void erase(iterator pos);                     // erase the element pointed to
        size_type erase(const key_type& k);           // erase element with key k (if present)
        void erase(iterator first, iterator last);    // erase range
        void clear();
        // ...
    };
The operation m.insert(val) attempts to add a (Key,T) pair val to m. Since maps rely on
unique keys, insertion takes place only if there is not already an element in m with that key.
The return value of m.insert(val) is a pair<iterator,bool>. The bool is true if val was actually
inserted. The iterator refers to the element of m holding the key of val. For example:
    void f(map<string,int>& m)
    {
        pair<string,int> p99("Paul",99);
        pair<map<string,int>::iterator,bool> p = m.insert(p99);
        if (p.second) {
            // "Paul" was inserted
        }
        else {
            // "Paul" was there already
        }
        map<string,int>::iterator i = p.first;    // points to m["Paul"]
        // ...
    }
Usually, we do not care whether a key is newly inserted or was present in the map before the
insert(). When we are interested, it is often because we want to register the fact that a value is in
a map somewhere else (outside the map). The other two versions of insert() do not return an indication of whether a value was actually inserted.
Specifying a position, insert(pos,val), is simply a hint to the implementation to start the
search for the key val at pos. If the hint is good, significant performance improvements can result.
If the hint is bad, you’d have done better without it both notationally and efficiency-wise. For
example:
    void f(map<string,int>& m)
    {
        m["Dilbert"] = 3;                                             // neat, possibly less efficient
        m.insert(m.begin(),make_pair(const string("Dogbert"),99));    // ugly
    }
In fact, [] is little more than a convenient notation for insert(). The result of m[k] is equivalent
to the result of (*(m.insert(make_pair(k,V())).first)).second, where V() is the default
value for the mapped type. When you understand that equivalence, you probably understand associative containers.
Because [] always uses V(), you cannot use subscripting on a map with a value type that does
not have a default value. This is an unfortunate limitation of the standard associative containers.
However, the requirement of a default value is not a fundamental property of associative containers
(see §17.6.2).
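The equivalence can be written out as a function. This is a sketch for map<string,int> only, with V() being int(); the name subscript() is made up for the illustration.

```cpp
#include <map>
#include <string>
#include <utility>

// What m[k] does, spelled out with insert() for a map<string,int>.
// insert() leaves an existing entry alone and returns an iterator to it,
// so the reference returned always names the entry stored under k.
int& subscript(std::map<std::string,int>& m, const std::string& k)
{
    return (*(m.insert(std::make_pair(k, int())).first)).second;
}
```

Writing subscript(m,"Dilbert") = 3 therefore has the same effect as m["Dilbert"] = 3: the first call creates the entry with value 0, and later calls find the same entry again.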
You can erase elements specified by a key. For example:
    void f(map<string,int>& m)
    {
        int count = m.erase("Ratbert");
        // ...
    }
The integer returned is the number of erased elements. In particular, count is 0 if there was no element with the key "Ratbert" to erase. For a multimap or multiset, the value can be larger than 1.
Alternatively, one can erase an element given an iterator pointing to it or a range of elements given
a sequence. For example:
    void g(map<string,int>& m)
    {
        m.erase(m.find("Catbert"));
        m.erase(m.find("Alice"),m.find("Wally"));
    }
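Naturally, it is faster to erase an element for which you already have an iterator than to first find the
element given its key and then erase it. After erase(), the iterator cannot be used again because
the element to which it pointed is no longer there. Erasing an empty range, such as
m.erase(m.end(),m.end()), is harmless. However, m.erase(p) requires p to point to an element of m;
calling it with m.end(), for example because a find() failed, is an error that the container does not check for.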
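A cautious way to combine find() with erase() is sketched below. The typedef Book and the three helper names are assumptions made for this example; checking against end() before erasing through an iterator is the defensive choice on any implementation.

```cpp
#include <cstddef>
#include <map>
#include <string>

typedef std::map<std::string,int> Book;

// Erase by key: returns how many elements were removed (0 or 1 for a map).
std::size_t remove_by_key(Book& m, const std::string& k)
{
    return m.erase(k);
}

// Erase by iterator, checking against end() first as a precaution.
bool remove_if_present(Book& m, const std::string& k)
{
    Book::iterator p = m.find(k);
    if (p == m.end()) return false;
    m.erase(p);          // p must not be used after this
    return true;
}

// Erase all keys in the half-open range [from,to). lower_bound() yields
// usable range ends even when neither key is present in m.
void remove_key_range(Book& m, const std::string& from, const std::string& to)
{
    if (!(from < to)) return;   // nothing to do for an empty or reversed range
    m.erase(m.lower_bound(from), m.lower_bound(to));
}
```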
17.4.1.8 Other Functions [cont.map.etc]
Finally, a map provides the usual functions dealing with the number of elements and a specialized
swap():
    template <class Key, class T, class Cmp = less<Key>,
              class A = allocator< pair<const Key,T> > >
    class map {
    public:
        // ...
        // capacity:
        size_type size() const;         // number of elements
        size_type max_size() const;     // size of largest possible map
        bool empty() const { return size()==0; }
        void swap(map&);
    };
As usual, a value returned by size() or max_size() is a number of elements.
In addition, map provides ==, !=, <, >, <=, >=, and swap() as nonmember functions:
    template <class Key, class T, class Cmp, class A>
    bool operator==(const map<Key,T,Cmp,A>&, const map<Key,T,Cmp,A>&);
    // similarly !=, <, >, <=, and >=
    template <class Key, class T, class Cmp, class A>
    void swap(map<Key,T,Cmp,A>&, map<Key,T,Cmp,A>&);
Why would anyone want to compare two maps? When we specifically compare two maps, we usually want to know not just if the maps differ, but also how they differ if they do. In such cases, we
don’t use == or !=. However, by providing ==, <, and swap() for every container, we make it
possible to write algorithms that can be applied to every container. For example, these functions
allow us to sort() a vector of maps and to have a set of maps.
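As a brief sketch of why < on maps matters, a set of maps needs nothing else. The typedef Inventory and the function distinct() are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

typedef std::map<std::string,int> Inventory;

// Count distinct inventories by storing them in a set.
// The set orders and deduplicates its elements using operator< on maps.
std::size_t distinct(const Inventory& a, const Inventory& b, const Inventory& c)
{
    std::set<Inventory> s;
    s.insert(a);
    s.insert(b);
    s.insert(c);
    return s.size();
}
```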
17.4.2 Multimap [cont.multimap]
A multimap is like a map, except that it allows duplicate keys:
    template <class Key, class T, class Cmp = less<Key>,
              class A = allocator< pair<const Key,T> > >
    class std::multimap {
    public:
        // like map, except:
        iterator insert(const value_type&);    // returns iterator, not pair
        // no subscript operator []
    };
For example (using Cstring_less from §17.1.4.1 to compare C-style strings):
    void f(map<char*,int,Cstring_less>& m, multimap<char*,int,Cstring_less>& mm)
    {
        m.insert(make_pair("x",4));
        m.insert(make_pair("x",5));     // no effect: there already is an entry for "x" (§17.4.1.7)
        // now m["x"] == 4
        mm.insert(make_pair("x",4));
        mm.insert(make_pair("x",5));
        // mm now holds both ("x",4) and ("x",5)
    }
This implies that multimap cannot support subscripting by key values in the way map does. The
equal_range(), lower_bound(), and upper_bound() operations (§17.4.1.6) are the primary
means of accessing multiple values with the same key.
Naturally, where several values can exist for a single key, a multimap is preferred over a map.
That happens far more often than people first think when they hear about multimap. In some ways,
a multimap is even cleaner and more elegant than a map.
Because a person can easily have several phone numbers, a phone book is a good example of a
multimap. I might print my phone numbers like this:
    void print_numbers(const multimap<string,int>& phone_book)
    {
        typedef multimap<string,int>::const_iterator I;
        pair<I,I> b = phone_book.equal_range("Stroustrup");
        for (I i = b.first; i != b.second; ++i) cout << i->second << '\n';
    }
For a multimap, the argument to insert() is always inserted. Consequently, the
multimap::insert() returns an iterator rather than a pair<iterator,bool> like map does. For uniformity, the library could have provided the general form of insert() for both map and multimap
even though the bool would have been redundant for a multimap. Yet another design alternative
would have been to provide a simple insert() that didn’t return a bool in either case and then supply users of map with some other way of figuring out whether a key was newly inserted. This is a
case in which different interface design ideas clash.
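The multimap interface described in this section can be exercised with a sketch like the following. The typedef Phone_book and the function numbers_of() are invented names; the function collects the numbers rather than printing them so that the result can be inspected.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::multimap<std::string,int> Phone_book;

// Collect every number stored under name, in the order equal_range() yields them.
std::vector<int> numbers_of(const Phone_book& book, const std::string& name)
{
    std::vector<int> result;
    std::pair<Phone_book::const_iterator,Phone_book::const_iterator> r = book.equal_range(name);
    for (Phone_book::const_iterator p = r.first; p != r.second; ++p)
        result.push_back(p->second);
    return result;
}
```

Note that insert() on a Phone_book returns a plain iterator to the new entry, so there is no .second to test: every insertion succeeds.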
17.4.3 Set [cont.set]
A set can be seen as a map (§17.4.1), where the values are irrelevant, so we keep track of only the
keys. This leads to only minor changes to the user interface:
    template <class Key, class Cmp = less<Key>, class A = allocator<Key> >
    class std::set {
    public:
        // like map except:
        typedef Key value_type;        // the key itself is the value
        typedef Cmp value_compare;
        // no subscript operator []
    };
Defining value_type as the key_type type is a trick to allow code that uses maps and sets to be
identical in many cases.
Note that set relies on a comparison operation (by default <) rather than equality (==). This
implies that equivalence of elements is defined by inequality (§17.1.4.1) and that iteration through
a set has a well-defined order.
Like map, set provides ==, !=, <, >, <=, >=, and swap().
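The point that equivalence is defined through the comparison, not through ==, can be illustrated with a deliberately odd ordering. The comparison Shorter and the two helper functions below are made up for this sketch and do not come from the text.

```cpp
#include <set>
#include <string>

// Orders strings by length only (an illustrative comparison).
struct Shorter {
    bool operator()(const std::string& a, const std::string& b) const
    {
        return a.size() < b.size();
    }
};

// Two keys are equivalent for a set when neither is less than the other.
bool equivalent(const std::string& a, const std::string& b)
{
    Shorter less;
    return !less(a, b) && !less(b, a);
}

// Insert a into an initially empty set, then try b; report whether b was added.
bool second_is_added(const std::string& a, const std::string& b)
{
    std::set<std::string,Shorter> s;
    s.insert(a);
    return s.insert(b).second;
}
```

Under Shorter, "saw" and "axe" are equivalent even though they are not equal, so a set ordered this way keeps only one of them.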
17.4.4 Multiset [cont.multiset]
A multiset is a set that allows duplicate keys:
    template <class Key, class Cmp = less<Key>, class A = allocator<Key> >
    class std::multiset {
    public:
        // like set, except:
        iterator insert(const value_type&);    // returns iterator, not pair
    };
The equal_range(), lower_bound(), and upper_bound() operations (§17.4.1.6) are the primary
means of accessing multiple occurrences of a key.
17.5 Almost Containers [cont.etc]
Built-in arrays (§5.2), strings (Chapter 20), valarrays (§22.4), and bitsets (§17.5.3) hold elements
and can therefore be considered containers for many purposes. However, each lacks some aspect or
other of the standard container interface, so these ‘‘almost containers’’ are not completely interchangeable with fully developed containers such as vector and list.
17.5.1 String [cont.string]
A basic_string provides subscripting, random-access iterators, and most of the notational conveniences of a container (Chapter 20). However, basic_string does not provide as wide a selection of
types as elements. It also is optimized for use as a string of characters and is typically used in ways
that differ significantly from a container.
17.5.2 Valarray [cont.valarray]
A valarray (§22.4) is a vector for optimized numeric computation. Consequently, a valarray
doesn’t attempt to be a general container. A valarray provides many useful numeric operations.
However, of the standard container operations (§17.1.1), it offers only size() and a subscript operator (§22.4.2). A pointer to an element of a valarray is a random-access iterator (§19.2.1).
17.5.3 Bitset [cont.bitset]
Often, aspects of a system, such as the state of an input stream (§21.3.3), are represented as a set of
flags indicating binary conditions such as good/bad, true/false, and on/off. C++ supports the notion
of small sets of flags efficiently through bitwise operations on integers (§6.2.4). These operations
include & (and), | (or), ^ (exclusive or), << (shift left), and >> (shift right). Class bitset<N> generalizes this notion and offers greater convenience by providing operations on a set of N bits
indexed from 0 through N-1, where N is known at compile time. For sets of bits that don’t fit into
a long int, using a bitset is much more convenient than using integers directly. For smaller sets,
there may be an efficiency tradeoff. If you want to name the bits, rather than numbering them,
using a set (§17.4.3), an enumeration (§4.8), or a bitfield (§C.8.1) are alternatives.
A bitset<N> is an array of N bits. A bitset differs from a vector<bool> (§16.3.11) by being of
fixed size, from set (§17.4.3) by having its bits indexed by integers rather than associatively by
value, and from both vector<bool> and set by providing operations to manipulate the bits.
It is not possible to address a single bit directly using a built-in pointer (§5.1). Consequently,
bitset provides a reference-to-bit type. This is actually a generally useful technique for addressing
objects for which a built-in pointer for some reason is unsuitable:
    template<size_t N> class std::bitset {
    public:
        class reference {                              // reference to a single bit:
            friend class bitset;
            reference();
        public:                                        // b[i] refers to the (i+1)’th bit:
            ~reference();
            reference& operator=(bool x);              // for b[i] = x;
            reference& operator=(const reference&);    // for b[i] = b[j];
            bool operator~() const;                    // return ~b[i]
            operator bool() const;                     // for x = b[i];
            reference& flip();                         // b[i].flip();
        };
        // ...
    };
The bitset template is defined in namespace std and presented in <bitset>.
For historical reasons, bitset differs somewhat in style from other standard library classes. For
example, if an index (also known as a bit position) is out of range, an out_of_range exception is
thrown. No iterators are provided. Bit positions are numbered right to left in the same way bits
often are in a word, so the value of b[i] is pow(2,i). Thus, a bitset can be thought of as an N-bit
binary number:
    position:      9 8 7 6 5 4 3 2 1 0
    bitset<10>:    1 1 1 1 0 1 1 1 0 1
17.5.3.1 Constructors [cont.bitset.ctor]
A bitset can be constructed with default values, from the bits in an unsigned long int, or from a
string:
    template<size_t N> class bitset {
    public:
        // ...
        // constructors:
        bitset();                                               // N zero-bits
        bitset(unsigned long val);                              // bits from val
        template<class Ch, class Tr, class A>                   // Tr is a character trait (§20.2)
        explicit bitset(const basic_string<Ch,Tr,A>& str,       // bits from string str
            basic_string<Ch,Tr,A>::size_type pos = 0,
            basic_string<Ch,Tr,A>::size_type n = basic_string<Ch,Tr,A>::npos);
        // ...
    };
The default value of a bit is 0. When an unsigned long int argument is supplied, each bit in the
integer is used to initialize the corresponding bit in the bitset (if any). A basic_string (Chapter 20)
argument does the same, except that the character '0' gives the bit value 0, the character '1' gives
the bit value 1, and other characters cause an invalid_argument exception to be thrown. By default,
a complete string is used for initialization. However, in the style of a basic_string constructor
(§20.3.4), a user can specify that only the range of characters from pos to the end of the string or to
pos+n are to be used. For example:
    void f()
    {
        bitset<10> b1;                             // all 0
        bitset<16> b2 = 0xaaaa;                    // 1010101010101010
        bitset<32> b3 = 0xaaaa;                    // 00000000000000001010101010101010
        bitset<10> b4("1010101010");               // 1010101010
        bitset<10> b5("10110111011110",4);         // 0111011110
        bitset<10> b6("10110111011110",2,8);       // 0011011101
        bitset<10> b7("n0g00d");                   // invalid_argument thrown
        bitset<10> b8 = "n0g00d";                  // error: no char* to bitset conversion
    }
A key idea in the design of bitset is that an optimized implementation can be provided for bitsets
that fit in a single word. The interface reflects this assumption.
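The constructor rules can be checked with a short sketch. It assumes a C++11 or later library, where to_string() and to_ulong() can be called without further ceremony; the helper names from_text() and rejects() are invented for the example.

```cpp
#include <bitset>
#include <stdexcept>
#include <string>

// Build a 10-bit set from a string of '0' and '1' characters.
// The rightmost character of the string becomes bit 0.
std::bitset<10> from_text(const std::string& s)
{
    return std::bitset<10>(s);
}

// Report whether constructing a bitset from s throws invalid_argument.
bool rejects(const std::string& s)
{
    try {
        std::bitset<10> b(s);
        (void)b;          // only the construction matters here
        return false;
    }
    catch (const std::invalid_argument&) {
        return true;
    }
}
```

For instance, from_text("1010101010") holds the same bits as the integer 682, and rejects("n0g00d") reports the exception described in the text.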
17.5.3.2 Bit Manipulation Operations [cont.bitset.oper]
A bitset provides the operators for accessing individual bits and for manipulating all bits in the set:
    template<size_t N> class std::bitset {
    public:
        // ...
        // bitset operations:
        reference operator[](size_t pos);           // b[i]
        bitset& operator&=(const bitset& s);        // and
        bitset& operator|=(const bitset& s);        // or
        bitset& operator^=(const bitset& s);        // exclusive or
        bitset& operator<<=(size_t n);              // logical left shift (fill with zeros)
        bitset& operator>>=(size_t n);              // logical right shift (fill with zeros)
        bitset& set();                              // set every bit to 1
        bitset& set(size_t pos, int val = 1);       // b[pos]=val
        bitset& reset();                            // set every bit to 0
        bitset& reset(size_t pos);                  // b[pos]=0
        bitset& flip();                             // change the value of every bit
        bitset& flip(size_t pos);                   // change the value of b[pos]
        bitset operator~() const { return bitset<N>(*this).flip(); }          // make complement set
        bitset operator<<(size_t n) const { return bitset<N>(*this)<<=n; }    // make shifted set
        bitset operator>>(size_t n) const { return bitset<N>(*this)>>=n; }    // make shifted set
        // ...
    };
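As presented here, the subscript operator throws out_of_range if the subscript is out of range, and there is no unchecked
subscript operation. Be aware that the ISO standard specifies the out_of_range check for test(), set(pos), reset(pos), and flip(pos), but leaves operator[] unchecked, so use test() when a checked access is needed.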
The bitset& returned by these operations is *this. An operator returning a bitset (rather than a
bitset&) makes a copy of *this, applies its operation to that copy, and returns the result. In particular, >> and << really are shift operations rather than I/O operations. The output operator for a bitset is a << that takes an ostream and a bitset (§17.5.3.3).
When bits are shifted, a logical (rather than cyclic) shift is used. That implies that some bits
‘‘fall off the end’’ and that some positions get the default value 0. Note that because size_t is an
unsigned type, it is not possible to shift by a negative number. It does, however, imply that b<<-1
shifts by a very large positive value, thus leaving every bit of the bitset b with the value 0. Your
compiler should warn against this.
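The difference between the logical shift provided and a cyclic shift can be sketched on an 8-bit set. The typedef Byte and the three function names are hypothetical; rotated_left() is not a bitset operation but a composition written for this example.

```cpp
#include <bitset>
#include <cstddef>

typedef std::bitset<8> Byte;

// Logical shifts: vacated positions are filled with 0, bits shifted out are lost.
Byte shifted_left(Byte b, std::size_t n)  { return b << n; }
Byte shifted_right(Byte b, std::size_t n) { return b >> n; }

// A cyclic (rotating) left shift has to be composed by hand; n is reduced modulo 8.
// Shifting a bitset by its full width yields all zeros, so n == 0 is handled correctly.
Byte rotated_left(Byte b, std::size_t n)
{
    n %= 8;
    return (b << n) | (b >> (8 - n));
}
```

Starting from 10000001, a logical left shift by one gives 00000010 (the top bit falls off), whereas the rotation gives 00000011.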
17.5.3.3 Other Operations [cont.bitset.etc]
A bitset also supports common operations such as size(), ==, I/O, etc.:
    template<size_t N> class bitset {
    public:
        // ...
        unsigned long to_ulong() const;
        template <class Ch, class Tr, class A> basic_string<Ch,Tr,A> to_string() const;
        size_t count() const;                      // number of bits with value 1
        size_t size() const { return N; }          // number of bits
        bool operator==(const bitset& s) const;
        bool operator!=(const bitset& s) const;
        bool test(size_t pos) const;               // true if b[pos] is 1
        bool any() const;                          // true if any bit is 1
        bool none() const;                         // true if no bit is 1
    };
The operations to_ulong() and to_string() provide the inverse operations to the constructors. To
avoid nonobvious conversions, named operations were preferred over conversion operations. If the
value of the bitset has so many significant bits that it cannot be represented as an unsigned long,
to_ulong() throws overflow_error.
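The overflow behavior is easy to observe with a bitset that is wider than any unsigned long. This is a minimal sketch; fits_in_ulong is a hypothetical helper, not part of the library:

```cpp
#include <bitset>
#include <cstddef>
#include <stdexcept>

// Returns true if the value of b can be represented as an unsigned long,
// that is, if to_ulong() does not throw overflow_error.
template<std::size_t N>
bool fits_in_ulong(const std::bitset<N>& b)
{
    try {
        static_cast<void>(b.to_ulong());    // only the success or failure matters here
        return true;
    }
    catch (const std::overflow_error&) {
        return false;
    }
}
```

A bitset<200> holding 42 converts without trouble; once bit 199 is set, the conversion fails because no unsigned long has 200 bits.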
The to_string() operation produces a string of the desired type holding a sequence of '0' and
'1' characters; basic_string is the template used to implement strings (Chapter 20). We could use
to_string to write out the binary representation of an int:
void binary(int i)
{
    bitset<8*sizeof(int)> b = i;    // assume 8-bit byte (see also §22.2)
    cout << b.template to_string<char>() << '\n';
}
Unfortunately, invoking an explicitly qualified member template requires a rather elaborate and
rare syntax (§C.13.6).
In addition to the member functions, bitset provides binary & (and), | (or), ^ (exclusive or), and
the usual I/O operators:
template<size_t N> bitset<N> std::operator&(const bitset<N>&, const bitset<N>&);
template<size_t N> bitset<N> std::operator|(const bitset<N>&, const bitset<N>&);
template<size_t N> bitset<N> std::operator^(const bitset<N>&, const bitset<N>&);

template <class charT, class Tr, size_t N>
basic_istream<charT,Tr>& std::operator>>(basic_istream<charT,Tr>&, bitset<N>&);

template <class charT, class Tr, size_t N>
basic_ostream<charT,Tr>& std::operator<<(basic_ostream<charT,Tr>&, const bitset<N>&);
We can therefore write out a bitset without first converting it to a string. For example:
void binary(int i)
{
    bitset<8*sizeof(int)> b = i;    // assume 8-bit byte (see also §22.2)
    cout << b << '\n';
}
This prints the bits represented as 1s and 0s left-to-right, with the most significant bit leftmost.
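The 8-bit-byte assumption can be avoided by taking the number of bits in a byte from CHAR_BIT. The following variant is only a sketch under the assumption of a C++11 or later library, where to_string() can be called without template arguments; it returns the string instead of printing it:

```cpp
#include <bitset>
#include <climits>
#include <string>

// Binary representation of i, most significant bit first.
std::string binary(int i)
{
    // CHAR_BIT is the number of bits in a byte on this implementation.
    std::bitset<CHAR_BIT*sizeof(int)> b(static_cast<unsigned int>(i));
    return b.to_string();
}
```

The conversion to unsigned int makes the treatment of negative values explicit: -1 becomes the largest unsigned int, so every bit of the result is 1.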
17.5.4 Built-In Arrays [cont.array]
A built-in array supplies subscripting and random-access iterators in the form of ordinary pointers
(§2.7.2). However, an array doesn’t know its own size, so users must keep track of that size. In
general, an array doesn’t provide the standard member operations and types.
It is possible, and sometimes useful, to provide an ordinary array in a guise that provides the
notational convenience of a standard container without changing its low-level nature:
template<class T, int max> struct c_array {
    typedef T value_type;

    typedef T* iterator;
    typedef const T* const_iterator;

    typedef T& reference;
    typedef const T& const_reference;

    T v[max];
    operator T*() { return v; }

    reference operator[](size_t i) { return v[i]; }
    const_reference operator[](size_t i) const { return v[i]; }

    iterator begin() { return v; }
    const_iterator begin() const { return v; }

    iterator end() { return v+max; }
    const_iterator end() const { return v+max; }

    ptrdiff_t size() const { return max; }
};
The c_array template is not part of the standard library. It is presented here as a simple example of
how to fit a ‘‘foreign’’ container into the standard container framework. It can be used with standard algorithms (Chapter 18) using begin(), end(), etc. It can be allocated on the stack without
any indirect use of dynamic memory. Also, it can be passed to a C-style function that expects a
pointer. For example:
void f(int* p, int sz);    // C-style

void g()
{
    c_array<int,10> a;
    f(a,a.size());    // C-style use
    c_array<int,10>::iterator p = find(a.begin(),a.end(),777);    // C++/STL style use
    // ...
}
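To see both styles of use work on one object, here is a self-contained sketch. The c_array is abbreviated to the members the example needs, and c_sum, fill_iota, and demo are hypothetical names introduced only for this illustration:

```cpp
#include <algorithm>
#include <cstddef>

// Abbreviated c_array: only the members this sketch uses.
template<class T, int max> struct c_array {
    typedef T* iterator;
    typedef const T* const_iterator;

    T v[max];
    operator T*() { return v; }    // lets a c_array be passed where a T* is expected

    iterator begin() { return v; }
    const_iterator begin() const { return v; }
    iterator end() { return v+max; }
    const_iterator end() const { return v+max; }

    std::ptrdiff_t size() const { return max; }
};

// Hypothetical C-style function: adds up sz ints starting at p.
int c_sum(const int* p, int sz)
{
    int s = 0;
    for (int i = 0; i<sz; ++i) s += p[i];
    return s;
}

// Hypothetical helper: stores 0, 1, 2, ... in a.
void fill_iota(c_array<int,10>& a)
{
    int n = 0;
    for (c_array<int,10>::iterator p = a.begin(); p!=a.end(); ++p) *p = n++;
}

// C-style use and STL-style use of the same object.
bool demo()
{
    c_array<int,10> a;
    fill_iota(a);
    int total = c_sum(a, static_cast<int>(a.size()));                  // conversion to int*
    c_array<int,10>::iterator p = std::find(a.begin(), a.end(), 7);    // iterators are plain pointers
    return total==45 && p-a.begin()==7
        && std::find(a.begin(), a.end(), 777)==a.end();
}
```

The sketch fills the array through iterators rather than through a[i]. With both a conversion to T* and a member operator[] available, subscripting with an int can be ambiguous on some implementations, so the example sidesteps the question.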
17.6 Defining a New Container [cont.hash]
The standard containers provide a framework to which a user can add. Here, I show how to provide
a container in such a way that it can be used interchangeably with the standard containers wherever
reasonable. The implementation is meant to be realistic, but it is not optimal. The interface is chosen to be very close to that of existing, widely-available, and high-quality implementations of the
notion of a hash_map. Use the hash_map provided here to study the general issues. Then, use a
supported hash_map for production use.
17.6.1 Hash_map [cont.hash.map]
A map is an associative container that accepts almost any type as its element type. It does that by
relying only on a less-than operation for comparing elements (§17.4.1.5). However, if we know
more about a key type we can often reduce the time needed to find an element by providing a hash
function and implementing a container as a hash table.
A hash function is a function that quickly maps a value to an index in such a way that two distinct values rarely end up with the same index. Basically, a hash table is implemented by placing a
value at its index, unless another value is already placed there, and ‘‘nearby’’ if one is. Finding an
element placed at its index is fast, and finding one ‘‘nearby’’ is not slow, provided equality testing
is reasonably fast. Consequently, it is not uncommon for a hash_map to provide five to ten times
faster lookup than a map for larger containers, where the speed of lookup matters most. On the
other hand, a hash_map with an ill-chosen hash function can be much slower than a map.
There are many ways of implementing a hash table. The interface of hash_map is designed to
differ from that of the standard associative containers only where necessary to gain performance
through hashing. The most fundamental difference between a map and a hash_map is that a map
requires a < for its element type, while a hash_map requires an == and a hash function. Thus, a
hash_map must differ from a map in the non-default ways of creating one. For example:
map<string,int> m1;               // compare strings using <
map<string,int,Nocase> m2;        // compare strings using Nocase() (§17.1.4.1)

hash_map<string,int> hm1;             // hash using Hash<string>() (§17.6.2.3), compare using ==
hash_map<string,int,hfct> hm2;        // hash using hfct(), compare using ==
hash_map<string,int,hfct,eql> hm3;    // hash using hfct(), compare using eql
A container using hashed lookup is implemented using one or more tables. In addition to holding
its elements, the container needs to keep track of which values have been associated with each
hashed value (‘‘index’’ in the prior explanation); this is done using a ‘‘hash table.’’ Most hash
table implementations seriously degrade in performance if that table gets ‘‘too full,’’ say 75% full.
Consequently, the hash_map defined next is automatically resized when it gets too full. However,
resizing can be expensive, so it is useful to be able to specify an initial size.
Thus, a first approximation of a hash_map looks like this:
template<class Key, class T, class H = Hash<Key>,
    class EQ = equal_to<Key>, class A = allocator<T> >
class hash_map {
    // like map, except:

    typedef H Hasher;
    typedef EQ key_equal;

    hash_map(const T& dv =T(), size_type n =101, const H& hf =H(), const EQ& =EQ());
    template<class In> hash_map(In first, In last,
        const T& dv =T(), size_type n =101, const H& hf =H(), const EQ& =EQ());
};
Basically, this is the map interface (§17.4.1.4), with < replaced by == and a hash function.
The uses of a map in this book so far (§3.7.4, §6.1, §17.4.1) can be converted to use a
hash_map simply by changing the name map to hash_map. Often, a change between a map and a
hash_map can be eased by using typedef. For example:
typedef hash_map<string,record> Map;
Map dictionary;
The typedef is also useful to further hide the actual type of the dictionary from its users.
Though not strictly correct, I think of the tradeoff between a map and a hash_map as simply a
space/time tradeoff. If efficiency isn’t an issue, it isn’t worth wasting time choosing between them:
either will do well. For large and heavily used tables, hash_map has a definite speed advantage
and should be used unless space is at a premium. Even then, I might consider other ways of saving
space before choosing a ‘‘plain’’ map. Actual measurement is essential to avoid optimizing the
wrong code.
The key to efficient hashing is the quality of the hash function. If a good hash function isn’t
available, a map can easily outperform a hash_map. Hashing based on a C-style string, a string, or
an integer is usually very effective. However, it is worth remembering that the effectiveness of a
hash function critically depends on the actual values being hashed (§17.8[35]). A hash_map must
be used where < is not defined or is unsuitable for the intended key. Conversely, a hash function
does not define an ordering the way < does, so a map must be used when it is important to keep the
elements sorted.
Like map, hash_map provides find() to allow a programmer to determine whether a key has
been inserted.
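The same division of labor between a hash function and an equality test can be tried with std::unordered_map, the hashed container that C++11 later added to the standard library. It is not the hash_map developed in this section, and the sketch below is only an illustration under that substitution; Nocase_hash, Nocase_eq, Word_count, and count_word are hypothetical names:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical hash function object: hashes a string ignoring case.
struct Nocase_hash {
    std::size_t operator()(const std::string& s) const
    {
        std::size_t res = 0;
        for (std::string::size_type i = 0; i<s.size(); ++i) {
            int c = std::tolower(static_cast<unsigned char>(s[i]));
            res = (res<<1) ^ static_cast<std::size_t>(c);
        }
        return res;
    }
};

// Hypothetical equality: compares strings ignoring case.
struct Nocase_eq {
    bool operator()(const std::string& a, const std::string& b) const
    {
        if (a.size() != b.size()) return false;
        for (std::string::size_type i = 0; i<a.size(); ++i)
            if (std::tolower(static_cast<unsigned char>(a[i]))
                != std::tolower(static_cast<unsigned char>(b[i]))) return false;
        return true;
    }
};

// Keys that compare equal must also hash equal, so both ignore case.
typedef std::unordered_map<std::string,int,Nocase_hash,Nocase_eq> Word_count;

// Counts one more occurrence of w and returns the new count.
int count_word(Word_count& m, const std::string& w)
{
    return ++m[w];
}
```

After counting "Hash" and then "hash", the container holds a single element with the count 2, and find("HASH") locates it. No < on the keys is involved, and the elements are kept in no particular order.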
17.6.2 Representation and Construction [cont.hash.rep]
Many different implementations of a hash_map are possible. Here, I use one that is reasonably fast
and whose most important operations are fairly simple. The key operations are the constructors, the
lookup (operator []), the resize operation, and the operation removing an element (erase()).
The simple implementation chosen here relies on a hash table that is a vector of pointers to
entries. Each Entry holds a key, a value, a pointer to the next Entry (if any) with the same hash
value, and an erased bit:
[Figure: the hash table as a sequence of pointers to Entry records; each Entry is shown with the fields key, val, e (erased), and next.]
Expressed as declarations, it looks like this:
template<class Key, class T, class H = Hash<Key>,
    class EQ = equal_to<Key>, class A = allocator<T> >
class hash_map {
    // ...
private:    // representation
    struct Entry {
        key_type key;
        mapped_type val;
        Entry* next;    // hash overflow link
        bool erased;
        Entry(key_type k, mapped_type v, Entry* n)
            : key(k), val(v), next(n), erased(false) { }
    };

    vector<Entry> v;     // the actual entries
    vector<Entry*> b;    // the hash table: pointers into v
    // ...
};
Note the erased bit. The way several values with the same hash value are handled here makes it
hard to remove an element. So instead of actually removing an element when erase() is called, I
simply mark the element erased and ignore it until the table is resized.
In addition to the main data structure, a hash_map needs a few pieces of administrative data.
Naturally, each constructor needs to set up all of this. For example:
template<class Key, class T, class H = Hash<Key>,
    class EQ = equal_to<Key>, class A = allocator<T> >
class hash_map {
    // ...
    hash_map(const T& dv =T(), size_type n =101, const H& h =H(), const EQ& e =EQ())
        : default_value(dv), b(n), no_of_erased(0), hash(h), eq(e)
    {
        set_load();    // defaults
        v.reserve(max_load*b.size());    // reserve space for growth
    }

    void set_load(float m = 0.7, float g = 1.6) { max_load = m; grow = g; }

    // ...
private:
    float max_load;    // keep v.size()<=b.size()*max_load
    float grow;        // when necessary, resize(bucket_count()*grow)

    size_type no_of_erased;    // number of entries in v occupied by erased elements

    Hasher hash;     // hash function
    key_equal eq;    // equality

    const T default_value;    // default value used by []
};
The standard associative containers require that a mapped type have a default value (§17.4.1.7).
This restriction is not logically necessary and can be inconvenient. Making the default value an
argument allows us to write:
hash_map<string,Number> phone_book1;                 // default: Number()
hash_map<string,Number> phone_book2(Number(411));    // default: Number(411)
17.6.2.1 Lookup [cont.hash.lookup]
Finally, we can provide the crucial lookup operations:
template<class Key, class T, class H = Hash<Key>,
    class EQ = equal_to<Key>, class A = allocator<T> >
class hash_map {
    // ...
    mapped_type& operator[](const key_type& k);

    iterator find(const key_type&);
    const_iterator find(const key_type&) const;
    // ...
};
To find a value, operator[]() uses a hash function to find an index in the hash table for the key.
It then searches through the entries until it finds a matching key. The value in that Entry is the
one we are seeking. If it is not found, a default value is entered:
template<class Key, class T, class H = Hash<Key>,
    class EQ = equal_to<Key>, class A = allocator<T> >
mapped_type& hash_map::operator[](const key_type& k)
{
    size_type i = hash(k)%b.size();    // hash

    for(Entry* p = b[i]; p; p = p->next)    // search among entries hashed to i
        if (eq(k,p->key)) {    // found
            if (p->erased) {    // re-insert
                p->erased = false;
                no_of_erased--;
                return p->val = default_value;
            }
            return p->val;
        }
    // not found:

    if (b.size()*max_load < v.size()) {    // if ‘‘too full’’
        resize(b.size()*grow);    // grow
        return operator[](k);     // rehash
    }

    v.push_back(Entry(k,default_value,b[i]));    // add Entry
    b[i] = &v.back();    // point to new element

    return b[i]->val;
}
Unlike map, hash_map doesn’t rely on an equality test synthesized from a less-than operation
(§17.1.4.1). This is because of the call of eq() in the loop that looks through elements with the
same hash value. This loop is crucial to the performance of the lookup, and for common and obvious key types such as string and C-style strings, the overhead of an extra comparison could be significant.
I could have used a set<Entry> to represent the set of values that have the same hash value.
However, if we have a good hash function (hash()) and an appropriately-sized hash table (b), most
such sets will have exactly one element. Consequently, I linked the elements of that set together
using the next field of Entry (§17.8[27]).
Note that b keeps pointers to elements of v and that elements are added to v. In general,
push_back() can cause reallocation and thus invalidate pointers to elements (§16.3.5). However,
in this case constructors (§17.6.2) and resize() carefully reserve() enough space so that no unexpected reallocation happens.
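The guarantee that this design leans on can be checked in isolation: once reserve(n) has been called, push_back() does not reallocate until the vector holds more than n elements, so pointers to elements stay valid. A minimal sketch, with pointer_survives as a hypothetical test function:

```cpp
#include <cstddef>
#include <vector>

// Returns true if a pointer to the first element is still valid after the
// vector has grown to n elements, given that room for n was reserved first.
bool pointer_survives(std::size_t n)
{
    std::vector<int> v;
    v.reserve(n);    // capacity() is at least n from here on
    v.push_back(0);
    const int* first = &v[0];
    for (std::size_t i = 1; i<n; ++i)
        v.push_back(static_cast<int>(i));    // stays within the reserved capacity
    return first == &v[0];
}
```

Without the reserve() call, nothing could be said about first after the loop: any push_back() might have moved the elements.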
17.6.2.2 Erase and Rehash [cont.hash.erase]
Hashed lookup becomes inefficient when the table gets too full. To lower the chance of that happening, the table is automatically resize()d by the subscript operator. The set_load() (§17.6.2)
provides a way of controlling when and how resizing happens. Other functions are provided to
allow a programmer to observe the state of a hash_map:
template<class Key, class T, class H = Hash<Key>,
    class EQ = equal_to<Key>, class A = allocator<T> >
class hash_map {
    // ...
    void resize(size_type n);    // make the size of the hash table n

    void erase(iterator position);    // erase the element pointed to

    size_type size() const { return v.size()-no_of_erased; }    // number of elements

    size_type bucket_count() const { return b.size(); }    // size of hash table

    Hasher hash_fun() const { return hash; }    // hash function used
    key_equal key_eq() const { return eq; }     // equality used
// ...
};
The resize() operation is essential, reasonably simple, and potentially expensive:
template<class Key, class T, class H = Hash<Key>,
    class EQ = equal_to<Key>, class A = allocator<T> >
void hash_map::resize(size_type s)
{
    if (s <= b.size()) return;
    b.resize(s);    // add s-b.size() pointers
    fill(b.begin(),b.end(),static_cast<Entry*>(0));    // empty every bucket; clear() would leave b.size()==0
    v.reserve(s*max_load);    // if v needs to reallocate, let it happen now

    if (no_of_erased) {    // really remove erased elements
        for (size_type i = v.size(); 0<i; i--)    // v[i-1] runs from the back; an unsigned i never goes below 0
            if (v[i-1].erased) {
                v.erase(v.begin()+(i-1));    // erase() takes an iterator, which need not be a pointer
                if (--no_of_erased == 0) break;
            }
    }

    for (size_type i = 0; i<v.size(); i++) {    // rehash:
        size_type ii = hash(v[i].key)%b.size();    // hash
        v[i].next = b[ii];    // link
        b[ii] = &v[i];
    }
}
If necessary, a user can ‘‘manually’’ call resize() to ensure that the cost is incurred at a predictable
time. I have found a resize() operation important in some applications, but it is not fundamental
to the notion of hash tables. Some implementation strategies don’t need it.
All of the real work is done elsewhere (and only if a hash_map is resized), so erase() is trivial:
template<class Key, class T, class H = Hash<Key>,
    class EQ = equal_to<Key>, class A = allocator<T> >
void hash_map::erase(iterator p)    // erase the element pointed to
{
    if (p->erased == false) no_of_erased++;
    p->erased = true;
}
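The interplay of lookup, marking, and rehashing is easier to follow in a version small enough to run. The class below is a deliberately simplified sketch and not the hash_map of this section: it is not a template (string keys, int values), it links entries by index into v instead of by pointer so that it does not depend on reserve(), and it erases by key. Tiny_hash_map, lookup, rehash, and contains are hypothetical names; the load factor of 0.7 and the growth factor of 2 are arbitrary choices:

```cpp
#include <cstddef>
#include <string>
#include <vector>

class Tiny_hash_map {
    struct Entry {
        std::string key;
        int val;
        int next;       // index in v of the next entry in the same bucket, or -1
        bool erased;
    };
    std::vector<Entry> v;    // the actual entries
    std::vector<int> b;      // the hash table: indices into v, -1 means empty
    std::size_t no_of_erased;

    static std::size_t hash(const std::string& s)
    {
        std::size_t res = 0;
        for (std::string::size_type i = 0; i<s.size(); ++i)
            res = (res<<1) ^ static_cast<unsigned char>(s[i]);
        return res;
    }

    int lookup(const std::string& k) const    // index of the entry for k (erased or not), or -1
    {
        for (int p = b[hash(k)%b.size()]; p != -1; p = v[p].next)
            if (v[p].key == k) return p;
        return -1;
    }

    void rehash(std::size_t n)    // drop erased entries, then rebuild a table of n buckets
    {
        std::vector<Entry> live;
        for (std::size_t i = 0; i<v.size(); ++i)
            if (!v[i].erased) live.push_back(v[i]);
        v.swap(live);
        no_of_erased = 0;
        b.assign(n, -1);
        for (std::size_t i = 0; i<v.size(); ++i) {
            std::size_t ii = hash(v[i].key)%b.size();
            v[i].next = b[ii];
            b[ii] = static_cast<int>(i);
        }
    }
public:
    explicit Tiny_hash_map(std::size_t n = 101)    // n must be nonzero
        : b(n, -1), no_of_erased(0) { }

    int& operator[](const std::string& k)    // reference is valid until the next insertion
    {
        int p = lookup(k);
        if (p != -1) {
            if (v[p].erased) {    // re-insert with the default value 0
                v[p].erased = false;
                --no_of_erased;
                v[p].val = 0;
            }
            return v[p].val;
        }
        if (b.size()*7/10 < v.size()) rehash(b.size()*2);    // too full: grow
        std::size_t i = hash(k)%b.size();
        Entry e = { k, 0, b[i], false };
        v.push_back(e);
        b[i] = static_cast<int>(v.size()-1);
        return v.back().val;
    }

    void erase(const std::string& k)    // mark only; really removed at the next rehash
    {
        int p = lookup(k);
        if (p != -1 && !v[p].erased) { v[p].erased = true; ++no_of_erased; }
    }

    bool contains(const std::string& k) const
    {
        int p = lookup(k);
        return p != -1 && !v[p].erased;
    }

    std::size_t size() const { return v.size()-no_of_erased; }
};
```

Erasing a key costs one lookup and one assignment; the entry stays in v until the table next grows. Subscripting an erased key revives the entry with the default value, just as the subscript operator of the text does.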
17.6.2.3 Hashing [cont.hasher]
To complete hash_map::operator[](), we need to define hash() and eq(). For reasons that
will become clear in §18.4, a hash function is best defined as operator()() for a function object:
template <class T> struct Hash : unary_function<T, size_t> {
    size_t operator()(const T& key) const;
};
A good hash function takes a key and returns an integer so that different keys yield different integers with high probability. Choosing a good hash function is an art. However, exclusive-or’ing the
bits of the key’s representation into an integer is often acceptable:
template <class T> size_t Hash<T>::operator()(const T& key) const
{
    size_t res = 0;

    size_t len = sizeof(T);
    const char* p = reinterpret_cast<const char*>(&key);

    while (len--) res = (res<<1)^*p++;    // use bytes of key’s representation
    return res;
}
The use of reinterpret_cast (§6.2.7) is a good indication that something unsavory is going on and
that we can do better in cases when we know more about the object being hashed. In particular, if
an object contains a pointer, if the object is large, or if the alignment requirements on members
have left unused space (‘‘holes’’) in the representation, we can usually do better (see §17.8[29]).
A C-style string is a pointer (to the characters), and a string contains a pointer. Consequently,
specializations are in order:
size_t Hash<char*>::operator()(const char* key) const
{
    size_t res = 0;

    while (*key) res = (res<<1)^*key++;    // use int value of characters

    return res;
}

template <class C>
size_t Hash< basic_string<C> >::operator()(const basic_string<C>& key) const
{
    size_t res = 0;

    typedef typename basic_string<C>::const_iterator CI;
    CI p = key.begin();
    CI end = key.end();

    while (p!=end) res = (res<<1)^*p++;    // use int value of characters

    return res;
}
An implementation of hash_map will include hash functions for at least integer and string keys.
For more adventurous key types, the user may have to help out with suitable specializations.
Experimentation supported by good measurement is essential when choosing a hash function. Intuition tends to work poorly in this area.
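A first step in such experimentation is to compute a few hash values by hand and compare them with what the function produces. The following free-standing sketch applies the shift-and-xor scheme to a std::string; shift_xor_hash and bucket_of are hypothetical names, and the conversion to unsigned char is my assumption to keep the result independent of whether char is signed:

```cpp
#include <cstddef>
#include <string>

// Shift-and-xor hash over the characters of s.
std::size_t shift_xor_hash(const std::string& s)
{
    std::size_t res = 0;
    for (std::string::const_iterator p = s.begin(); p != s.end(); ++p)
        res = (res<<1) ^ static_cast<unsigned char>(*p);
    return res;
}

// Index of s in a hash table of n buckets; n must be nonzero.
std::size_t bucket_of(const std::string& s, std::size_t n)
{
    return shift_xor_hash(s)%n;
}
```

On an implementation using ASCII, "ab" hashes to (97<<1)^98, which is 194^98 = 160, and so lands in bucket 160%101 = 59 of a 101-bucket table. The order of the characters matters: "ba" gives (98<<1)^97 = 165.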
To complete the hash_map, we need to define the iterators and a minor host of trivial functions;
this is left as an exercise (§17.8[34]).
17.6.3 Other Hashed Associative Containers [cont.hash.other]
For consistency and completeness, the hash_map should have matching hash_set,
hash_multimap, and hash_multiset. Their definitions are obvious from those of hash_map, map,
multimap, set, and multiset, so I leave these as an exercise (§17.8[34]). Good public domain and
commercial implementations of these hashed associative containers are available. For real programs, these should be preferred to locally concocted versions, such as mine.
17.7 Advice [cont.advice]
[1] By default, use vector when you need a container; §17.1.
[2] Know the cost (complexity, big-O measure) of every operation you use frequently; §17.1.2.
[3] The interface, implementation, and representation of a container are distinct concepts. Don’t
confuse them; §17.1.3.
[4] You can sort and search according to a variety of criteria; §17.1.4.1.
[5] Do not use a C-style string as a key unless you supply a suitable comparison criterion;
§17.1.4.1.
[6] You can define a comparison criterion so that equivalent, yet different, key values map to the
same key; §17.1.4.1.
[7] Prefer operations on the end of a sequence (back-operations) when inserting and deleting elements; §17.1.4.1.
[8] Use list when you need to do many insertions and deletions from the front or the middle of a
container; §17.2.2.
[9] Use map or multimap when you primarily access elements by key; §17.4.1.
[10] Use the minimal set of operations to gain maximum flexibility; §17.1.1.
[11] Prefer a map to a hash_map if the elements need to be kept in order; §17.6.1.
[12] Prefer a hash_map to a map when speed of lookup is essential; §17.6.1.
[13] Prefer a hash_map to a map if no less-than operation can be defined for the elements; §17.6.1.
[14] Use find() when you need to check if a key is in an associative container; §17.4.1.6.
[15] Use equal_range() to find all elements of a given key in an associative container; §17.4.1.6.
[16] Use multimap when several values need to be kept for a single key; §17.4.2.
[17] Use set or multiset when the key itself is the only value you need to keep; §17.4.3.
17.8 Exercises [cont.exercises]
The solutions to several exercises for this chapter can be found by looking at the source text of an
implementation of the standard library. Do yourself a favor: try to find your own solutions before
looking to see how your library implementer approached the problems. Then, look at your
implementation’s version of the containers and their operations.
1. (∗2.5) Understand the O() notation (§17.1.2). Do some measurements of operations on standard containers to determine the constant factors involved.
2. (∗2) Many phone numbers don’t fit into a long. Write a phone_number type and a class that
provides a set of useful operations on a container of phone_numbers.
3. (∗2) Write a program that lists the distinct words in a file in alphabetical order. Make two versions: one in which a word is simply a whitespace-separated sequence of characters and one in
which a word is a sequence of letters separated by any sequence of non-letters.
4. (∗2.5) Implement a simple solitaire card game.
5. (∗1.5) Implement a simple test of whether a word is a palindrome (that is, if its representation is
symmetric; examples are ada, otto, and tut). Implement a simple test of whether an integer is a
palindrome. Implement a simple test of whether a sentence is a palindrome. Generalize.
6. (∗1.5) Define a queue using (only) two stacks.
7. (∗1.5) Define a stack similar to stack (§17.3.1), except that it doesn’t copy its underlying container and that it allows iteration over its elements.
8. (∗3) Your computer will have support for concurrent activities through the concept of a thread,
task, or process. Figure out how that is done. The concurrency mechanism will have a concept
of locking to prevent two tasks accessing the same memory simultaneously. Use the machine’s
locking mechanism to implement a lock class.
9. (∗2.5) Read a sequence of dates such as Dec85, Dec50, Jan76, etc., from input and then output
them so that later dates come first. The format of a date is a three-letter month followed by a
two-digit year. Assume that all the years are from the same century.
10. (∗2.5) Generalize the input format for dates to allow dates such as Dec1985, 12/3/1990,
(Dec,30,1950), 3/6/2001, etc. Modify exercise §17.8[9] to cope with the new formats.
11. (∗1.5) Use a bitset to print the binary values of some numbers, including 0, 1, -1, 18, -18, and
the largest positive int.
12. (∗1.5) Use bitset to represent which students in a class were present on a given day. Read the
bitsets for a series of 12 days and determine who was present every day. Determine which students were present at least 8 days.
13. (∗1.5) Write a List of pointers that deletes the objects pointed to when it itself is destroyed or if
the element is removed from the List.
14. (∗1.5) Given a stack object, print its elements in order (without changing the value of the stack).
15. (∗2.5) Complete hash_map (§17.6.1). This involves implementing find() and equal_range()
and devising a way of testing the completed template. Test hash_map with at least one key
type for which the default hash function would be unsuitable.
16. (∗2.5) Implement and test a list in the style of the standard list.
17. (∗2) Sometimes, the space overhead of a list can be a problem. Write and test a singly-linked
list in the style of a standard container.
18. (∗2.5) Implement a list that is like a standard list, except that it supports subscripting. Compare
the cost of subscripting for a variety of lists to the cost of subscripting a vector of the same
length.
19. (∗2) Implement a template function that merges two containers.
20. (∗1.5) Given a C-style string, determine whether it is a palindrome. Determine whether an initial sequence of at least three words in the string is a palindrome.
21. (∗2) Read a sequence of (name,value) pairs and produce a sorted list of
(name,total,mean,median) 4-tuples. Print that list.
22. (∗2.5) Determine the space overhead of each of the standard containers on your implementation.
23. (∗3.5) Consider what would be a reasonable implementation strategy for a hash_map that
needed to use minimal space. Consider what would be a reasonable implementation strategy for
a hhaasshh__m
maapp that needed to use minimal lookup time. In each case, consider what operations
you might omit so as to get closer to the ideal (no space overhead and no lookup overhead,
respectively). Hint: There is an enormous literature on hash tables.
24. (∗2) Devise a strategy for dealing with overflow in hash_map (different values hashing to the
same hash value) that makes equal_range() trivial to implement.
25. (∗2.5) Estimate the space overhead of a hash_map and then measure it. Compare the estimate
to the measurements. Compare the space overhead of your hash_map and your
implementation’s map.
26. (∗2.5) Profile your hash_map to see where the time is spent. Do the same for your
implementation’s map and a widely-distributed hash_map.
27. (∗2.5) Implement a hash_map based on a vector<map<K,V>*> so that each map holds all
keys that have the same hash value.
28. (∗3) Implement a hash_map using Splay trees (see D. Sleator and R. E. Tarjan: Self-Adjusting
Binary Search Trees, JACM, Vol. 32. 1985).
29. (∗2) Given a data structure describing a string-like entity:
    struct St {
        int size;
        char type_indicator;
        char* buf;    // point to size characters
        St(const char* p);    // allocate and fill buf
    };
Create 1000 Sts and use them as keys for a hash_map. Devise a program to measure the performance of the hash_map. Write a hash function (a Hash; §17.6.2.3) specifically for St keys.
30. (∗2) Give at least four different ways of removing the erased elements from a hash_map. You
should use a standard library algorithm (§3.8, Chapter 18) to avoid an explicit loop.
31. (∗3) Implement a hash_map that erases elements immediately.
32. (∗2) The hash function presented in §17.6.2.3 doesn’t always consider all of the representation
of a key. When will part of a representation be ignored? Write a hash function that always considers all of the representation of a key. Give an example of when it might be wise to ignore
part of a key and write a hash function that computes its value based only on the part of a key
considered relevant.
33. (∗2.5) The code of hash functions tends to be similar: a loop gets more data and then hashes it.
Define a Hash (§17.6.2.3) that gets its data by repeatedly calling a function that a user can
define on a per-type basis. For example:
    size_t res = 0;
    while (size_t v = hash(key)) res = (res<<3)^v;
Here, a user can define hash(K) for each type K that needs to be hashed.
34. (∗3) Given some implementation of hash_map, implement hash_multimap, hash_set, and
hash_multiset.
35. (∗2.5) Write a hash function intended to map uniformly distributed int values into hash values
intended for a table size of about 1024. Given that function, devise a set of 1024 key values, all
of which map to the same value.
18
Algorithms and Function Objects
Form is liberating.
– engineers' proverb
Introduction — overview of standard algorithms — sequences — function objects —
predicates — arithmetic objects — binders — member function objects — for_each —
finding elements — count — comparing sequences — searching — copying — transform
— replacing and removing elements — filling a sequence — reordering — swap
— sorted sequences — binary_search — merge — set operations — min and max —
heaps — permutations — C-style algorithms — advice — exercises.
18.1 Introduction [algo.intro]
A container by itself is really not that interesting. To be genuinely useful, a container must be supported by basic operations such as finding its size, iterating, copying, sorting, and searching for elements. Fortunately, the standard library provides algorithms to serve the most common and basic
needs that users have of containers.
This chapter summarizes the standard algorithms and gives a few examples of their use, presents the key principles and techniques used to express the algorithms in C++, and explains
a few key algorithms in more detail.
Function objects provide a mechanism through which a user can customize the behavior of the
standard algorithms. Function objects supply key information that an algorithm needs in order to
operate on a user’s data. Consequently, emphasis is placed on how function objects can be defined
and used.
18.2 Overview of Standard Library Algorithms [algo.summary]
At first glimpse, the standard library algorithms can appear overwhelming. However, there are just
60 of them. I have seen classes with more member functions. Furthermore, many algorithms share
a common basic behavior and a common interface style that eases understanding. As with language features, a programmer should use the algorithms actually needed and understood – and only
those. There are no awards for using the highest number of standard algorithms in a program. Nor
are there awards for using standard algorithms in the most clever and obscure way. Remember, a
primary aim of writing code is to make its meaning clear to the next person reading it – and that
person just might be yourself a few years hence. On the other hand, when doing something with
elements of a container, consider whether that action could be expressed as an algorithm in the style
of the standard library. That algorithm might already exist. If you don’t consider work in terms of
general algorithms, you will reinvent the wheel.
Each algorithm is expressed as a template function (§13.3) or a set of template functions. In
that way, an algorithm can operate on many kinds of sequences containing elements of a variety of
types. Algorithms that return an iterator (§19.1) as a result generally use the end of an input
sequence to indicate failure. For example:
    void f(list<string>& ls)
    {
        list<string>::const_iterator p = find(ls.begin(),ls.end(),"Fred");
        if (p == ls.end()) {
            // didn’t find "Fred"
        }
        else {
            // here, p points to "Fred"
        }
    }
The algorithms do not perform range checking on their input or output. Range errors must be prevented by other means (§18.3.1, §19.3). When an algorithm returns an iterator, that iterator is of
the same type as one of its inputs. In particular, an algorithm’s arguments control whether it
returns a const_iterator or a non-const iterator. For example:
    void f(list<int>& li, const list<string>& ls)
    {
        list<int>::iterator p = find(li.begin(),li.end(),42);
        list<string>::const_iterator q = find(ls.begin(),ls.end(),"Ring");
    }
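As a minimal sketch of these two conventions (end of sequence as the failure indicator, and the constness of the input deciding the kind of iterator returned), consider the following self-contained functions. The names has_name(), replace_first(), and demo_replace_first() are illustrative assumptions, not part of the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>

// Hypothetical helper: true if s occurs in ls.
// ls is const, so find() yields a const_iterator.
bool has_name(const std::list<std::string>& ls, const std::string& s)
{
    std::list<std::string>::const_iterator p = std::find(ls.begin(), ls.end(), s);
    return p != ls.end();    // end() signals "not found"
}

// Hypothetical helper: overwrite the first occurrence of old_val with new_val.
// li is non-const, so find() yields an iterator through which we can assign.
bool replace_first(std::list<int>& li, int old_val, int new_val)
{
    std::list<int>::iterator p = std::find(li.begin(), li.end(), old_val);
    if (p == li.end()) return false;    // nothing to replace
    *p = new_val;
    return true;
}

// Small demonstration; returns the first element after the replacement.
int demo_replace_first()
{
    std::list<int> li;
    li.push_back(42);
    li.push_back(7);
    bool done = replace_first(li, 42, 1);
    assert(done);
    return li.front();
}
```

Trying to assign through the const_iterator in has_name() would be rejected at compile time.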
The algorithms in the standard library cover the most common general operations on containers
such as traversals, sorting, searching, and inserting and removing elements. The standard algorithms are all in the std namespace and their declarations are found in <algorithm>. Interestingly,
most of the really common algorithms are so simple that the template functions are typically inline.
This implies that the loops expressed by the algorithms benefit from aggressive per-function optimization.
The standard function objects are also in namespace std, but their declarations are found in
<functional>. The function objects are designed to be easy to inline.
Nonmodifying sequence operations are used to extract information from a sequence or to find
the positions of elements in a sequence:
____________________________________________________________________
Nonmodifying Sequence Operations (§18.5) <algorithm>
____________________________________________________________________
for_each()        Do operation for each element in a sequence.
find()            Find first occurrence of a value in a sequence.
find_if()         Find first match of a predicate in a sequence.
find_first_of()   Find a value from one sequence in another.
adjacent_find()   Find an adjacent pair of values.
count()           Count occurrences of a value in a sequence.
count_if()        Count matches of a predicate in a sequence.
mismatch()        Find the first elements for which two sequences differ.
equal()           True if the elements of two sequences are pairwise equal.
search()          Find the first occurrence of a sequence as a subsequence.
find_end()        Find the last occurrence of a sequence as a subsequence.
search_n()        Find the first run of n consecutive occurrences of a value in a sequence.
____________________________________________________________________
Most algorithms allow a user to specify the actual action performed for each element or pair of elements. This makes the algorithms much more general and useful than they appear at first glance.
In particular, a user can supply the criteria used for equality and difference (§18.4.2). Where reasonable, the most common and useful action is provided as a default.
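To illustrate supplying a criterion instead of relying on the default, here is a small sketch. The predicates Nocase_eq and Is_odd and the functions equal_nocase(), count_odd(), and demo_criteria() are names invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Binary predicate: compare two characters ignoring case.
struct Nocase_eq {
    bool operator()(char a, char b) const
    {
        return std::tolower(static_cast<unsigned char>(a))
            == std::tolower(static_cast<unsigned char>(b));
    }
};

// Unary predicate: is the value odd?
struct Is_odd {
    bool operator()(int x) const { return x%2 != 0; }
};

// equal() with a user-supplied notion of equality.
bool equal_nocase(const std::string& a, const std::string& b)
{
    return a.size()==b.size()
        && std::equal(a.begin(), a.end(), b.begin(), Nocase_eq());
}

// count_if() counts matches of a predicate rather than occurrences of a value.
std::ptrdiff_t count_odd(const std::vector<int>& v)
{
    return std::count_if(v.begin(), v.end(), Is_odd());
}

// Usage sketch: the expected results are stated as assertions.
void demo_criteria()
{
    assert(equal_nocase("Fred", "fRED"));
    int a[] = { 1, 2, 3 };
    assert(count_odd(std::vector<int>(a, a+3)) == 2);
}
```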
Modifying sequence operations have little in common beyond the obvious fact that they might
change the values of elements of a sequence:
______________________________________________________________________
Modifying Sequence Operations (§18.6) <algorithm>
______________________________________________________________________
transform()         Apply an operation to every element in a sequence.
copy()              Copy a sequence starting with its first element.
copy_backward()     Copy a sequence starting with its last element.
swap()              Swap two elements.
iter_swap()         Swap two elements pointed to by iterators.
swap_ranges()       Swap elements of two sequences.
replace()           Replace elements with a given value.
replace_if()        Replace elements matching a predicate.
replace_copy()      Copy sequence replacing elements with a given value.
replace_copy_if()   Copy sequence replacing elements matching a predicate.
fill()              Replace every element with a given value.
fill_n()            Replace first n elements with a given value.
generate()          Replace every element with the result of an operation.
generate_n()        Replace first n elements with the result of an operation.
remove()            Remove elements with a given value.
remove_if()         Remove elements matching a predicate.
remove_copy()       Copy a sequence removing elements with a given value.
remove_copy_if()    Copy a sequence removing elements matching a predicate.
unique()            Remove equal adjacent elements.
unique_copy()       Copy a sequence removing equal adjacent elements.
reverse()           Reverse the order of elements.
reverse_copy()      Copy a sequence into reverse order.
rotate()            Rotate elements.
rotate_copy()       Copy a sequence into a rotated sequence.
random_shuffle()    Move elements into a uniform distribution.
______________________________________________________________________
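One entry in this table deserves a small illustration: remove() works on a sequence, not on a container, so it cannot change the container's size. It moves the elements to be kept to the front and returns the new logical end. The sketch below shows this together with a _copy variant; drop_value() and reversed() are names made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Drop every element equal to val and return the new size.
std::size_t drop_value(std::vector<int>& v, int val)
{
    // remove() moves the elements to keep to the front and returns
    // an iterator to the new logical end; v.size() is unchanged by it
    std::vector<int>::iterator new_end = std::remove(v.begin(), v.end(), val);
    v.erase(new_end, v.end());    // the container itself discards the tail
    assert(std::find(v.begin(), v.end(), val) == v.end());    // postcondition
    return v.size();
}

// Copy v into a new vector in reverse order, leaving v untouched.
std::vector<int> reversed(const std::vector<int>& v)
{
    std::vector<int> res(v.size());    // the output sequence must be large enough
    std::reverse_copy(v.begin(), v.end(), res.begin());
    return res;
}
```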
Every good design shows traces of the personal traits and interests of its designer. The containers
and algorithms in the standard library clearly reflect a strong concern for classical data structures
and the design of algorithms. The standard library provides not only the bare minimum of containers and algorithms needed by essentially every programmer. It also includes many of the tools used
to provide those algorithms and needed to extend the library beyond that minimum.
The emphasis here is not on the design of algorithms or even on the use of any but the simplest
and most obvious algorithms. For information on the design and analysis of algorithms, you
should look elsewhere (for example, [Knuth,1968] and [Tarjan,1983]). Instead, this chapter lists
the algorithms offered by the standard library and explains how they are expressed in C++. This
focus allows someone who understands algorithms to use the library well and to extend it in the
spirit in which it was built.
The standard library provides a variety of operations for sorting, searching, and manipulating
sequences based on an ordering:
______________________________________________________________________
Sorted Sequences (§18.7) <algorithm>
______________________________________________________________________
sort()                Sort with good average efficiency.
stable_sort()         Sort maintaining order of equal elements.
partial_sort()        Get the first part of sequence into order.
partial_sort_copy()   Copy getting the first part of output into order.
nth_element()         Put the nth element in its proper place.
lower_bound()         Find the first occurrence of a value.
upper_bound()         Find the first element larger than a value.
equal_range()         Find a subsequence with a given value.
binary_search()       Is a given value in a sorted sequence?
merge()               Merge two sorted sequences.
inplace_merge()       Merge two consecutive sorted subsequences.
partition()           Place elements matching a predicate first.
stable_partition()    Place elements matching a predicate first,
                      preserving relative order.
______________________________________________________________________
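The searching operations in this table assume that their input is already sorted. A minimal sketch of that interplay follows; ascending(), sorted_count(), and sorted_contains() are hypothetical helper names, and ascending() merely states the precondition in terms of adjacent_find().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// True if v is in ascending order: no element is greater than its successor.
bool ascending(const std::vector<int>& v)
{
    return std::adjacent_find(v.begin(), v.end(), std::greater<int>()) == v.end();
}

// Sort v, then return how many of its elements equal val.
std::ptrdiff_t sorted_count(std::vector<int>& v, int val)
{
    typedef std::vector<int>::iterator VI;
    std::sort(v.begin(), v.end());
    std::pair<VI,VI> r = std::equal_range(v.begin(), v.end(), val);
    return r.second - r.first;    // length of the subsequence of equal values
}

// Is val in v? Precondition: v is sorted.
bool sorted_contains(const std::vector<int>& v, int val)
{
    assert(ascending(v));
    return std::binary_search(v.begin(), v.end(), val);
}
```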
______________________________________________________________________
Set Algorithms (§18.7.5) <algorithm>
______________________________________________________________________
includes()                   True if a sequence is a subsequence of another.
set_union()                  Construct a sorted union.
set_intersection()           Construct a sorted intersection.
set_difference()             Construct a sorted sequence of elements
                             in the first but not the second sequence.
set_symmetric_difference()   Construct a sorted sequence of elements
                             in one but not both sequences.
______________________________________________________________________
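The set algorithms operate on sorted sequences, not on set containers. The following sketch assumes sorted vectors as input and uses a back_inserter so that the output sequence grows as needed; the function names are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// True if v is in ascending order (the precondition of the set algorithms).
bool ascending(const std::vector<int>& v)
{
    return std::adjacent_find(v.begin(), v.end(), std::greater<int>()) == v.end();
}

// Sorted sequence of the elements found in both a and b.
std::vector<int> common_elements(const std::vector<int>& a, const std::vector<int>& b)
{
    assert(ascending(a) && ascending(b));
    std::vector<int> res;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(res));
    return res;
}

// True if every element of part also occurs in all.
bool contains_all(const std::vector<int>& all, const std::vector<int>& part)
{
    assert(ascending(all) && ascending(part));
    return std::includes(all.begin(), all.end(), part.begin(), part.end());
}
```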
Heap operations keep a sequence in a state that makes it easy to sort when necessary:
_____________________________________________________
Heap Operations (§18.8) <algorithm>
_____________________________________________________
make_heap()   Make sequence ready to be used as a heap.
push_heap()   Add element to heap.
pop_heap()    Remove element from heap.
sort_heap()   Sort the heap.
_____________________________________________________
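The heap operations also work on sequences, so the container has to do the actual growing and shrinking. A sketch, assuming a vector<int> as the underlying storage and the invented names heap_add() and heap_take():

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Add x to the heap held in v.
void heap_add(std::vector<int>& v, int x)
{
    v.push_back(x);                        // place the new element last ...
    std::push_heap(v.begin(), v.end());    // ... then restore the heap property
}

// Remove and return the largest element of the non-empty heap held in v.
int heap_take(std::vector<int>& v)
{
    assert(!v.empty());
    std::pop_heap(v.begin(), v.end());     // moves the largest element to the back
    int top = v.back();
    v.pop_back();                          // the container discards it
    return top;
}
```

Starting from make_heap(), repeated calls of heap_take() yield the elements in descending order; sort_heap() instead leaves the remaining elements sorted in ascending order.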
The library provides a few algorithms for selecting elements based on a comparison:
________________________________________________________________
Minimum and Maximum (§18.9) <algorithm>
________________________________________________________________
min()                       Smaller of two values.
max()                       Larger of two values.
min_element()               Smallest value in sequence.
max_element()               Largest value in sequence.
lexicographical_compare()   Lexicographically first of two sequences.
________________________________________________________________
Finally, the library provides ways of permuting a sequence:
______________________________________________________________
Permutations (§18.10) <algorithm>
______________________________________________________________
next_permutation()   Next permutation in lexicographical order.
prev_permutation()   Previous permutation in lexicographical order.
______________________________________________________________
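As a brief sketch of how the permutation algorithms are typically driven: next_permutation() returns false once the last permutation has been passed, so a do-loop starting from the sorted sequence visits every permutation once. The function names below are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// All permutations of the characters of s, in lexicographical order.
std::vector<std::string> all_permutations(std::string s)
{
    std::vector<std::string> res;
    std::sort(s.begin(), s.end());    // start from the first permutation
    do {
        res.push_back(s);
    } while (std::next_permutation(s.begin(), s.end()));    // false after the last one
    return res;
}

// Usage sketch: three distinct characters give 3! = 6 permutations.
void demo_permutations()
{
    std::vector<std::string> v = all_permutations("bca");
    assert(v.size() == 6);
    assert(v.front() == "abc");
    assert(v.back() == "cba");
}
```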
In addition, a few generalized numerical algorithms are provided in <numeric> (§22.6).
In the description of algorithms, the template parameter names are significant. In, Out, For, Bi,
and Ran mean input iterator, output iterator, forward iterator, bidirectional iterator, and random-access iterator, respectively (§19.2.1). Pred means unary predicate, BinPred means binary predicate (§18.4.2), Cmp means a comparison function (§17.1.4.1, §18.7.1), Op means unary operation,
and BinOp means binary operation (§18.4). Conventionally, much longer names have been used
for template parameters. However, I find that after only a brief acquaintance with the standard
library, those long names decrease readability rather than enhancing it.
A random-access iterator can be used as a bidirectional iterator, a bidirectional iterator as a forward iterator, and a forward iterator as an input or an output iterator (§19.2.1). Passing a type that
doesn’t provide the required operations will cause template-instantiation-time errors (§C.13.7).
Providing a type that has the right operations with the wrong semantics will cause unpredictable
run-time behavior (§17.1.4).
18.3 Sequences and Containers [algo.seq]
It is a good general principle that the most common use of something should also be the shortest,
the easiest to express, and the safest. The standard library violates this principle in the name of
generality. For a standard library, generality is essential. For example, we can find the first two
occurrences of 42 in a list like this:
    void f(list<int>& li)
    {
        list<int>::iterator p = find(li.begin(),li.end(),42);    // first occurrence
        if (p != li.end()) {
            list<int>::iterator q = find(++p,li.end(),42);    // second occurrence
            // ...
        }
        // ...
    }
Had find() been expressed as an operation on a container, we would have needed some additional
mechanism for finding the second occurrence. Importantly, generalizing such an ‘‘additional
mechanism’’ for every container and every algorithm is hard. Instead, standard library algorithms
work on sequences of elements. That is, the input of an algorithm is expressed as a pair of iterators
that delineate a sequence. The first iterator refers to the first element of the sequence, and the second refers to a point one-beyond-the-last element (§3.8, §19.2). Such a sequence is called ‘‘half
open’’ because it includes the first value mentioned and not the second. A half-open sequence
allows many algorithms to be expressed without making the empty sequence a special case.
A sequence – especially a sequence in which random access is possible – is often called a
range. Traditional mathematical notations for a half-open range are [first,last) and [first,last[.
Importantly, a sequence can be the elements of a container or a subsequence of a container. Further, some sequences, such as I/O streams, are not containers. However, algorithms expressed in
terms of sequences work just fine.
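Both points can be sketched with count(): the same algorithm is applied to words read from an input stream, which is not a container, and to a subsequence of a vector. The functions count_word(), count_inner(), and demo_sequences() are names assumed for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Count occurrences of word in whitespace-separated input text.
std::ptrdiff_t count_word(const std::string& text, const std::string& word)
{
    std::istringstream is(text);
    std::istream_iterator<std::string> first(is);    // start of the input sequence
    std::istream_iterator<std::string> last;         // default value marks end of input
    return std::count(first, last, word);
}

// Count occurrences of val among all but the first and last elements of v.
std::ptrdiff_t count_inner(const std::vector<int>& v, int val)
{
    if (v.size() < 2) return 0;    // no inner elements
    return std::count(v.begin()+1, v.end()-1, val);    // half-open subsequence
}

// Usage sketch.
void demo_sequences()
{
    assert(count_word("to be or not to be", "be") == 2);
}
```

Because the sequence is half open, a two-element vector gives count_inner() an empty sequence without any special handling.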
18.3.1 Input Sequences [algo.range]
Writing x.begin(),x.end() to express ‘‘all the elements of x’’ is common, tedious, and can even
be error-prone. For example, when several iterators are used, it is too easy to provide an algorithm
with a pair of arguments that does not constitute a sequence:
    void f(list<string>& fruit, list<string>& citrus)
    {
        typedef list<string>::const_iterator LI;

        LI p1 = find(fruit.begin(),citrus.end(),"apple");    // wrong! (different sequences)
        LI p2 = find(fruit.begin(),fruit.end(),"apple");    // ok
        LI p3 = find(citrus.begin(),citrus.end(),"pear");    // ok
        LI p4 = find(p2,p3,"peach");    // wrong! (different sequences)
        // ...
    }
In this example there are two errors. The first is obvious (once you suspect an error), but it isn’t
easily detected by a compiler. The second is hard to spot in real code even for an experienced programmer. Cutting down on the number of explicit iterators used alleviates this problem. Here, I
outline an approach to dealing with this problem by making the notion of an input sequence
explicit. However, to keep the discussion of standard algorithms strictly within the bounds of the
standard library, I do not use explicit input sequences when presenting algorithms in this chapter.
The key idea is to be explicit about taking a sequence as input. For example:
    template<class In, class T> In find(In first, In last, const T& v)    // standard
    {
        while (first!=last && *first!=v) ++first;
        return first;
    }

    template<class In, class T> In find(Iseq<In> r, const T& v)    // extension
    {
        return find(r.first,r.second,v);
    }
In general, overloading (§13.3.2) allows the input-sequence version of an algorithm to be preferred
when an Iseq argument is used.
Naturally, an input sequence is implemented as a pair (§17.4.1.2) of iterators:
    template<class In> struct Iseq : public pair<In,In> {
        Iseq(In i1, In i2) : pair<In,In>(i1,i2) { }
    };
We can explicitly make the Iseq needed to invoke the second version of find():
    LI p = find(Iseq<LI>(fruit.begin(),fruit.end()),"apple");
However, that is even more tedious than calling the original find() directly. Simple helper functions relieve the tedium. In particular, the Iseq of a container is the sequence of elements from its
begin() to its end():
    template<class C> Iseq<typename C::iterator> iseq(C& c)    // for container
    {
        return Iseq<typename C::iterator>(c.begin(),c.end());
    }
This allows us to express algorithms on containers compactly and without repetition. For example:
    void f(list<string>& ls)
    {
        list<string>::iterator p = find(ls.begin(),ls.end(),"standard");
        list<string>::iterator q = find(iseq(ls),"extension");
        // ...
    }
It is easy to define versions of iseq() that produce Iseqs for arrays, input streams, etc. (§18.13[6]).
The key benefit of Iseq is that it makes the notion of an input sequence explicit. The immediate
practical effect is that use of iseq() eliminates much of the tedious and error-prone repetition
needed to express every input sequence as a pair of iterators.
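Put together, the pieces of this extension might look as follows. This is a sketch only: the namespace seq_sketch, the array version of iseq(), and the two demo functions are assumptions made for the example, and the container version uses the container's iterator member type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>

namespace seq_sketch {    // hypothetical namespace; keeps the extension apart from std

    template<class In> struct Iseq : public std::pair<In,In> {
        Iseq(In i1, In i2) : std::pair<In,In>(i1,i2) { }
    };

    // Iseq for a container: from begin() to end()
    template<class C> Iseq<typename C::iterator> iseq(C& c)
    {
        return Iseq<typename C::iterator>(c.begin(),c.end());
    }

    // Iseq for a built-in array: from a to a+N
    template<class T, std::size_t N> Iseq<T*> iseq(T (&a)[N])
    {
        return Iseq<T*>(a,a+N);
    }

    // input-sequence version of find(); forwards to the standard algorithm
    template<class In, class T> In find(Iseq<In> r, const T& v)
    {
        return std::find(r.first,r.second,v);
    }
}

// Demonstration: position of the first 7 in a built-in array.
std::ptrdiff_t demo_array_find()
{
    int a[] = { 3, 5, 7, 9 };
    int* p = seq_sketch::find(seq_sketch::iseq(a),7);
    assert(p != a+4);    // 7 is present
    return p-a;
}

// Demonstration: is s an element of ls?
bool demo_list_find(std::list<std::string>& ls, const std::string& s)
{
    return seq_sketch::find(seq_sketch::iseq(ls),s) != ls.end();
}
```

Since each call names its sequence exactly once, the mixed-up pairs of iterators shown earlier in this section cannot be written by accident.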
The notion of an output sequence is also useful. However, it is less simple and less immediately useful than the notion of an input sequence (§18.13[7]; see also §19.2.4).
18.4 Function Objects [algo.fct]
Many algorithms operate on sequences using iterators and values only. For example, we can
find() the first element with the value 7 in a sequence like this:
    void f(list<int>& c)
    {
        list<int>::iterator p = find(c.begin(),c.end(),7);
        // ...
    }
To do more interesting things we want the algorithms to execute code that we supply (§3.8.4). For
example, we can find the first element in a sequence with a value of less than 7 like this:
    bool less_than_7(int v)
    {
        return v<7;
    }

    void f(list<int>& c)
    {
        list<int>::iterator p = find_if(c.begin(),c.end(),less_than_7);
        // ...
    }
There are many obvious uses for functions passed as arguments: logical predicates, arithmetic operations, operations for extracting information from elements, etc. It is neither convenient nor efficient to write a separate function for each use. Nor is a function logically sufficient to express all
that we would like to express. Often, the function called for each element needs to keep data
between invocations and to return the result of many applications. A member function of a class
serves such needs better than a free-standing function does because its object can hold data. In
addition, the class can provide operations for initializing and extracting such data.
Consider how to write a function – or rather a function-like class – to calculate a sum:
    template<class T> class Sum {
        T res;
    public:
        Sum(T i = 0) : res(i) { }    // initialize
        void operator()(T x) { res += x; }    // accumulate
        T result() const { return res; }    // return sum
    };
Clearly, Sum is designed for arithmetic types for which initialization by 0 and += are defined. For
example:
    void f(list<double>& ld)
    {
        Sum<double> s;
        s = for_each(ld.begin(),ld.end(),s);    // invoke s() for each element of ld
        cout << "the sum is " << s.result() << '\n';
    }
Here, for_each() (§18.5.1) invokes Sum<double>::operator()(double) for each element of ld
and returns the object passed as its third argument.
The key reason this works is that for_each() doesn’t actually assume its third argument to be a
function. It simply assumes that its third argument is something that can be called with an appropriate argument. A suitably-defined object serves as well as – and often better than – a function.
For example, it is easier to inline the application operator of a class than to inline a function passed
as a pointer to function. Consequently, function objects often execute faster than do ordinary functions. An object of a class with an application operator (§11.9) is called a function-like object, a
functor, or simply a function object.
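A self-contained version of the Sum example can be sketched like this. The wrapper functions total() and demo_total() are assumptions added for the example; the point to notice is that for_each() takes its function object by value, so the accumulated state lives in the object it returns.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

template<class T> class Sum {
    T res;
public:
    Sum(T i = 0) : res(i) { }             // initialize
    void operator()(T x) { res += x; }    // accumulate
    T result() const { return res; }      // return sum
};

// Sum of the elements of ld, computed by for_each() and a Sum object.
double total(const std::list<double>& ld)
{
    Sum<double> s;
    // for_each() works on a copy of s; the copy it returns holds the sum
    s = std::for_each(ld.begin(), ld.end(), s);
    return s.result();
}

// Usage sketch.
void demo_total()
{
    std::list<double> ld;
    ld.push_back(1.5);
    ld.push_back(2.5);
    assert(total(ld) == 4.0);
}
```

Omitting the assignment to s would leave s.result() at its initial value.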
18.4.1 Function Object Bases [algo.bases]
The standard library provides many useful function objects. To aid the writing of function objects,
the library provides a couple of base classes:
    template <class Arg, class Res> struct unary_function {
        typedef Arg argument_type;
        typedef Res result_type;
    };

    template <class Arg, class Arg2, class Res> struct binary_function {
        typedef Arg first_argument_type;
        typedef Arg2 second_argument_type;
        typedef Res result_type;
    };
The purpose of these classes is to provide standard names for the argument and return types for use
by users of classes derived from unary_function and binary_function. Using these bases consistently the way the standard library does will save the programmer from discovering the hard way
why they are useful (§18.4.4.1).
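To see what the standard names buy, consider a sketch of a generic adapter that must declare its own operator() without knowing the predicate it wraps. The base Unary_function below is a local stand-in with the same members as the library base (later C++ standards deprecated std::unary_function in C++11 and removed it in C++17, so the stand-in keeps the sketch compilable there); Is_negative, Negate_pred, and the two functions are likewise invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in mirroring the library base; it only supplies names.
template<class Arg, class Res> struct Unary_function {
    typedef Arg argument_type;
    typedef Res result_type;
};

struct Is_negative : public Unary_function<int,bool> {
    bool operator()(int x) const { return x<0; }
};

// Generic adapter: relies on Pred::argument_type to declare its own operator().
template<class Pred> class Negate_pred
    : public Unary_function<typename Pred::argument_type,bool> {
    Pred p;
public:
    explicit Negate_pred(const Pred& pp) : p(pp) { }
    bool operator()(const typename Pred::argument_type& x) const { return !p(x); }
};

// Index of the first element of v that is not negative; v.size() if there is none.
std::ptrdiff_t first_non_negative(const std::vector<int>& v)
{
    std::vector<int>::const_iterator p =
        std::find_if(v.begin(), v.end(), Negate_pred<Is_negative>(Is_negative()));
    return p - v.begin();
}

// Usage sketch.
void demo_bases()
{
    int a[] = { -3, -1, 4, -2 };
    assert(first_non_negative(std::vector<int>(a, a+4)) == 2);
}
```

A predicate that failed to provide argument_type could not be wrapped by Negate_pred; that is the kind of surprise the bases are meant to prevent.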
18.4.2 Predicates [algo.pred]
A predicate is a function object (or a function) that returns a bool. For example, <functional>
defines:
    template <class T> struct logical_not : public unary_function<T,bool> {
        bool operator()(const T& x) const { return !x; }
    };

    template <class T> struct less : public binary_function<T,T,bool> {
        bool operator()(const T& x, const T& y) const { return x<y; }
    };
Unary and binary predicates are often useful in combination with algorithms. For example, we can
compare two sequences, looking for the first element of one that is not less than its corresponding
element in the other:
    void f(vector<int>& vi, list<int>& li)
    {
        typedef list<int>::iterator LI;
        typedef vector<int>::iterator VI;

        pair<VI,LI> p1 = mismatch(vi.begin(),vi.end(),li.begin(),less<int>());
        // ...
    }
The mismatch() algorithm applies its binary predicate repeatedly to pairs of corresponding elements until it fails (§18.5.4). It then returns the iterators for the elements that failed the comparison. Because an object is needed rather than a type, less<int>() (with the parentheses) is used
rather than the tempting less<int>.
Instead of finding the first element not less than its corresponding element in the other
sequence, we might like to find the first element less than its corresponding element. We can do
this by presenting the sequences to mismatch() in the opposite order:
    pair<LI,VI> p2 = mismatch(li.begin(),li.end(),vi.begin(),less<int>());
or we can use the complementary predicate greater_equal:
    p1 = mismatch(vi.begin(),vi.end(),li.begin(),greater_equal<int>());
In §18.4.4.4, I show how to express the predicate ‘‘not less.’’
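The effect of the two predicates can be checked with a small sketch. It assumes, as mismatch() itself does, that the second sequence is at least as long as the first; first_not_less() and first_less() are names chosen for the example and return positions rather than iterators.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <utility>
#include <vector>

// Index of the first element of vi that is not less than its counterpart in li.
// Precondition: li has at least as many elements as vi.
std::ptrdiff_t first_not_less(const std::vector<int>& vi, const std::list<int>& li)
{
    assert(vi.size() <= li.size());
    typedef std::vector<int>::const_iterator VI;
    typedef std::list<int>::const_iterator LI;
    std::pair<VI,LI> p = std::mismatch(vi.begin(), vi.end(), li.begin(), std::less<int>());
    return p.first - vi.begin();    // vi.size() if every element was less
}

// Index of the first element of vi that is less than its counterpart in li.
// Precondition: li has at least as many elements as vi.
std::ptrdiff_t first_less(const std::vector<int>& vi, const std::list<int>& li)
{
    assert(vi.size() <= li.size());
    typedef std::vector<int>::const_iterator VI;
    typedef std::list<int>::const_iterator LI;
    std::pair<VI,LI> p = std::mismatch(vi.begin(), vi.end(), li.begin(),
                                       std::greater_equal<int>());
    return p.first - vi.begin();    // vi.size() if no element was less
}
```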
18.4.2.1 Overview of Predicates [algo.pred.std]
In <functional>, the standard library supplies a few common predicates:
_______________________________________
Predicates <functional>
_______________________________________
equal_to        Binary  arg1==arg2
not_equal_to    Binary  arg1!=arg2
greater         Binary  arg1>arg2
less            Binary  arg1<arg2
greater_equal   Binary  arg1>=arg2
less_equal      Binary  arg1<=arg2
logical_and     Binary  arg1&&arg2
logical_or      Binary  arg1||arg2
logical_not     Unary   !arg
_______________________________________
The definitions of less and logical_not are presented in §18.4.2.
In addition to the library-provided predicates, users can write their own. Such user-supplied
predicates are essential for simple and elegant use of the standard libraries and algorithms. The
ability to define predicates is particularly important when we want to use algorithms for classes
designed without thought of the standard library and its algorithms. For example, consider a variant of the Club class from §10.4.6:
    class Person { /* ... */ };
    struct Club {
        string name;
        list<Person*> members;
        list<Person*> officers;
        // ...
        Club(const string& n);
    };
Looking for a Club with a given name in a list<Club> is clearly a reasonable thing to do. However, the standard library algorithm find_if() doesn’t know about Clubs. The library algorithms
know how to test for equality, but we don’t want to find a Club based on its complete value.
Rather, we want to use Club::name as the key. So we write a predicate to reflect that:
    class Club_eq : public unary_function<Club,bool> {
        string s;
    public:
        explicit Club_eq(const string& ss) : s(ss) { }
        bool operator()(const Club& c) const { return c.name==s; }
    };
Defining useful predicates is simple. Once suitable predicates have been defined for user-defined
types, their use with the standard algorithms is as simple and efficient as examples involving containers of simple types. For example:
    void f(list<Club>& lc)
    {
        typedef list<Club>::iterator LCI;
        LCI p = find_if(lc.begin(),lc.end(),Club_eq("Dining Philosophers"));
        // ...
    }
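A self-contained version of this idea might look as follows. This is only a sketch: the Club here is a hypothetical, pared-down stand-in holding just a name, the predicate omits the unary_function base (which later standards removed), and has_club() and sample_clubs() are illustrative helpers.

```cpp
#include <algorithm>
#include <list>
#include <string>

// Hypothetical, pared-down Club: only the name matters for this sketch.
struct Club {
    std::string name;
    explicit Club(const std::string& n) : name(n) { }
};

// Predicate comparing a Club's name against a stored key.
class Club_eq {
    std::string s;
public:
    explicit Club_eq(const std::string& ss) : s(ss) { }
    bool operator()(const Club& c) const { return c.name == s; }
};

// Returns true if a Club named key is present in lc.
bool has_club(const std::list<Club>& lc, const std::string& key)
{
    return std::find_if(lc.begin(), lc.end(), Club_eq(key)) != lc.end();
}

// Hypothetical sample data.
std::list<Club> sample_clubs()
{
    std::list<Club> lc;
    lc.push_back(Club("Chess"));
    lc.push_back(Club("Dining Philosophers"));
    return lc;
}
```

The key is stored in the predicate object, so the same class serves for any name we might look for.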
18.4.3 Arithmetic Function Objects [algo.arithmetic]
When dealing with numeric classes, it is sometimes useful to have the standard arithmetic functions
available as function objects. Consequently, in <functional> the standard library provides:
__________________________________
Arithmetic Operations <functional>
__________________________________
plus        Binary  arg1+arg2
minus       Binary  arg1-arg2
multiplies  Binary  arg1*arg2
divides     Binary  arg1/arg2
modulus     Binary  arg1%arg2
negate      Unary   -arg
__________________________________
We might use multiplies to multiply elements in two vectors, thereby producing a third:
    void discount(vector<double>& a, vector<double>& b, vector<double>& res)
    {
        transform(a.begin(),a.end(),b.begin(),back_inserter(res),multiplies<double>());
    }
The back_inserter() is described in §19.2.4. A few numerical algorithms can be found in §22.6.
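As a small illustration of multiplies used with transform(), consider the sketch below. The names elementwise_product() and sample_products() and the sample values are hypothetical; b is assumed to hold at least as many elements as a.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <vector>

// Multiplies corresponding elements of a and b, appending each product to the result.
// Assumes b has at least as many elements as a.
std::vector<double> elementwise_product(const std::vector<double>& a,
                                        const std::vector<double>& b)
{
    std::vector<double> res;
    std::transform(a.begin(), a.end(), b.begin(),
                   std::back_inserter(res), std::multiplies<double>());
    return res;
}

// Hypothetical sample: prices multiplied by discount factors.
std::vector<double> sample_products()
{
    double prices[]  = { 10.0, 20.0, 4.0 };
    double factors[] = { 0.5, 0.25, 2.0 };
    return elementwise_product(std::vector<double>(prices, prices+3),
                               std::vector<double>(factors, factors+3));
}
```

Because back_inserter() appends to the result, the output vector need not be sized in advance.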
18.4.4 Binders, Adapters, and Negaters [algo.adapter]
We can use predicates and arithmetic function objects we have written ourselves and rely on the
ones provided by the standard library. However, when we need a new predicate we often find that
the new predicate is a minor variation of an existing one. The standard library supports the composition of function objects:
§18.4.4.1 A binder allows a two-argument function object to be used as a single-argument
function by binding one argument to a value.
§18.4.4.2 A member function adapter allows a member function to be used as an argument to
algorithms.
§18.4.4.3 A pointer to function adapter allows a pointer to function to be used as an argument
to algorithms.
§18.4.4.4 A negater allows us to express the opposite of a predicate.
Collectively, these function objects are referred to as adapters. These adapters all have a common
structure relying on the function object bases unary_function and binary_function (§18.4.1). For
each of these adapters, a helper function is provided to take a function object as an argument and
return a suitable function object. When invoked by its operator()(), that function object will
perform the desired action. That is, an adapter is a simple form of a higher-order function: it takes
a function argument and produces a new function from it:
________________________________________________________________________________
Binders, Adapters, and Negaters <functional>
________________________________________________________________________________
bind2nd(y)     binder2nd                   Call binary function with y as 2nd argument.
bind1st(x)     binder1st                   Call binary function with x as 1st argument.
mem_fun()      mem_fun_t                   Call 0-arg member through pointer.
               mem_fun1_t                  Call unary member through pointer.
               const_mem_fun_t             Call 0-arg const member through pointer.
               const_mem_fun1_t            Call unary const member through pointer.
mem_fun_ref()  mem_fun_ref_t               Call 0-arg member through reference.
               mem_fun1_ref_t              Call unary member through reference.
               const_mem_fun_ref_t         Call 0-arg const member through reference.
               const_mem_fun1_ref_t        Call unary const member through reference.
ptr_fun()      pointer_to_unary_function   Call unary pointer to function.
ptr_fun()      pointer_to_binary_function  Call binary pointer to function.
not1()         unary_negate                Negate unary predicate.
not2()         binary_negate               Negate binary predicate.
________________________________________________________________________________
18.4.4.1 Binders [algo.binder]
Binary predicates such as less (§18.4.2) are useful and flexible. However, we soon discover that
the most useful kind of predicate is one that compares a fixed argument repeatedly against a container element. The less_than_7() function (§18.4) is a typical example. The less operation
needs two arguments explicitly provided in each call, so it is not immediately useful. Instead, we
might define:
    template <class T> class less_than : public unary_function<T,bool> {
        T arg2;
    public:
        explicit less_than(const T& x) : arg2(x) { }
        bool operator()(const T& x) const { return x<arg2; }
    };
We can now write:
    void f(list<int>& c)
    {
        list<int>::const_iterator p = find_if(c.begin(),c.end(),less_than<int>(7));
        // ...
    }
We must write less_than<int>(7) rather than less_than(7) because the template argument <int>
cannot be deduced from the type of the constructor argument (7) (§13.3.1).
The less_than predicate is generally useful. Importantly, we defined it by fixing or binding the
second argument of less. Such composition by binding an argument is so common, useful, and
occasionally tedious that the standard library provides a standard class for doing it:
    template <class BinOp>
    class binder2nd
        : public unary_function<typename BinOp::first_argument_type, typename BinOp::result_type> {
    protected:
        BinOp op;
        typename BinOp::second_argument_type arg2;
    public:
        binder2nd(const BinOp& x, const typename BinOp::second_argument_type& v)
            : op(x), arg2(v) { }
        result_type operator()(const argument_type& x) const { return op(x,arg2); }
    };
    template <class BinOp, class T> binder2nd<BinOp> bind2nd(const BinOp& op, const T& v)
    {
        return binder2nd<BinOp>(op,v);
    }
For example, we can use bind2nd() to create the unary predicate ‘‘less than 7’’ from the binary
predicate ‘‘less’’ and the value 7:
    void f(list<int>& c)
    {
        list<int>::const_iterator p = find_if(c.begin(),c.end(),bind2nd(less<int>(),7));
        // ...
    }
Is this readable? Is this efficient? Given an average C++ implementation, this version is actually
more efficient in time and space than is the original version using the function less_than_7() from
§18.4! The comparison is easily inlined.
The notation is logical, but it does take some getting used to. Often, the definition of a named
operation with a bound argument is worthwhile after all:
    template <class T> struct less_than : public binder2nd< less<T> > {
        explicit less_than(const T& x) : binder2nd< less<T> >(less<T>(),x) { }
    };
    void f(list<int>& c)
    {
        list<int>::const_iterator p = find_if(c.begin(),c.end(),less_than<int>(7));
        // ...
    }
It is important to define less_than in terms of less rather than using < directly. That way,
less_than benefits from any specializations that less might have (§13.5, §19.2.2).
In parallel to bind2nd() and binder2nd, <functional> provides bind1st() and binder1st for
binding the first argument of a binary function.
By binding an argument, bind1st() and bind2nd() perform a service very similar to what is
commonly referred to as Currying.
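The mechanics of binding can be sketched without relying on the library's own binder. In the sketch below, Bound_second and bind_second() are hypothetical stand-ins for binder2nd and bind2nd(); they are simplified to predicates whose two arguments have the same type T, so they do not need the nested typedefs of the standard function object bases. The function first_below() and the sample data are likewise illustrative.

```cpp
#include <algorithm>
#include <functional>
#include <list>

// Hypothetical stand-in for binder2nd, restricted to predicates taking two T arguments.
template <class BinPred, class T>
class Bound_second {
    BinPred op;   // the binary predicate
    T arg2;       // the value bound as its second argument
public:
    Bound_second(const BinPred& p, const T& v) : op(p), arg2(v) { }
    bool operator()(const T& x) const { return op(x, arg2); }
};

// Helper that deduces the template arguments, in the manner of bind2nd().
template <class BinPred, class T>
Bound_second<BinPred,T> bind_second(const BinPred& p, const T& v)
{
    return Bound_second<BinPred,T>(p, v);
}

// Returns the first element less than limit, or limit itself if there is none.
int first_below(const std::list<int>& c, int limit)
{
    std::list<int>::const_iterator p =
        std::find_if(c.begin(), c.end(), bind_second(std::less<int>(), limit));
    return p != c.end() ? *p : limit;
}

// Hypothetical sample data.
std::list<int> sample_list() { int a[] = { 9, 8, 3, 7, 1 }; return std::list<int>(a, a+5); }
```

The helper function exists only so that the template arguments can be deduced; the class does the work.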
18.4.4.2 Member Function Adapters [algo.memfct]
Most algorithms invoke a standard or user-defined operation. Naturally, users often want to invoke
a member function. For example (§3.8.5):
    void draw_all(list<Shape*>& lsp)
    {
        for_each(lsp.begin(),lsp.end(),&Shape::draw); // oops! error
    }
The problem is that a member function mf() needs to be invoked for an object: p->mf(). However, algorithms such as for_each() invoke their function operands by simple application: f().
Consequently, we need a convenient and efficient way of creating something that allows an algorithm to invoke a member function. The alternative would be to duplicate the set of algorithms:
one version for member functions plus one for ordinary functions. Worse, we’d need additional
versions of algorithms for containers of objects (rather than pointers to objects). As for the binders
(§18.4.4.1), this problem is solved by a class plus a function. First, consider the common case in
which we want to call a member function taking no arguments for the elements of a container of
pointers:
    template<class R, class T> class mem_fun_t : public unary_function<T*,R> {
        R (T::*pmf)();
    public:
        explicit mem_fun_t(R (T::*p)()) :pmf(p) {}
        R operator()(T* p) const { return (p->*pmf)(); } // call through pointer
    };
    template<class R, class T> mem_fun_t<R,T> mem_fun(R (T::*f)())
    {
        return mem_fun_t<R,T>(f);
    }
This handles the Shape::draw() example:
    void draw_all(list<Shape*>& lsp) // call 0-argument member through pointer to object
    {
        for_each(lsp.begin(),lsp.end(),mem_fun(&Shape::draw)); // draw all shapes
    }
In addition, we need a class and a mem_fun() function for handling a member function taking an
argument. We also need versions to be called directly for an object rather than through a pointer;
these are named mem_fun_ref(). Finally, we need versions for const member functions:
    template<class R, class T> mem_fun_t<R,T> mem_fun(R (T::*f)());
    // and versions for unary member, for const member, and const unary member (see table in §18.4.4)
    template<class R, class T> mem_fun_ref_t<R,T> mem_fun_ref(R (T::*f)());
    // and versions for unary member, for const member, and const unary member (see table in §18.4.4)
Given these member function adapters from <functional>, we can write:
    void f(list<string>& ls) // use member function that takes no argument for object
    {
        typedef list<string>::iterator LSI;
        LSI p = find_if(ls.begin(),ls.end(),mem_fun_ref(&string::empty)); // find ""
    }
    void rotate_all(list<Shape*>& ls, int angle)
        // use member function that takes one argument through pointer to object
    {
        for_each(ls.begin(),ls.end(),bind2nd(mem_fun(&Shape::rotate),angle));
    }
The standard library need not deal with member functions taking more than one argument because
no standard library algorithm takes a function with more than two arguments as operands.
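The way a member function adapter turns p->mf() into f(p) can be tried out in isolation. The sketch below uses hypothetical names throughout: Call_member and call_member() mirror the structure of mem_fun_t and mem_fun() for the 0-argument, non-const case only, and Shape is a toy class that merely counts calls of draw().

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for mem_fun_t: calls a 0-argument member through a pointer.
template <class R, class T>
class Call_member {
    R (T::*pmf)();
public:
    explicit Call_member(R (T::*p)()) : pmf(p) { }
    R operator()(T* p) const { return (p->*pmf)(); }
};

// Helper that deduces R and T from the pointer to member, in the manner of mem_fun().
template <class R, class T>
Call_member<R,T> call_member(R (T::*f)())
{
    return Call_member<R,T>(f);
}

// Toy Shape that only records how often draw() was called.
struct Shape {
    int draws;
    Shape() : draws(0) { }
    void draw() { ++draws; }
};

// Draws every shape once and returns the total number of draw() calls recorded.
int draw_all_and_count(std::vector<Shape*>& v)
{
    std::for_each(v.begin(), v.end(), call_member(&Shape::draw));
    int total = 0;
    for (std::vector<Shape*>::size_type i = 0; i < v.size(); ++i) total += v[i]->draws;
    return total;
}
```

The algorithm sees only an object that can be applied to a Shape*; the pointer-to-member syntax is confined to the adapter.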
18.4.4.3 Pointer to Function Adapters [algo.ptof]
An algorithm doesn’t care whether a ‘‘function argument’’ is a function, a pointer to function, or a
function object. However, a binder (§18.4.4.1) does care because it needs to store a copy for later
use. Consequently, the standard library supplies two adapters to allow pointers to functions to be
used together with the standard algorithms in <functional>. The definition and implementation
closely follows that of the member function adapters (§18.4.4.2). Again, a pair of functions and a
pair of classes are used:
    template <class A, class R> pointer_to_unary_function<A,R> ptr_fun(R (*f)(A));
    template <class A, class A2, class R>
        pointer_to_binary_function<A,A2,R> ptr_fun(R (*f)(A, A2));
Given these pointer to function adapters, we can use ordinary functions together with binders:
    class Record { /* ... */ };
    bool name_key_eq(const Record&, const Record&); // compare based on names
    bool ssn_key_eq(const Record&, const Record&); // compare based on number
    void f(list<Record>& lr) // use pointer to function
    {
        typedef typename list<Record>::iterator LI;
        LI p = find_if(lr.begin(),lr.end(),bind2nd(ptr_fun(name_key_eq),"John Brown"));
        LI q = find_if(lr.begin(),lr.end(),bind2nd(ptr_fun(ssn_key_eq),1234567890));
        // ...
    }
This looks for elements of the list lr that match the keys John Brown and 1234567890.
18.4.4.4 Negaters [algo.negate]
The predicate negaters are related to the binders in that they take an operation and produce a related
operation from it. The definition and implementation of negaters follow the pattern of the member
function adapters (§18.4.4.2). Their definitions are trivial, but their simplicity is obscured by the
use of long standard names:
    template <class Pred>
    class unary_negate : public unary_function<typename Pred::argument_type,bool> {
        Pred op;
    public:
        explicit unary_negate(const Pred& p) : op(p) { }
        bool operator()(const argument_type& x) const { return !op(x); }
    };
    template <class Pred>
    class binary_negate : public binary_function<typename Pred::first_argument_type,
                                                 typename Pred::second_argument_type, bool> {
        typedef first_argument_type Arg;
        typedef second_argument_type Arg2;
        Pred op;
    public:
        explicit binary_negate(const Pred& p) : op(p) { }
        bool operator()(const Arg& x, const Arg2& y) const { return !op(x,y); }
    };
    template<class Pred> unary_negate<Pred> not1(const Pred& p); // negate unary
    template<class Pred> binary_negate<Pred> not2(const Pred& p); // negate binary
These classes and functions are declared in <functional>. The names first_argument_type,
second_argument_type, etc., come from the standard base classes unary_function and
binary_function.
Like the binders, the negaters are most conveniently used indirectly through their helper functions. For example, we can express the binary predicate ‘‘not less than’’ and use it to find the first
corresponding pair of elements whose first element is greater than or equal to its second:
    void f(vector<int>& vi, list<int>& li) // revised example from §18.4.2
    {
        // ...
        p1 = mismatch(vi.begin(),vi.end(),li.begin(),not2(less<int>()));
        // ...
    }
That is, p1 identifies the first pair of elements for which the predicate not less than failed.
Predicates deal with Boolean conditions, so there are no equivalents to the bitwise operators |,
&, ^, and ~.
Naturally, binders, adapters, and negaters are useful in combination. For example:
    extern "C" int strcmp(const char*,const char*); // from <cstring>
    void f(list<char*>& ls) // use pointer to function
    {
        typedef typename list<char*>::const_iterator LI;
        LI p = find_if(ls.begin(),ls.end(),not1(bind2nd(ptr_fun(strcmp),"funny")));
    }
This finds an element of the list ls that contains the C-style string "funny". The negater is needed
because strcmp() returns 0 when strings compare equal.
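The role of the negater in this composition can be illustrated with hand-written pieces. The sketch below is an assumption-laden simplification, not the library's implementation: Differs_from plays the part of the bound strcmp(), Negated is a hypothetical stand-in for unary_negate that takes the argument type as an explicit template parameter, and index_of() and sample_strings() are illustrative helpers.

```cpp
#include <algorithm>
#include <cstring>
#include <iterator>
#include <list>

// Hypothetical unary predicate: true when strcmp() reports a difference from key.
class Differs_from {
    const char* key;
public:
    explicit Differs_from(const char* k) : key(k) { }
    bool operator()(const char* s) const { return std::strcmp(s, key) != 0; }
};

// Hypothetical stand-in for unary_negate, with the argument type spelled out.
template <class Pred, class Arg>
class Negated {
    Pred op;
public:
    explicit Negated(const Pred& p) : op(p) { }
    bool operator()(const Arg& x) const { return !op(x); }
};

// Position of the first string equal to key, or -1 if there is none.
int index_of(const std::list<const char*>& ls, const char* key)
{
    Differs_from differs(key);
    Negated<Differs_from, const char*> same(differs);
    std::list<const char*>::const_iterator p = std::find_if(ls.begin(), ls.end(), same);
    return p == ls.end() ? -1 : static_cast<int>(std::distance(ls.begin(), p));
}

// Hypothetical sample data.
std::list<const char*> sample_strings()
{
    const char* a[] = { "dull", "funny", "odd" };
    return std::list<const char*>(a, a+3);
}
```

Negating ‘‘differs from the key’’ yields ‘‘equals the key,’’ which is the predicate find_if() needs here.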
18.5 Nonmodifying Sequence Algorithms [algo.nonmodifying]
Nonmodifying sequence algorithms are the basic means for finding something in a sequence without writing a loop. In addition, they allow us to find out things about elements. These algorithms
can take const-iterators (§19.2.1) and – with the exception of for_each() – should not be used to
invoke operations that modify the elements of the sequence.
18.5.1 For_each [algo.foreach]
We use a library to benefit from the work of others. Using a library function, class, algorithm, etc.,
saves the work of inventing, designing, writing, debugging, and documenting something. Using
the standard library also makes the resulting code easier to read for others who are familiar with
that library, but who would have to spend time and effort understanding home-brewed code.
A key benefit of the standard library algorithms is that they save the programmer from writing
explicit loops. Loops can be tedious and error-prone. The for_each() algorithm is the simplest
algorithm in the sense that it does nothing but eliminate an explicit loop. It simply calls its operator
argument for a sequence:
    template<class In, class Op> Op for_each(In first, In last, Op f)
    {
        while (first != last) f(*first++);
        return f;
    }
What functions would people want to call this way? If you want to accumulate information from
the elements, consider accumulate() (§22.6). If you want to find something in a sequence, consider find() and find_if() (§18.5.2). If you change or remove elements, consider replace()
(§18.6.4) or remove() (§18.6.5). In general, before using for_each(), consider if there is a more
specialized algorithm that would do more for you.
The result of for_each() is the function or function object passed as its third argument. As
shown in the Sum example (§18.4), this allows information to be passed back to a caller.
One common use of for_each() is to extract information from elements of a sequence. For
example, consider collecting the names of any of a number of Clubs:
    void extract(const list<Club>& lc, list<Person*>& off) // place the officers from ‘lc’ on ‘off’
    {
        for_each(lc.begin(),lc.end(),Extract_officers(off));
    }
In parallel to the examples from §18.4 and §18.4.2, we define a function class that extracts the
desired information. In this case, the names to be extracted are found in list<Person*>s in our
list<Club>. Consequently, Extract_officers needs to copy the officers from a Club’s officers list
to our list:
    class Extract_officers {
        list<Person*>& lst;
    public:
        explicit Extract_officers(list<Person*>& x) : lst(x) { }
        void operator()(const Club& c)
            { copy(c.officers.begin(),c.officers.end(),back_inserter(lst)); }
    };
We can now print out the names, again using for_each():
    void extract_and_print(const list<Club>& lc)
    {
        list<Person*> off;
        extract(lc,off);
        for_each(off.begin(),off.end(),Print_name(cout));
    }
Writing Print_name is left as an exercise (§18.13[4]).
The for_each() algorithm is classified as nonmodifying because it doesn’t explicitly modify a
sequence. However, if applied to a non-const sequence for_each()’s operation (its third argument) may change the elements of the sequence. For an example, see delete_ptr() in §18.6.2.
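The fact that for_each() returns its function object is easy to overlook, so a small sketch may help. Running_total is a hypothetical accumulator in the spirit of the Sum example; sum_with_for_each() and sample_values() are illustrative names. For real summation, accumulate() would be the more specialized choice.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical accumulator: gathers a total as for_each() applies it to each element.
class Running_total {
    int total;
public:
    Running_total() : total(0) { }
    void operator()(int x) { total += x; }
    int result() const { return total; }
};

// for_each() works on a copy of the function object and returns that copy,
// so the accumulated state must be read from the return value.
int sum_with_for_each(const std::vector<int>& v)
{
    Running_total r = std::for_each(v.begin(), v.end(), Running_total());
    return r.result();
}

// Hypothetical sample data.
std::vector<int> sample_values() { int a[] = { 1, 2, 3, 4 }; return std::vector<int>(a, a+4); }
```

Had we passed a named Running_total and inspected it afterwards, we would have seen zero, because the algorithm takes its operation by value.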
18.5.2 The Find Family [algo.find]
The find() algorithms look through a sequence or a pair of sequences to find a value or a match on
a predicate. The simple versions of find() look for a value or for a match with a predicate:
    template<class In, class T> In find(In first, In last, const T& val);
    template<class In, class Pred> In find_if(In first, In last, Pred p);
The algorithms find() and find_if() return an iterator to the first element that matches a value and
a predicate, respectively. In fact, find() can be understood as the version of find_if() with the
predicate ==. Why aren’t they both called find()? The reason is that function overloading cannot
always distinguish calls of two template functions with the same number of arguments. Consider:
    bool pred(int);
    void f(vector<bool(*)(int)>& v1, vector<int>& v2)
    {
        find(v1.begin(),v1.end(),pred);     // find ‘pred’
        find_if(v2.begin(),v2.end(),pred);  // find int for which pred() returns true
    }
If find() and find_if() had had the same name, surprising ambiguities would have resulted. In
general, the _if suffix is used to indicate that an algorithm takes a predicate.
The find_first_of() algorithm finds the first element of a sequence that has a match in a second
sequence:
    template<class For, class For2>
        For find_first_of(For first, For last, For2 first2, For2 last2);
    template<class For, class For2, class BinPred>
        For find_first_of(For first, For last, For2 first2, For2 last2, BinPred p);
For example:
    int x[] = { 1,3,4 };
    int y[] = { 0,2,3,4,5 };
    void f()
    {
        int* p = find_first_of(x,x+3,y,y+5);    // p = &x[1]
        int* q = find_first_of(p+1,x+3,y,y+5);  // q = &x[2]
    }
The pointer p will point to x[1] because 3 is the first element of x with a match in y. Similarly, q
will point to x[2].
The adjacent_find() algorithm finds a pair of adjacent matching values:
    template<class For> For adjacent_find(For first, For last);
    template<class For, class BinPred> For adjacent_find(For first, For last, BinPred p);
The return value is an iterator to the first matching element. For example:
    void f(vector<string>& text)
    {
        vector<string>::iterator p = adjacent_find(text.begin(),text.end());
        if (p != text.end() && *p == "the") {
            // I duplicated "the" again!
        }
    }
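A runnable variant of this check might look like the sketch below. The names first_repeat() and sample_text() and the sample words are hypothetical; the function reports the position of the first of two equal neighbors rather than acting on it.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns the index of the first word that is immediately repeated, or -1 if none is.
int first_repeat(const std::vector<std::string>& text)
{
    std::vector<std::string>::const_iterator p =
        std::adjacent_find(text.begin(), text.end());
    return p == text.end() ? -1 : static_cast<int>(p - text.begin());
}

// Hypothetical sample text containing a duplicated word.
std::vector<std::string> sample_text()
{
    const char* w[] = { "I", "saw", "the", "the", "cat" };
    return std::vector<std::string>(w, w+5);
}
```

The version of adjacent_find() without a predicate compares neighbors using ==; checking which word was duplicated is left to the caller.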
18.5.3 Count [algo.count]
The count() and count_if() algorithms count occurrences of a value in a sequence:
    template<class In, class T>
        typename iterator_traits<In>::difference_type count(In first, In last, const T& val);
    template<class In, class Pred>
        typename iterator_traits<In>::difference_type count_if(In first, In last, Pred p);
The return type of count() is interesting. Consider an obvious and somewhat simple-minded version of count():
    template<class In, class T> int count(In first, In last, const T& val)
    {
        int res = 0;
        while (first != last) if (*first++ == val) ++res;
        return res;
    }
The problem is that an int might not be the right type for the result. On a machine with small ints,
there might be too many elements in the sequence for count() to fit in an int. Conversely, a high-performance implementation on a specialized machine might prefer to keep the count in a short.
Clearly, the number of elements in the sequence cannot be larger than the maximum difference
between its iterators (§19.2.1). Consequently, the first idea for a solution to this problem is to
define the return type as
    typename In::difference_type
However, a standard algorithm should be applicable to built-in arrays as well as to standard containers. For example:
    void f(const char* p, int size)
    {
        int n = count(p,p+size,'e'); // count the number of occurrences of the letter 'e'
    }
Unfortunately, int*::difference_type is not valid C++. This problem is solved by partial specialization of an iterator_traits (§19.2.2).
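The effect of the iterator_traits specialization for pointers can be seen directly. In the sketch below, the typedef Char_count and the function count_letter() are hypothetical names; the sequence is taken to be a zero-terminated C-style string.

```cpp
#include <algorithm>
#include <cstring>
#include <iterator>

// For a plain pointer, difference_type is supplied by a specialization of
// iterator_traits; a pointer type has no members we could ask directly.
typedef std::iterator_traits<const char*>::difference_type Char_count;

// Counts the occurrences of ch in the zero-terminated string p.
Char_count count_letter(const char* p, char ch)
{
    return std::count(p, p + std::strlen(p), ch);
}
```

The same count() call works unchanged for container iterators, for which iterator_traits takes difference_type from the iterator itself.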
18.5.4 Equal and Mismatch [algo.equal]
The equal() and mismatch() algorithms compare two sequences:
    template<class In, class In2> bool equal(In first, In last, In2 first2);
    template<class In, class In2, class BinPred>
        bool equal(In first, In last, In2 first2, BinPred p);
    template<class In, class In2> pair<In, In2> mismatch(In first, In last, In2 first2);
    template<class In, class In2, class BinPred>
        pair<In, In2> mismatch(In first, In last, In2 first2, BinPred p);
The equal() algorithm simply tells whether all corresponding pairs of elements of two sequences
compare equal; mismatch() looks for the first pair of elements that compares unequal and returns
iterators to those elements. No end is specified for the second sequence; that is, there is no last2.
Instead, it is assumed that there are at least as many elements in the second sequence as in the first
and first2+(last-first) is used as last2. This technique is used throughout the standard library,
where pairs of sequences are used for operations on pairs of elements.
As shown in §18.5.1, these algorithms are even more useful than they appear at first glance
because the user can supply predicates defining what it means to be equal and to match.
Note that the sequences need not be of the same type. For example:
    void f(list<int>& li, vector<double>& vd)
    {
        bool b = equal(li.begin(),li.end(),vd.begin());
    }
All that is required is that the elements be acceptable as operands of the predicate.
The two versions of mismatch() differ only in their use of predicates. In fact, we could implement them as one function with a default template argument:
    template<class In, class In2, class BinPred>
    pair<In, In2> mismatch(In first, In last, In2 first2,
        BinPred p = equal_to<In::value_type,In2::value_type>()) // §18.4.2.1
    {
        while (first != last && p(*first,*first2)) {
            ++first;
            ++first2;
        }
        return pair<In,In2>(first,first2);
    }
The difference between having two functions and having one with a default argument can be
observed by someone taking pointers to functions. However, thinking of many of the variants of
the standard algorithms as simply ‘‘the version with the default predicate’’ roughly halves the number of template functions that need to be remembered.
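As a minimal sketch of equal() and mismatch() applied to sequences of different types, consider the following. The helper names first_difference and same_prefix are hypothetical, and both helpers assume that the second sequence is at least as long as the first, because neither algorithm is given a last2:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>
#include <utility>
#include <vector>

// Hypothetical helper: index of the first position at which the two
// sequences differ, or li.size() if no difference is found.
// Assumes vd has at least as many elements as li.
std::size_t first_difference(const std::list<int>& li, const std::vector<double>& vd)
{
    std::pair<std::list<int>::const_iterator, std::vector<double>::const_iterator> p =
        std::mismatch(li.begin(), li.end(), vd.begin());
    return static_cast<std::size_t>(std::distance(li.begin(), p.first));
}

// Hypothetical helper: true if every element of li compares equal to the
// corresponding element of vd (same length assumption as above).
bool same_prefix(const std::list<int>& li, const std::vector<double>& vd)
{
    return std::equal(li.begin(), li.end(), vd.begin());
}
```

The elements are compared with ==, so an int and a double are acceptable operands; a predicate version would be called the same way with one extra argument.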
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
18.5.5 Search [algo.search]
The search(), search_n(), and find_end() algorithms find one sequence as a subsequence in
another:
template<class For, class For2>
    For search(For first, For last, For2 first2, For2 last2);
template<class For, class For2, class BinPred>
    For search(For first, For last, For2 first2, For2 last2, BinPred p);
template<class For, class For2>
    For find_end(For first, For last, For2 first2, For2 last2);
template<class For, class For2, class BinPred>
    For find_end(For first, For last, For2 first2, For2 last2, BinPred p);
template<class For, class Size, class T>
    For search_n(For first, For last, Size n, const T& val);
template<class For, class Size, class T, class BinPred>
    For search_n(For first, For last, Size n, const T& val, BinPred p);
The search() algorithm looks for its second sequence as a subsequence of its first. If that second
sequence is found, an iterator for the first matching element in the first sequence is returned. The
end of sequence (last) is returned to represent ‘‘not found.’’ Thus, the return value is always in the
[first,last] sequence. For example:
string quote("Why waste time learning, when ignorance is instantaneous?");

bool in_quote(const string& s)
{
    string::iterator p = search(quote.begin(),quote.end(),s.begin(),s.end());   // find s in quote
    return p!=quote.end();
}

void g()
{
    bool b1 = in_quote("learning");   // b1 = true
    bool b2 = in_quote("lemming");    // b2 = false
}
Thus, search() is an operation for finding a substring generalized to all sequences. This implies
that search() is a very useful algorithm.
The find_end() algorithm looks for its second input sequence as a subsequence of its first
input sequence. If that second sequence is found, find_end() returns an iterator pointing to the
last match in its first input. In other words, find_end() is search() ‘‘backwards.’’ It finds the
last occurrence of its second input sequence in its first input sequence, rather than the first occurrence of its second sequence.
The search_n() algorithm finds a sequence of at least n matches for its val argument in the
sequence. It returns an iterator to the first element of the sequence of n matches.
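To compare the three algorithms side by side, here is a small sketch on strings. The helper names are hypothetical; each converts the returned iterator into an offset, so ‘‘not found’’ shows up as the length of the text:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helpers; each returns an offset into text, or text.size()
// when the algorithm reports "not found" by returning last.

std::size_t first_occurrence(const std::string& text, const std::string& pat)
{
    return static_cast<std::size_t>(
        std::search(text.begin(), text.end(), pat.begin(), pat.end()) - text.begin());
}

std::size_t last_occurrence(const std::string& text, const std::string& pat)
{
    return static_cast<std::size_t>(
        std::find_end(text.begin(), text.end(), pat.begin(), pat.end()) - text.begin());
}

// Offset of the first run of n consecutive characters equal to c.
std::size_t first_run(const std::string& text, int n, char c)
{
    return static_cast<std::size_t>(
        std::search_n(text.begin(), text.end(), n, c) - text.begin());
}
```

For the text "abcabcxxabc", the pattern "abc" is first found at offset 0 and last found at offset 8, and the first run of two x characters starts at offset 6.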
18.6 Modifying Sequence Algorithms [algo.modifying]
If you want to change a sequence, you can explicitly iterate through it. You can then modify values. Wherever possible, however, we prefer to avoid this kind of programming in favor of simpler
and more systematic styles of programming. The alternative is algorithms that traverse sequences
performing specific tasks. The nonmodifying algorithms (§18.5) serve this need when we just read
from the sequence. The modifying sequence algorithms are provided to do the most common
forms of updates. Some update a sequence, while others produce a new sequence based on information found during a traversal.
Standard algorithms work on data structures through iterators. This implies that inserting a new
element into a container or deleting one is not easy. For example, given only an iterator, how can
we find the container from which to remove the element pointed to? Unless special iterators are
used (e.g., inserters, §3.8, §19.2.4), operations through iterators do not change the size of a container. Instead of inserting and deleting elements, the algorithms change the values of elements,
swap elements, and copy elements. Even remove() operates by overwriting the elements to be
removed (§18.6.5). In general, the fundamental modifying operations produce outputs that are
modified copies of their inputs. The algorithms that appear to modify a sequence are variants that
copy within a sequence.
18.6.1 Copy [algo.copy]
Copying is the simplest way to produce one sequence from another. The definitions of the basic
copy operations are trivial:
template<class In, class Out> Out copy(In first, In last, Out res)
{
    while (first != last) *res++ = *first++;
    return res;
}

template<class Bi, class Bi2> Bi2 copy_backward(Bi first, Bi last, Bi2 res)
{
    while (first != last) *--res = *--last;
    return res;
}
The target of a copy algorithm need not be a container. Anything that can be described by an output iterator (§19.2.6) will do. For example:
void f(list<Club>& lc, ostream& os)
{
    copy(lc.begin(),lc.end(),ostream_iterator<Club>(os));
}
To read a sequence, we need a sequence describing where to begin and where to end. To write, we
need only an iterator describing where to write to. However, we must take care not to write beyond
the end of the target. One way to ensure that we don’t do this is to use an inserter (§19.2.4) to grow
the target as needed. For example:
void f(vector<char>& vs)
{
    vector<char> v;

    copy(vs.begin(),vs.end(),v.begin());          // might overwrite end of v
    copy(vs.begin(),vs.end(),back_inserter(v));   // add elements from vs to end of v
}
The input sequence and the output sequence may overlap. We use copy() when the sequences do
not overlap or if the end of the output sequence is in the input sequence. We use
copy_backward() when the beginning of the output sequence is in the input sequence. In that
way, no element is overwritten until after it has been copied. See also §18.13[13].
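The overlap rule can be illustrated by shifting the characters of a string by one position within the string itself. This is only a sketch; shift_right and shift_left are hypothetical names:

```cpp
#include <algorithm>
#include <string>

// Shift right by one within s: "abcde" becomes "aabcd".
// The beginning of the output range lies inside the input range,
// so copy_backward() is the safe choice.
std::string shift_right(std::string s)
{
    if (s.size() > 1)
        std::copy_backward(s.begin(), s.end() - 1, s.end());
    return s;
}

// Shift left by one within s: "abcde" becomes "bcdee".
// The end of the output range lies inside the input range,
// so copy() is the safe choice.
std::string shift_left(std::string s)
{
    if (s.size() > 1)
        std::copy(s.begin() + 1, s.end(), s.begin());
    return s;
}
```

In both cases the element left over at the far end keeps its old value, because copying never changes the size of the string.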
Naturally, to copy something backwards we need a bidirectional iterator (§19.2.1) for both the
input and the output sequences. For example:
void f(vector<char>& vc)
{
    vector<char> v(vc.size());

    copy_backward(vc.begin(),vc.end(),ostream_iterator<char>(cout));   // error: not a bidirectional iterator
    copy_backward(vc.begin(),vc.end(),v.end());                        // ok
    copy(v.begin(),v.end(),ostream_iterator<char>(cout));
}
Often, we want to copy only elements that fulfill some criterion. Unfortunately, copy_if() was
somehow dropped from the set of algorithms provided by the standard library (mea culpa). On the
other hand, it is trivial to define:
template<class In, class Out, class Pred> Out copy_if(In first, In last, Out res, Pred p)
{
    while (first != last) {
        if (p(*first)) *res++ = *first;
        ++first;
    }
    return res;
}
Now if we want to print elements with a value larger than n, we can do it like this:
void f(list<int>& ld, int n, ostream& os)
{
    copy_if(ld.begin(),ld.end(),ostream_iterator<int>(os),bind2nd(greater<int>(),n));
}
See also remove_copy_if() (§18.6.5).
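The following sketch repeats the copy_if() loop under the hypothetical name copy_matching, so that it cannot clash with the copy_if() that later revisions of the standard (C++11 onwards) do provide. Larger_than and larger_than are likewise hypothetical names chosen for the illustration:

```cpp
#include <iterator>
#include <vector>

// Same loop as copy_if() above, under a hypothetical name.
template<class In, class Out, class Pred>
Out copy_matching(In first, In last, Out res, Pred p)
{
    while (first != last) {
        if (p(*first)) *res++ = *first;
        ++first;
    }
    return res;
}

// Hypothetical predicate object: true for values larger than n.
class Larger_than {
    int n;
public:
    explicit Larger_than(int nn) : n(nn) { }
    bool operator()(int x) const { return x > n; }
};

// Collect the elements of v that are larger than n, in their original order.
std::vector<int> larger_than(const std::vector<int>& v, int n)
{
    std::vector<int> res;
    copy_matching(v.begin(), v.end(), std::back_inserter(res), Larger_than(n));
    return res;
}
```

Writing through back_inserter() lets the result grow as needed, so the caller does not have to know in advance how many elements satisfy the predicate.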
18.6.2 Transform [algo.transform]
Somewhat confusingly, transform() doesn’t necessarily change its input. Instead, it produces an
output that is a transformation of its input based on a user-supplied operation:
template<class In, class Out, class Op>
Out transform(In first, In last, Out res, Op op)
{
    while (first != last) *res++ = op(*first++);
    return res;
}

template<class In, class In2, class Out, class BinOp>
Out transform(In first, In last, In2 first2, Out res, BinOp op)
{
    while (first != last) *res++ = op(*first++,*first2++);
    return res;
}
The transform() that reads a single sequence to produce its output is rather similar to copy().
Instead of writing its element, it writes the result of its operation on that element. Thus, we could
have defined copy() as transform() with an operation that returns its argument:
template<class T> T identity(const T& x) { return x; }

template<class In, class Out> Out copy(In first, In last, Out res)
{
    return transform(first,last,res,identity);
}
Another way to view transform() is as a variant of for_each that explicitly produces output. For
example, we can produce a list of name strings from a list of Clubs using transform():
string nameof(const Club& c)   // extract name string
{
    return c.name;
}

void f(list<Club>& lc)
{
    transform(lc.begin(),lc.end(),ostream_iterator<string>(cout),nameof);
}
One reason transform() is called ‘‘transform’’ is that the result of the operation is often written
back to where the argument came from. As an example, consider deleting the objects pointed to by
a set of pointers:
template<class T> T* delete_ptr(T* p) { delete p; return 0; }

void purge(deque<Shape*>& s)
{
    transform(s.begin(),s.end(),s.begin(),delete_ptr);
    // ...
}
The transform() algorithm always produces an output sequence. Here, I directed the result back
to the input sequence so that delete_ptr(p) has the effect p=delete_ptr(p). This was why I chose
to return 0 from delete_ptr().
The transform() algorithm that takes two sequences allows people to combine information
from two sources. For example, an animation may have a routine that updates the position of a list
of shapes by applying a translation:
Shape* move_shape(Shape* s, Point p)   // *s += p
{
    s->move_to(s->center()+p);
    return s;
}

void update_positions(list<Shape*>& ls, vector<Point>& oper)
{
    // invoke operation on corresponding object:
    transform(ls.begin(),ls.end(),oper.begin(),ls.begin(),move_shape);
}
I didn’t really want to produce a return value from move_shape(). However, transform() insists
on assigning the result of its operation, so I let move_shape() return its first operand so that I
could write it back to where it came from.
Sometimes, we do not have the freedom to do that. For example, an operation that I didn’t write
and don’t want to modify might not return a value. Sometimes, the input sequence is const. In
such cases, we might define a two-sequence for_each() to match the two-sequence transform():
template<class In, class In2, class BinOp>
BinOp for_each(In first, In last, In2 first2, BinOp op)
{
    while (first != last) op(*first++,*first2++);
    return op;
}

void update_positions(list<Shape*>& ls, vector<Point>& oper)
{
    for_each(ls.begin(),ls.end(),oper.begin(),move_shape);
}
At other times, it can be useful to have an output iterator that doesn’t actually write anything
(§19.6[2]).
There are no standard library algorithms that read three or more sequences. Such algorithms are
easily written, though. Alternatively, you can use transform() repeatedly.
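Both forms of transform() can be tried on plain integers. In this sketch, twice, doubled, and add_to are hypothetical names; the two-sequence version assumes that the second vector is at least as long as the first:

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical operation for the single-sequence transform().
int twice(int x) { return 2 * x; }

// Single-sequence form, writing to a new sequence through an inserter.
std::vector<int> doubled(const std::vector<int>& v)
{
    std::vector<int> res;
    std::transform(v.begin(), v.end(), std::back_inserter(res), twice);
    return res;
}

// Two-sequence form, writing back into the first input: a[i] = a[i] + b[i].
// Assumes b has at least as many elements as a.
void add_to(std::vector<int>& a, const std::vector<int>& b)
{
    std::transform(a.begin(), a.end(), b.begin(), a.begin(), std::plus<int>());
}
```

The first function leaves its input untouched, whereas the second directs the output back to where the first operand came from.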
18.6.3 Unique [algo.unique]
Whenever information is collected, duplication can occur. The unique() and unique_copy()
algorithms eliminate adjacent duplicate values:
template<class For> For unique(For first, For last);
template<class For, class BinPred> For unique(For first, For last, BinPred p);
template<class In, class Out> Out unique_copy(In first, In last, Out res);
template<class In, class Out, class BinPred>
    Out unique_copy(In first, In last, Out res, BinPred p);
The unique() algorithm eliminates adjacent duplicates from a sequence; unique_copy() makes a
copy without duplicates. For example:
void f(list<string>& ls, vector<string>& vs)
{
    ls.sort();   // list sort (§17.2.2.1)
    unique_copy(ls.begin(),ls.end(),back_inserter(vs));
}
This copies ls to vs, eliminating duplicates in the process. The sort() is needed to get equal
strings adjacent.
Like other standard algorithms, unique() operates on iterators. It has no way of knowing the
type of container these iterators point into, so it cannot modify that container. It can only modify
the values of the elements. This implies that unique() does not eliminate duplicates from its input
sequence in the way we naively might expect. Rather, it moves unique elements towards the front
(head) of a sequence and returns an iterator to the end of the subsequence of unique elements:
template <class For> For unique(For first, For last)
{
    first = adjacent_find(first,last);   // §18.5.2
    return unique_copy(first,last,first);
}
The elements after the unique subsequence are left unchanged. Therefore, this does not eliminate
duplicates in a vector:
void f(vector<string>& vs)   // warning: bad code!
{
    sort(vs.begin(),vs.end());     // sort vector
    unique(vs.begin(),vs.end());   // eliminate duplicates (no it doesn’t!)
}
In fact, by moving the last elements of a sequence forward to eliminate duplicates, unique() can
introduce new duplicates. For example:
int main()
{
    char v[] = "abbcccde";

    char* p = unique(v,v+strlen(v));
    cout << v << ' ' << p-v << '\n';
}
produced
abcdecde 5
That is, p points to the second c.
Algorithms that might have removed elements (but can’t) generally come in two forms: the
‘‘plain’’ version that reorders elements in a way similar to unique() and a version that produces a
new sequence in a way similar to unique_copy(). The _copy suffix is used to distinguish these
two kinds of algorithms.
To eliminate duplicates from a container, we must explicitly shrink it:
template<class C> void eliminate_duplicates(C& c)
{
    sort(c.begin(),c.end());                              // sort
    typename C::iterator p = unique(c.begin(),c.end());   // compact
    c.erase(p,c.end());                                   // shrink
}
Note that eliminate_duplicates() would make no sense for a built-in array, yet unique() can still
be applied to arrays.
An example of unique_copy() can be found in §3.8.3.
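The difference between compacting and shrinking can be observed directly. The sketch below uses the hypothetical names distinct and distinct_count; the first erases the leftover tail, the second only reports where the compacted part ends:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: sort, compact with unique(), then erase the tail.
std::vector<std::string> distinct(std::vector<std::string> vs)
{
    std::sort(vs.begin(), vs.end());
    std::vector<std::string>::iterator p = std::unique(vs.begin(), vs.end());
    vs.erase(p, vs.end());
    return vs;
}

// Hypothetical helper: number of distinct values, computed without shrinking.
// unique() reports the end of the compacted part; vs.size() is unchanged.
std::size_t distinct_count(std::vector<std::string>& vs)
{
    std::sort(vs.begin(), vs.end());
    return static_cast<std::size_t>(std::unique(vs.begin(), vs.end()) - vs.begin());
}
```

After distinct_count() the vector still has its original number of elements; only the elements before the returned position are meaningful.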
18.6.3.1 Sorting Criteria [algo.criteria]
To eliminate all duplicates, the input sequences must be sorted (§18.7.1). Both unique() and
unique_copy() use == as the default criterion for comparison and allow the user to supply alternative criteria. For instance, we might modify the example from §18.5.1 to eliminate duplicate
names. After extracting the names of the Club officers, we were left with a list<Person*> called
off (§18.5.1). We could eliminate duplicates like this:
eliminate_duplicates(off);
However, this relies on sorting pointers and assumes that each pointer uniquely identifies a person.
In general, we would have to examine the Person records to determine whether we would consider
them equal. We might write:
bool operator==(const Person& x, const Person& y)   // equality for object
{
    // compare x and y for equality
}

bool operator<(const Person& x, const Person& y)   // less than for object
{
    // compare x and y for order
}

bool Person_eq(const Person* x, const Person* y)   // equality through pointer
{
    return *x == *y;
}

bool Person_lt(const Person* x, const Person* y)   // less than through pointer
{
    return *x < *y;
}
void extract_and_print(const list<Club>& lc)
{
    list<Person*> off;
    extract(lc,off);
    off.sort(Person_lt);   // list sort (§17.2.2.1): sort() requires random-access iterators
    list<Person*>::iterator p = unique(off.begin(),off.end(),Person_eq);
    for_each(off.begin(),p,Print_name(cout));
}
It is wise to make sure that the criterion used to sort matches the one used to eliminate duplicates.
The default meanings of < and == for pointers are rarely useful as comparison criteria for the
objects pointed to.
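The effect of the comparison criterion can be seen in a small sketch. Rec is a hypothetical stand-in for a record such as Person, identified here by an integer id; the other names are hypothetical as well:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a record identified by an id.
struct Rec {
    int id;
    explicit Rec(int i) : id(i) { }
};

bool rec_lt(const Rec* x, const Rec* y) { return x->id < y->id; }
bool rec_eq(const Rec* x, const Rec* y) { return x->id == y->id; }

// Count distinct records, comparing the objects pointed to.
// The sort criterion and the equality criterion agree.
std::size_t distinct_by_id(std::vector<const Rec*> v)
{
    std::sort(v.begin(), v.end(), rec_lt);
    return static_cast<std::size_t>(std::unique(v.begin(), v.end(), rec_eq) - v.begin());
}

// Count distinct pointers, comparing the pointer values themselves.
// This detects only the same object appearing more than once.
std::size_t distinct_by_address(std::vector<const Rec*> v)
{
    std::sort(v.begin(), v.end(), std::less<const Rec*>());
    return static_cast<std::size_t>(std::unique(v.begin(), v.end()) - v.begin());
}
```

Two separate objects with the same id count once under the first criterion and twice under the second.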
18.6.4 Replace [algo.replace]
The replace() algorithms traverse a sequence, replacing values by other values as specified. They
follow the patterns outlined by find/find_if and unique/unique_copy, thus yielding four variants
in all. Again, the code is simple enough to be illustrative:
template<class For, class T>
void replace(For first, For last, const T& val, const T& new_val)
{
    while (first != last) {
        if (*first == val) *first = new_val;
        ++first;
    }
}

template<class For, class Pred, class T>
void replace_if(For first, For last, Pred p, const T& new_val)
{
    while (first != last) {
        if (p(*first)) *first = new_val;
        ++first;
    }
}

template<class In, class Out, class T>
Out replace_copy(In first, In last, Out res, const T& val, const T& new_val)
{
    while (first != last) {
        *res++ = (*first == val) ? new_val : *first;
        ++first;
    }
    return res;
}
template<class In, class Out, class Pred, class T>
Out replace_copy_if(In first, In last, Out res, Pred p, const T& new_val)
{
    while (first != last) {
        *res++ = p(*first) ? new_val : *first;
        ++first;
    }
    return res;
}
We might want to go through a list of strings, replacing the usual English transliteration of the
name of my home town Aarhus with its proper name Århus:
void f(list<string>& towns)
{
    // string arguments, so that val and new_val have the same type T:
    replace(towns.begin(),towns.end(),string("Aarhus"),string("Århus"));
}
This relies on an extended character set (§C.3.3).
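As a sketch of the in-place and copying variants, consider the following. The names is_blank, mark_blanks, and renamed are hypothetical, and "n/a" is just an arbitrary replacement value:

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical predicate for the _if variant.
bool is_blank(const std::string& s) { return s.empty(); }

// In place: every blank entry becomes "n/a".
void mark_blanks(std::vector<std::string>& v)
{
    std::replace_if(v.begin(), v.end(), is_blank, std::string("n/a"));
}

// Copying: the input is left alone and the result goes to a new vector.
std::vector<std::string> renamed(const std::vector<std::string>& v,
                                 const std::string& from, const std::string& to)
{
    std::vector<std::string> res;
    std::replace_copy(v.begin(), v.end(), std::back_inserter(res), from, to);
    return res;
}
```

Neither variant changes the number of elements; the copying variant writes exactly one output element per input element.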
18.6.5 Remove [algo.remove]
The remove() algorithms remove elements from a sequence based on a value or a predicate:
template<class For, class T> For remove(For first, For last, const T& val);
template<class For, class Pred> For remove_if(For first, For last, Pred p);
template<class In, class Out, class T>
    Out remove_copy(In first, In last, Out res, const T& val);
template<class In, class Out, class Pred>
    Out remove_copy_if(In first, In last, Out res, Pred p);
Assuming that a Club has an address, we could produce a list of Clubs located in Copenhagen:
class located_in : public unary_function<Club,bool> {   // argument and result types are needed by not1()
    string town;
public:
    located_in(const string& ss) :town(ss) { }
    bool operator()(const Club& c) const { return c.town == town; }
};

void f(list<Club>& lc)
{
    remove_copy_if(lc.begin(),lc.end(),
        ostream_iterator<Club>(cout),not1(located_in("København")));
}
Thus, remove_copy_if() is copy_if() (§18.6.1) with the inverse condition. That is, an element is
placed on the output by remove_copy_if() if the element does not match the predicate.
The ‘‘plain’’ remove() compacts non-matching elements at the beginning of the sequence and
returns an iterator for the end of the compacted sequence (see also §18.6.3).
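Because remove() only compacts, a container must be shrunk explicitly, just as for unique(). The following sketch shows that combination and the inverted condition of remove_copy_if(); without, is_vowel, and consonants are hypothetical names:

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical helper: drop every occurrence of c from s.
// remove() compacts the remaining characters; erase() shrinks the string.
std::string without(std::string s, char c)
{
    s.erase(std::remove(s.begin(), s.end(), c), s.end());
    return s;
}

// Hypothetical predicate: true for lowercase vowels.
bool is_vowel(char c) { return std::string("aeiou").find(c) != std::string::npos; }

// remove_copy_if() writes the elements that do NOT satisfy the predicate.
std::string consonants(const std::string& s)
{
    std::string res;
    std::remove_copy_if(s.begin(), s.end(), std::back_inserter(res), is_vowel);
    return res;
}
```

For example, removing 'a' from "banana" leaves "bnn", and copying "remove" without its vowels yields "rmv".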
18.6.6 Fill and Generate [algo.fill]
The fill() and generate() algorithms exist to systematically assign values to sequences:
template<class For, class T> void fill(For first, For last, const T& val);
template<class Out, class Size, class T> void fill_n(Out res, Size n, const T& val);
template<class For, class Gen> void generate(For first, For last, Gen g);
template<class Out, class Size, class Gen> void generate_n(Out res, Size n, Gen g);
The fill() algorithm assigns a specified value; the generate() algorithm assigns values obtained
by calling its function argument repeatedly. Thus, fill() is simply the special case of generate()
in which the generator function returns the same value repeatedly. The _n versions assign to the
first n elements of the sequence.
For example, using the random-number generators Randint and Urand from §22.7:
int v1[900];
int v2[900];
vector<int> v3;

void f()
{
    fill(v1,&v1[900],99);               // set all elements of v1 to 99
    generate(v2,&v2[900],Randint());    // set to random values (§22.7)

    // output 200 random integers in the interval [0..99]:
    generate_n(ostream_iterator<int>(cout),200,Urand(100));

    fill_n(back_inserter(v3),20,99);    // add 20 elements with the value 99 to v3
}
The generate() and fill() functions assign rather than initialize. If you need to manipulate raw
storage, say to turn a region of memory into objects of well-defined type and state, you must use an
algorithm like uninitialized_fill() from <memory> (§19.4.4) rather than algorithms from <algorithm>.
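A generator need not be random. As a sketch that can be checked by hand, the hypothetical Counter below yields an arithmetic progression; evens and constant are hypothetical names too:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical generator: yields start, start+step, start+2*step, ...
class Counter {
    int next, step;
public:
    Counter(int start, int s) : next(start), step(s) { }
    int operator()() { int r = next; next += step; return r; }
};

// The first n even numbers, appended through an inserter.
std::vector<int> evens(int n)
{
    std::vector<int> v;
    std::generate_n(std::back_inserter(v), n, Counter(0, 2));
    return v;
}

// n copies of val. The elements already exist, so fill() may assign to them.
std::vector<int> constant(std::size_t n, int val)
{
    std::vector<int> v(n);
    std::fill(v.begin(), v.end(), val);
    return v;
}
```

Note that generate_n() takes its generator by value, so the Counter passed in is copied and the caller's object (here a temporary) is not advanced.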
18.6.7 Reverse and Rotate [algo.reverse]
Occasionally, we need to reorder the elements of a sequence:
template<class Bi> void reverse(Bi first, Bi last);
template<class Bi, class Out> Out reverse_copy(Bi first, Bi last, Out res);
template<class For> void rotate(For first, For middle, For last);
template<class For, class Out> Out rotate_copy(For first, For middle, For last, Out res);
template<class Ran> void random_shuffle(Ran first, Ran last);
template<class Ran, class Gen> void random_shuffle(Ran first, Ran last, Gen& g);
The reverse() algorithm reverses the order of the elements so that the first element becomes the
last, etc. The reverse_copy() algorithm produces a copy of its input in reverse order.
The rotate() algorithm considers its [first,last[ sequence a circle and rotates its elements
until its former middle element is placed where its first element used to be. That is, the element in
position first+i moves to position first+(i+(last-middle))%(last-first). The % (modulo) is
what makes the rotation cyclic rather than simply a shift to the left. For example:
void f()
{
    string v[] = { "Frog", "and", "Peach" };

    reverse(v,v+3);      // Peach and Frog
    rotate(v,v+1,v+3);   // and Frog Peach
}
The rotate_copy() algorithm produces a copy of its input in rotated order.
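The position formula for rotate() can be checked against the algorithm itself. In this sketch the hypothetical rotated_by_formula places each character according to the formula, with middle given as an index that is assumed not to exceed the length of the string:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Rotate s so that the character at index middle comes first.
// Assumes middle <= s.size().
std::string rotated(std::string s, std::size_t middle)
{
    std::rotate(s.begin(), s.begin() + middle, s.end());
    return s;
}

// The same result computed from the position formula: the element at
// index i ends up at index (i + (n - middle)) % n, where n is the length.
std::string rotated_by_formula(const std::string& s, std::size_t middle)
{
    std::string res(s.size(), ' ');
    for (std::size_t i = 0; i < s.size(); ++i)
        res[(i + (s.size() - middle)) % s.size()] = s[i];
    return res;
}
```

For "abcde" and middle 2, both functions give "cdeab": the former middle element c is now first, and a and b have wrapped around to the end.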
By default, random_shuffle() shuffles its sequence using a uniform distribution random-number generator. That is, it chooses a permutation of the elements of the sequence in such a way
that each permutation has the same chance of being chosen. If you want a different distribution or
simply a better random-number generator, you can supply one. For example, using the Urand generator from §22.7 we might shuffle a deck of cards like this:
void f(deque<Card>& dc)
{
    Urand r(52);   // a named object: random_shuffle() takes its generator by reference
    random_shuffle(dc.begin(),dc.end(),r);
    // ...
}
The movement of elements done by rotate(), etc., is done using swap() (§18.6.8).
18.6.8 Swap [algo.swap]
To do anything at all interesting with elements in a container, we need to move them around. Such
movement is best expressed (that is, expressed most simply and most efficiently) as swap()s:
template<class T> void swap(T& a, T& b)
{
    T tmp = a;
    a = b;
    b = tmp;
}

template<class For, class For2> void iter_swap(For x, For2 y);

template<class For, class For2> For2 swap_ranges(For first, For last, For2 first2)
{
    while (first != last) iter_swap(first++, first2++);
    return first2;
}
To swap elements, you need a temporary. There are clever tricks to eliminate that need in specialized cases, but they are best avoided in favor of the simple and obvious. The swap() algorithm is
specialized for important types for which it matters (§16.3.9, §13.5.2).
The iter_swap() algorithm swaps the elements pointed to by its iterator arguments.
The swap_ranges() algorithm swaps elements in its two input ranges.
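A short sketch of swap_ranges() and iter_swap() in use follows. The names exchange_front and reverse_by_swaps are hypothetical, and exchange_front assumes that both vectors hold at least n elements:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Exchange the first n elements of a with the first n elements of b.
// Assumes a.size() >= n and b.size() >= n.
void exchange_front(std::vector<int>& a, std::vector<int>& b, std::size_t n)
{
    std::swap_ranges(a.begin(), a.begin() + n, b.begin());
}

// Reverse a sequence by swapping elements pairwise through iterators,
// working inwards from both ends.
void reverse_by_swaps(std::vector<int>& v)
{
    if (v.empty()) return;
    std::vector<int>::iterator lo = v.begin();
    std::vector<int>::iterator hi = v.end() - 1;
    while (lo < hi) std::iter_swap(lo++, hi--);
}
```

As with the other two-sequence algorithms, swap_ranges() is given no end for its second sequence, so the length assumption is the caller's responsibility.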
18.7 Sorted Sequences [algo.sorted]
Once we have collected some data, we often want to sort it. Once the sequence is sorted, our
options for manipulating the data in a convenient manner increase significantly.
To sort a sequence, we need a way of comparing elements. This is done using a binary predicate (§18.4.2). The default comparison is less (§18.4.2), which in turn uses < by default.
18.7.1 Sorting [algo.sort]
The sort() algorithms require random-access iterators (§19.2.1). That is, they work best for
vectors (§16.3) and similar containers:
template<class Ran> void sort(Ran first, Ran last);
template<class Ran, class Cmp> void sort(Ran first, Ran last, Cmp cmp);
template<class Ran> void stable_sort(Ran first, Ran last);
template<class Ran, class Cmp> void stable_sort(Ran first, Ran last, Cmp cmp);
The standard list (§17.2.2) does not provide random-access iterators, so lists should be sorted using
the specific list operations (§17.2.2.1).
The basic sort() is efficient (on average N*log(N)), but its worst-case performance is poor
(O(N*N)). Fortunately, the worst case is rare. If guaranteed worst-case behavior is important or
a stable sort is required, stable_sort() should be used; that is, an N*log(N)*log(N) algorithm
that improves towards N*log(N) when the system has sufficient extra memory. The relative order
of elements that compare equal is preserved by stable_sort() but not by sort().
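The stability guarantee can be made visible by sorting records on one field and observing the order of the records that tie. Entry, by_score, and ranked are hypothetical names introduced for this sketch:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: a name with a score.
struct Entry {
    std::string name;
    int score;
    Entry(const std::string& n, int s) : name(n), score(s) { }
};

// Order by score only; names play no part in the comparison.
bool by_score(const Entry& a, const Entry& b) { return a.score < b.score; }

// Names in ascending score order, separated by spaces.
// Entries with equal scores keep their relative input order.
std::string ranked(std::vector<Entry> v)
{
    std::stable_sort(v.begin(), v.end(), by_score);
    std::string res;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i) res += ' ';
        res += v[i].name;
    }
    return res;
}
```

Given the entries (ann,2), (bob,1), (cid,2), (dee,1) in that order, the result is "bob dee ann cid". With sort() in place of stable_sort(), the order within each pair of equal scores would not be guaranteed.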
Sometimes, only the first elements of a sorted sequence are needed. In that case, it makes sense
to sort the sequence only as far as is needed to get the first part in order. That is a partial sort:
    template<class Ran> void partial_sort(Ran first, Ran middle, Ran last);
    template<class Ran, class Cmp>
        void partial_sort(Ran first, Ran middle, Ran last, Cmp cmp);

    template<class In, class Ran>
        Ran partial_sort_copy(In first, In last, Ran first2, Ran last2);
    template<class In, class Ran, class Cmp>
        Ran partial_sort_copy(In first, In last, Ran first2, Ran last2, Cmp cmp);
The plain partial_sort() algorithms put the elements in the range first to middle in order. The
partial_sort_copy() algorithms produce N elements, where N is the lower of the number of elements in the output sequence and the number of elements in the input sequence. We need to specify both the start and the end of the result sequence because that’s what determines how many elements we need to sort. For example:
    class Compare_copies_sold {
    public:
        bool operator()(const Book& b1, const Book& b2) const
            { return b1.copies_sold()>b2.copies_sold(); }    // decreasing order: best sellers first
    };

    void f(const vector<Book>& sales)    // find the top ten books
    {
        vector<Book> bestsellers(10);
        partial_sort_copy(sales.begin(),sales.end(),
            bestsellers.begin(),bestsellers.end(),Compare_copies_sold());
        copy(bestsellers.begin(),bestsellers.end(),ostream_iterator<Book>(cout));
    }
Because the target of partial_sort_copy() must be a random-access iterator, we cannot sort
directly to cout.
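The same technique works for any element type. As a sketch, assuming plain int values and a hypothetical helper named top_n(), the result vector is sized first so that its begin() and end() tell partial_sort_copy() how many elements to produce:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Return the n largest values of data in decreasing order.
// If data holds fewer than n values, all of them are returned.
std::vector<int> top_n(const std::vector<int>& data, std::vector<int>::size_type n)
{
    std::vector<int> result(std::min(n, data.size()));
    std::partial_sort_copy(data.begin(), data.end(),
                           result.begin(), result.end(),
                           std::greater<int>());    // largest first
    return result;
}
```

The input sequence is not modified; only the copy in result is ordered.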
Finally, algorithms are provided to sort only as far as is necessary to get the Nth element to its
proper place with no element comparing less than the Nth element placed after it in the sequence:
    template<class Ran> void nth_element(Ran first, Ran nth, Ran last);
    template<class Ran, class Cmp> void nth_element(Ran first, Ran nth, Ran last, Cmp cmp);
This algorithm is particularly useful for people – such as economists, sociologists, and teachers –
who need to look for medians, percentiles, etc.
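For instance, a median can be found without sorting the whole sequence. This is a sketch only; the function name median() and the choice of the upper median for sequences of even length are assumptions made for the example:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Return the upper median of v: the element that would be at index size()/2
// if v were sorted. The vector is passed by value because nth_element()
// reorders the sequence it is given.
int median(std::vector<int> v)
{
    assert(!v.empty());    // a median is undefined for an empty sequence
    std::vector<int>::iterator mid = v.begin() + v.size() / 2;
    std::nth_element(v.begin(), mid, v.end());
    return *mid;
}
```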
18.7.2 Binary Search [algo.bsearch]
A sequential search such as find() (§18.5.2) is terribly inefficient for large sequences, but it is
about the best we can do without sorting or hashing (§17.6). Once a sequence is sorted, however,
we can use a binary search to determine whether a value is in a sequence:
    template<class For, class T> bool binary_search(For first, For last, const T& val);
    template<class For, class T, class Cmp>
        bool binary_search(For first, For last, const T& value, Cmp cmp);
For example:
    void f(list<int>& c)
    {
        if (binary_search(c.begin(),c.end(),7)) {    // is 7 in c?
            // ...
        }
        // ...
    }
A binary_search() returns a bool indicating whether a value was present. As with find(), we
often also want to know where the elements with that value are in that sequence. However, there
can be many elements with a given value in a sequence, and we often need to find either the first or
all such elements. Consequently, algorithms are provided for finding a range of equal elements,
equal_range(), and algorithms for finding the lower_bound() and upper_bound() of that range:
    template<class For, class T> For lower_bound(For first, For last, const T& val);
    template<class For, class T, class Cmp>
        For lower_bound(For first, For last, const T& val, Cmp cmp);

    template<class For, class T> For upper_bound(For first, For last, const T& val);
    template<class For, class T, class Cmp>
        For upper_bound(For first, For last, const T& val, Cmp cmp);

    template<class For, class T> pair<For,For> equal_range(For first, For last, const T& val);
    template<class For, class T, class Cmp>
        pair<For,For> equal_range(For first, For last, const T& val, Cmp cmp);
These algorithms correspond to the operations on multimaps (§17.4.2). We can think of
lower_bound() as a fast find() and find_if() for sorted sequences. For example:
    void g(vector<int>& c)
    {
        typedef vector<int>::iterator VI;

        VI p = find(c.begin(),c.end(),7);           // probably slow: O(N); c needn’t be sorted
        VI q = lower_bound(c.begin(),c.end(),7);    // probably fast: O(log(N)); c must be sorted
        // ...
    }
If lower_bound(first,last,k) doesn’t find k, it returns an iterator to the first element with a key
greater than k, or last if no such greater element exists. This way of reporting failure is also used
by upper_bound() and equal_range(). This means that we can use these algorithms to determine where to insert a new element into a sorted sequence so that the sequence remains sorted.
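A minimal sketch of both uses follows. The helper names insert_sorted() and count_sorted() are hypothetical; the first uses the iterator returned by lower_bound() as an insertion point, and the second uses the pair returned by equal_range() to count equal elements:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Insert x into the sorted vector v so that v remains sorted.
void insert_sorted(std::vector<int>& v, int x)
{
    // lower_bound() yields the first position whose element is not less than x,
    // or v.end() if every element is less than x.
    v.insert(std::lower_bound(v.begin(), v.end(), x), x);
}

// Count the elements equal to x in the sorted vector v.
std::size_t count_sorted(const std::vector<int>& v, int x)
{
    typedef std::vector<int>::const_iterator CI;
    std::pair<CI, CI> r = std::equal_range(v.begin(), v.end(), x);
    return static_cast<std::size_t>(r.second - r.first);
}
```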
18.7.3 Merge [algo.merge]
Given two sorted sequences, we can merge them into a new sorted sequence using merge() or
merge two parts of a sequence using inplace_merge():
    template<class In, class In2, class Out>
        Out merge(In first, In last, In2 first2, In2 last2, Out res);
    template<class In, class In2, class Out, class Cmp>
        Out merge(In first, In last, In2 first2, In2 last2, Out res, Cmp cmp);

    template<class Bi> void inplace_merge(Bi first, Bi middle, Bi last);
    template<class Bi, class Cmp> void inplace_merge(Bi first, Bi middle, Bi last, Cmp cmp);
Note that these merge algorithms differ from list’s merge (§17.2.2.1) by not removing elements
from their input sequences. Instead, elements are copied.
For elements that compare equal, elements from the first range will always precede elements
from the second.
The inplace_merge() algorithm is primarily useful when you have a sequence that can be
sorted by more than one criterion. For example, you might have a vector of fish sorted by species
(for example, cod, haddock, and herring). If the elements of each species are sorted by weight, you
can get the whole vector sorted by weight by applying inplace_merge() to merge the information
for the different species (§18.13[20]).
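As a small sketch of the idea, assume a vector of int that consists of two consecutive sorted runs; the function name merge_runs() and the parameter mid are hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// v consists of two consecutive sorted runs: the elements at indices [0,mid)
// and the elements at indices [mid,size()). After the call, all of v is sorted.
void merge_runs(std::vector<int>& v, std::vector<int>::size_type mid)
{
    assert(mid <= v.size());
    std::inplace_merge(v.begin(), v.begin() + mid, v.end());
}
```

With more than two runs (for example, one per species), the runs can be merged pairwise by repeated calls.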
18.7.4 Partitions [algo.partition]
To partition a sequence is to place every element that satisfies a predicate before every element that
doesn’t. The standard library provides a stable_partition(), which maintains relative order
among the elements that do and do not satisfy the predicate. In addition, the library offers partition(), which doesn’t maintain relative order, but which runs a bit faster when memory is limited:
    template<class Bi, class Pred> Bi partition(Bi first, Bi last, Pred p);
    template<class Bi, class Pred> Bi stable_partition(Bi first, Bi last, Pred p);
You can think of a partition as a kind of sort with a very simple sorting criterion. For example:
    void f(list<Club>& lc)
    {
        list<Club>::iterator p = partition(lc.begin(),lc.end(),located_in("København"));
        // ...
    }
This ‘‘sorts’’ the list so that Clubs in Copenhagen come first. The return value (here p) points
either to the first element that doesn’t satisfy the predicate or to the end.
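The following sketch shows the return value in use. The predicate Is_even and the function evens_first() are hypothetical names; stable_partition() is used so that the result is fully determined:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Predicate: is the value even?
struct Is_even {
    bool operator()(int x) const { return x % 2 == 0; }
};

// Move the even values to the front, keeping the relative order within
// each group. Returns the number of even values, which is also the index
// of the first element that does not satisfy the predicate.
std::vector<int>::difference_type evens_first(std::vector<int>& v)
{
    std::vector<int>::iterator p = std::stable_partition(v.begin(), v.end(), Is_even());
    return p - v.begin();
}
```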
18.7.5 Set Operations on Sequences [algo.set]
A sequence can be considered a set. Looked upon that way, it makes sense to provide set operations such as union and intersection for sequences. However, such operations are horribly inefficient unless the sequences are sorted, so the standard library provides set operations for sorted
sequences only. In particular, the set operations work well for sets (§17.4.3) and multisets
(§17.4.4), both of which are sorted anyway.
If these set algorithms are applied to sequences that are not sorted, the resulting sequences will
not conform to the usual set-theoretical rules. These algorithms do not change their input
sequences, and their output sequences are ordered.
The includes() algorithm tests whether every member of the second sequence, [first2,last2[, is also a member of
the first, [first,last[:
    template<class In, class In2>
        bool includes(In first, In last, In2 first2, In2 last2);
    template<class In, class In2, class Cmp>
        bool includes(In first, In last, In2 first2, In2 last2, Cmp cmp);
The set_union() and set_intersection() produce their obvious outputs as sorted sequences:
    template<class In, class In2, class Out>
        Out set_union(In first, In last, In2 first2, In2 last2, Out res);
    template<class In, class In2, class Out, class Cmp>
        Out set_union(In first, In last, In2 first2, In2 last2, Out res, Cmp cmp);

    template<class In, class In2, class Out>
        Out set_intersection(In first, In last, In2 first2, In2 last2, Out res);
    template<class In, class In2, class Out, class Cmp>
        Out set_intersection(In first, In last, In2 first2, In2 last2, Out res, Cmp cmp);
The set_difference() algorithm produces a sequence of elements that are members of its first, but
not its second, input sequence. The set_symmetric_difference() algorithm produces a sequence
of elements that are members of either, but not of both, of its input sequences:
    template<class In, class In2, class Out>
        Out set_difference(In first, In last, In2 first2, In2 last2, Out res);
    template<class In, class In2, class Out, class Cmp>
        Out set_difference(In first, In last, In2 first2, In2 last2, Out res, Cmp cmp);

    template<class In, class In2, class Out>
        Out set_symmetric_difference(In first, In last, In2 first2, In2 last2, Out res);
    template<class In, class In2, class Out, class Cmp>
        Out set_symmetric_difference(In first, In last, In2 first2, In2 last2, Out res, Cmp cmp);
For example:
    char v1[] = "abcd";
    char v2[] = "cdef";

    void f(char v3[])
    {
        set_difference(v1,v1+4,v2,v2+4,v3);              // v3 = "ab"
        set_symmetric_difference(v1,v1+4,v2,v2+4,v3);    // v3 = "abef"
    }
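When the size of the result is not known in advance, an inserter can be used as the output. The sketch below applies the set algorithms to sorted strings; the function names common_chars(), all_chars(), and contains_all() are hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>

// Characters that occur in both a and b; both strings must be sorted.
std::string common_chars(const std::string& a, const std::string& b)
{
    std::string res;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(res));
    return res;
}

// Characters that occur in a or in b; both strings must be sorted.
std::string all_chars(const std::string& a, const std::string& b)
{
    std::string res;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::back_inserter(res));
    return res;
}

// True if every character of part also occurs in whole; both must be sorted.
// Note the argument order: the candidate subset is the second sequence.
bool contains_all(const std::string& whole, const std::string& part)
{
    return std::includes(whole.begin(), whole.end(), part.begin(), part.end());
}
```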
18.8 Heaps [algo.heap]
The word heap means different things in different contexts. When discussing algorithms, ‘‘heap’’
often refers to a way of organizing a sequence such that it has a first element that is the element
with the highest value. Addition of an element (using push_heap()) and removal of an element
(using pop_heap()) are reasonably fast, with a worst-case performance of O(log(N)), where N
is the number of elements in the sequence. Sorting (using sort_heap()) has a worst-case performance of O(N*log(N)). A heap is implemented by this set of functions:
    template<class Ran> void push_heap(Ran first, Ran last);
    template<class Ran, class Cmp> void push_heap(Ran first, Ran last, Cmp cmp);

    template<class Ran> void pop_heap(Ran first, Ran last);
    template<class Ran, class Cmp> void pop_heap(Ran first, Ran last, Cmp cmp);

    template<class Ran> void make_heap(Ran first, Ran last);    // turn sequence into heap
    template<class Ran, class Cmp> void make_heap(Ran first, Ran last, Cmp cmp);

    template<class Ran> void sort_heap(Ran first, Ran last);    // turn heap into sequence
    template<class Ran, class Cmp> void sort_heap(Ran first, Ran last, Cmp cmp);
The style of the heap algorithms is odd. A more natural way of presenting their functionality would
be to provide an adapter class with four operations. Doing that would yield something like a
priority_queue (§17.3.3). In fact, a priority_queue is almost certainly implemented using a heap.
The value pushed by push_heap(first,last) is *(last-1). The assumption is that
[first,last-1[ is already a heap, so push_heap() extends the sequence to [first,last[ by including the next element. Thus, you can build a heap from an existing sequence by a series of
push_heap() operations. Conversely, pop_heap(first,last) removes the first element of the
heap by swapping it with the last element (*(last-1)) and making [first,last-1[ into a heap.
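The convention that the value involved sits at *(last-1) means that the heap algorithms are typically paired with push_back() and pop_back() on the underlying container. A sketch, using the hypothetical helper names heap_push() and heap_pop() on a vector of int:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Add x to the heap held in v.
void heap_push(std::vector<int>& v, int x)
{
    v.push_back(x);                        // place the new value at *(last-1)
    std::push_heap(v.begin(), v.end());    // extend the heap to include it
}

// Remove and return the largest value of the non-empty heap held in v.
int heap_pop(std::vector<int>& v)
{
    assert(!v.empty());
    std::pop_heap(v.begin(), v.end());     // swap the largest value to *(last-1)
    int top = v.back();
    v.pop_back();                          // actually shrink the container
    return top;
}
```

This is roughly the pair of operations that an adapter such as priority_queue would be expected to wrap.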
18.9 Min and Max [algo.min]
The algorithms described here select a value based on a comparison. It is obviously useful to be
able to find the maximum and minimum of two values:
    template<class T> const T& max(const T& a, const T& b)
    {
        return (a<b) ? b : a;
    }

    template<class T, class Cmp> const T& max(const T& a, const T& b, Cmp cmp)
    {
        return (cmp(a,b)) ? b : a;
    }

    template<class T> const T& min(const T& a, const T& b);

    template<class T, class Cmp> const T& min(const T& a, const T& b, Cmp cmp);
The max() and min() operations can be generalized to apply to sequences in the obvious manner:
    template<class For> For max_element(For first, For last);
    template<class For, class Cmp> For max_element(For first, For last, Cmp cmp);

    template<class For> For min_element(For first, For last);
    template<class For, class Cmp> For min_element(For first, For last, Cmp cmp);
Finally, lexicographical ordering is easily generalized from strings of characters to sequences of
values of a type with comparison:
    template<class In, class In2>
        bool lexicographical_compare(In first, In last, In2 first2, In2 last2);

    template<class In, class In2, class Cmp>
        bool lexicographical_compare(In first, In last, In2 first2, In2 last2, Cmp cmp)
    {
        while (first != last && first2 != last2) {
            if (cmp(*first,*first2)) return true;
            if (cmp(*first2++,*first++)) return false;
        }
        return first == last && first2 != last2;
    }
This is very similar to the function presented for general strings in §13.4.1. However,
lexicographical_compare() compares sequences in general and not just strings. It also returns a
bool rather than the more useful int. The result is true (only) if the first sequence compares < the
second. In particular, the result is false when the sequences compare equal.
C-style strings and strings are sequences, so lexicographical_compare() can be used as a
string compare function. For example:
    char v1[] = "yes";
    char v2[] = "no";
    string s1 = "Yes";
    string s2 = "No";

    void f()
    {
        bool b1 = lexicographical_compare(v1,v1+strlen(v1),v2,v2+strlen(v2));
        bool b2 = lexicographical_compare(s1.begin(),s1.end(),s2.begin(),s2.end());

        bool b3 = lexicographical_compare(v1,v1+strlen(v1),s1.begin(),s1.end());
        bool b4 = lexicographical_compare(s1.begin(),s1.end(),v1,v1+strlen(v1),Nocase);
    }
The sequences need not be of the same type – all we need is to compare their elements – and the
comparison criterion can be supplied. This makes lexicographical_compare() more general and
potentially a bit slower than string’s compare. See also §20.3.8.
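A user-supplied criterion for lexicographical_compare() compares individual elements, not whole sequences. The sketch below assumes a hypothetical element comparison Nocase_char and a hypothetical wrapper less_nocase(); neither is the Nocase used above, whose definition is not shown in this section:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical comparison of two characters that ignores case.
struct Nocase_char {
    bool operator()(char a, char b) const
    {
        // Convert through unsigned char, as toupper() requires.
        return std::toupper(static_cast<unsigned char>(a))
             < std::toupper(static_cast<unsigned char>(b));
    }
};

// True if a comes before b when case is ignored.
bool less_nocase(const std::string& a, const std::string& b)
{
    return std::lexicographical_compare(a.begin(), a.end(),
                                        b.begin(), b.end(), Nocase_char());
}
```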
18.10 Permutations [algo.perm]
Given a sequence of four elements, we can order them in 4*3*2 ways. Each of these orderings is
called a permutation. For example, from the four characters abcd we can produce 24 permutations:
    abcd abdc acbd acdb adbc adcb bacd badc
    bcad bcda bdac bdca cabd cadb cbad cbda
    cdab cdba dabc dacb dbac dbca dcab dcba
The next_permutation() and prev_permutation() functions deliver such permutations of a
sequence:
    template<class Bi> bool next_permutation(Bi first, Bi last);
    template<class Bi, class Cmp> bool next_permutation(Bi first, Bi last, Cmp cmp);

    template<class Bi> bool prev_permutation(Bi first, Bi last);
    template<class Bi, class Cmp> bool prev_permutation(Bi first, Bi last, Cmp cmp);
The permutations of abcd were produced like this:
    int main()
    {
        char v[] = "abcd";
        cout << v << '\t';
        while(next_permutation(v,v+4)) cout << v << '\t';
    }
The permutations are produced in lexicographical order (§18.9). The return value of
next_permutation() indicates whether a next permutation actually exists. If not, false is returned
and the sequence is left as the first permutation, that is, the one in which the elements are in ascending order.
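Because next_permutation() steps through the orderings in lexicographical order, starting from a sorted sequence visits every distinct permutation exactly once. A sketch, with the hypothetical function name permutations():

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Return all distinct permutations of the characters of s,
// in lexicographical order.
std::vector<std::string> permutations(std::string s)
{
    std::sort(s.begin(), s.end());    // start from the smallest permutation
    std::vector<std::string> res;
    do {
        res.push_back(s);
    } while (std::next_permutation(s.begin(), s.end()));
    return res;
}
```

Repeated elements do not produce repeated results: a word with two equal letters, such as food, yields 12 permutations rather than 24.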
18.11 C-Style Algorithms [algo.c]
From the C standard library, the C++ standard library inherited a few algorithms dealing with C-style strings (§20.4.1), plus a quicksort and a binary search, both limited to arrays.
The qsort() and bsearch() functions are presented in <cstdlib> and <stdlib.h>. They each
operate on an array of n elements of size elem_size using a three-way comparison function (one that returns a negative value, zero, or a positive value) passed as
a pointer to function. The elements must be of a type without a user-defined copy constructor, copy
assignment, or destructor:
    typedef int(*__cmp)(const void*, const void*);    // typedef for presentation only

    void qsort(void* p, size_t n, size_t elem_size, __cmp);    // sort p
    void* bsearch(const void* key, void* p, size_t n, size_t elem_size, __cmp);    // find key in p
The use of qsort() is described in §7.7.
These algorithms are provided solely for C compatibility; sort() (§18.7.1) and search()
(§18.5.5) are more general and should also be more efficient.
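To make the contrast concrete, here is a sketch that sorts an array of int both ways. The names cmp_int(), c_style_sort(), c_style_contains(), and cpp_style_sort() are hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Comparison in the style required by qsort() and bsearch():
// negative, zero, or positive for less, equal, or greater.
int cmp_int(const void* p, const void* q)
{
    int a = *static_cast<const int*>(p);
    int b = *static_cast<const int*>(q);
    return (a > b) - (a < b);    // avoids the overflow risk of a-b
}

// C style: untyped pointers, explicit element size, pointer to function.
void c_style_sort(int* a, std::size_t n)
{
    std::qsort(a, n, sizeof(int), cmp_int);
}

// C style lookup in an array already sorted according to cmp_int.
bool c_style_contains(const int* a, std::size_t n, int key)
{
    return std::bsearch(&key, a, n, sizeof(int), cmp_int) != 0;
}

// C++ style: the element type and the comparison are known to the compiler.
void cpp_style_sort(int* a, std::size_t n)
{
    std::sort(a, a + n);
}
```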
18.12 Advice [algo.advice]
[1] Prefer algorithms to loops; §18.5.1.
[2] When writing a loop, consider whether it could be expressed as a general algorithm; §18.2.
[3] Regularly review the set of algorithms to see if a new application has become obvious; §18.2.
[4] Be sure that a pair of iterator arguments really do specify a sequence; §18.3.1.
[5] Design so that the most frequently-used operations are simple and safe; §18.3, §18.3.1.
[6] Express tests in a form that allows them to be used as predicates; §18.4.2.
[7] Remember that predicates are functions and objects, not types; §18.4.2.
[8] You can use binders to make unary predicates out of binary predicates; §18.4.4.1.
[9] Use mem_fun() and mem_fun_ref() to apply algorithms on containers; §18.4.4.2.
[10] Use ptr_fun() when you need to bind an argument of a function; §18.4.4.3.
[11] Remember that strcmp() differs from == by returning 0 to indicate ‘‘equal;’’ §18.4.4.4.
[12] Use for_each() and transform() only when there is no more-specific algorithm for a task;
§18.5.1.
[13] Use predicates to apply algorithms using a variety of comparison and equality criteria;
§18.4.2.1, §18.6.3.1.
[14] Use predicates and other function objects so as to use standard algorithms with a wider range
of meanings; §18.4.2.
[15] The default == and < on pointers are rarely adequate for standard algorithms; §18.6.3.1.
[16] Algorithms do not directly add or subtract elements from their argument sequences; §18.6.
[17] Be sure that the less-than and equality predicates used on a sequence match; §18.6.3.1.
[18] Sometimes, sorted sequences can be used to increase efficiency and elegance; §18.7.
[19] Use qsort() and bsearch() for compatibility only; §18.11.
18.13 Exercises [algo.exercises]
The solutions to several exercises for this chapter can be found by looking at the source text of an
implementation of the standard library. Do yourself a favor: try to find your own solutions before
looking to see how your library implementer approached the problems.
1. (∗2) Learn O() notation. Give a realistic example in which an O(N*N) algorithm is faster
than an O(N) algorithm for some N>10.
2. (∗2) Implement and test the four mem_fun() and mem_fun_ref() functions (§18.4.4.2).
3. (∗1) Write an algorithm match() that is like mismatch(), except that it returns iterators to the
first corresponding pair that matches the predicate.
4. (∗1.5) Implement and test Print_name from §18.5.1.
5. (∗1) Sort a list using only standard library algorithms.
6. (∗2.5) Define versions of iseq() (§18.3.1) for built-in arrays, istream, and iterator pairs.
Define a suitable set of overloads for the nonmodifying standard algorithms (§18.5) for Iseqs.
Discuss how best to avoid ambiguities and an explosion in the number of template functions.
7. (∗2) Define an oseq() to complement iseq(). The output sequence given as the argument to
oseq() should be replaced by the output produced by an algorithm using it. Define a suitable
set of overloads for at least three standard algorithms of your choice.
8. (∗1.5) Produce a vector of squares of numbers 1 through 100. Print a table of squares. Take
the square root of the elements of that vector and print the resulting vector.
9. (∗2) Write a set of functional objects that do bitwise logical operations on their operands. Test
these objects on vectors of char, int, and bitset<67>.
10. (∗1) Write a binder3() that binds the second and third arguments of a three-argument function
to produce a unary predicate. Give an example where binder3() is a useful function.
11. (∗1.5) Write a small program that that removes adjacent repeated words from from a file file.
Hint: The program should remove a that, a from, and a file from the previous statement.
12. (∗2.5) Define a format for records of references to papers and books kept in a file. Write a program that can write out records from the file identified by year of publication, name of author,
keyword in title, or name of publisher. The user should be able to request that the output be
sorted according to similar criteria.
13. (∗2) Implement a move() algorithm in the style of copy() in such a way that the input and
output sequences can overlap. Be reasonably efficient when given random-access iterators as
arguments.
14. (∗1.5) Produce all anagrams of the word food. That is, all four-letter combinations of the letters
f, o, o, and d. Generalize this program to take a word as input and produce anagrams of that
word.
15. (∗1.5) Write a program that produces anagrams of sentences; that is, a program that produces all
permutations of the words in the sentences (rather than permutations of the letters in the words).
16. (∗1.5) Implement find_if() (§18.5.2) and then implement find() using find_if(). Find a way
of doing this so that the two functions do not need different names.
17. (∗2) Implement search() (§18.5.5). Provide an optimized version for random-access iterators.
18. (∗2) Take a sort algorithm (such as sort() from your standard library or the Shell sort from
§13.5.2) and insert code so that it prints out the sequence being sorted after each swap of elements.
19. (∗2) There is no sort() for bidirectional iterators. The conjecture is that copying to a vector
and then sorting is faster than sorting a sequence using bidirectional iterators. Implement a general sort for bidirectional iterators and test the conjecture.
20. (∗2.5) Imagine that you keep records for a group of sports fishermen. For each catch, keep a
record of species, length, weight, date of catch, name of fisherman, etc. Sort and print the
records according to a variety of criteria. Hint: inplace_merge().
21. (∗2) Create lists of students taking Math, English, French, and Biology. Pick about 20 names
for each class out of a set of 40 names. List students who take both Math and English. List students who take French but not Biology or Math. List students who do not take a science course.
List students who take French and Math but neither English nor Biology.
22. (∗1.5) Write a remove() function that actually removes elements from a container.
19 Iterators and Allocators
The reason that data structures and algorithms
can work together seamlessly is ... that they
do not know anything about each other.
– Alex Stepanov
Iterators and sequences — operations on iterators — iterator traits — iterator categories
— inserters — reverse iterators — stream iterators — checked iterators — exceptions
and algorithms — allocators — the standard allocator — user-defined allocators —
low-level memory functions — advice — exercises.
19.1 Introduction [iter.intro]
Iterators are the glue that holds containers and algorithms together. They provide an abstract view
of data so that the writer of an algorithm need not be concerned with concrete details of a myriad of
data structures. Conversely, the standard model of data access provided by iterators relieves containers from having to provide a more extensive set of access operations. Similarly, allocators are
used to insulate container implementations from details of access to memory.
Iterators support an abstract model of data as sequences of objects (§19.2). Allocators provide a
mapping from a lower-level model of data as arrays of bytes into the higher-level object model
(§19.4). The most common lower-level memory model is itself supported by a few standard functions (§19.4.4).
Iterators are a concept with which every programmer should be familiar. In contrast, allocators
are a support mechanism that a programmer rarely needs to worry about and few programmers will
ever need to write a new allocator.
19.2 Iterators and Sequences [iter.iter]
An iterator is a pure abstraction. That is, anything that behaves like an iterator is an iterator
(§3.8.2). An iterator is an abstraction of the notion of a pointer to an element of a sequence. Its key
concepts are
– ‘‘the element currently pointed to’’ (dereferencing, represented by operators * and ->),
– ‘‘point to next element’’ (increment, represented by operator ++), and
– equality (represented by operator ==).
For example, the built-in type int* is an iterator for an int[] and the class list<int>::iterator is an
iterator for a list class.
A sequence is an abstraction of the notion ‘‘something where we can get from the beginning to
the end by using a next-element operation:’’
[Figure: a sequence of n elements elem[0], elem[1], elem[2], ..., elem[n-1]; begin() points to elem[0] and end() points one past elem[n-1].]
Examples of such sequences are arrays (§5.2), vectors (§16.3), singly-linked lists (§17.8[17]),
doubly-linked lists (§17.2.2), trees (§17.4.1), input (§21.3.1), and output (§21.2.1). Each has its
own appropriate kind of iterator.
The iterator classes and functions are declared in namespace std and found in <iterator>.
An iterator is not a general pointer. Rather, it is an abstraction of the notion of a pointer into an
array. There is no concept of a ‘‘null iterator.’’ The test to determine whether an iterator points to
an element or not is conventionally done by comparing it against the end of its sequence (rather
than comparing it against a null element). This notion simplifies many algorithms by removing the
need for a special end case and generalizes nicely to sequences of arbitrary types.
An iterator that points to an element is said to be valid and can be dereferenced (using *, [], or
-> appropriately). An iterator can be invalid either because it hasn’t been initialized, because it
pointed into a container that was explicitly or implicitly resized (§16.3.6, §16.3.8), because the container into which it pointed was destroyed, or because it denotes the end of a sequence (§18.2). The
end of a sequence can be thought of as an iterator pointing to a hypothetical element position one-past-the-last element of a sequence.
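The key operations named above are enough to write an algorithm once and apply it to unrelated sequences. The following is a minimal sketch, assuming a hypothetical helper called sum() (it is not a library function); it uses only !=, ++, and * on its iterator arguments and tests for the end of the sequence rather than for a null element.

```cpp
#include <cassert>
#include <list>

// Hypothetical helper: add the elements of the sequence [first,last) to init.
// It relies only on the key iterator operations: !=, ++, and *.
template<class In, class T>
T sum(In first, In last, T init)
{
    while (first != last) {      // compare against the end; there is no "null iterator"
        init = init + *first;    // the element currently pointed to
        ++first;                 // point to the next element
    }
    return init;
}

// The same algorithm applied to a built-in array and to a list.
void sum_examples()
{
    int a[] = { 1, 2, 3, 4 };
    assert(sum(a, a + 4, 0) == 10);                  // int* is an iterator for int[]

    std::list<int> lst(a, a + 4);
    assert(sum(lst.begin(), lst.end(), 0) == 10);    // list<int>::iterator

    assert(sum(a, a, 0) == 0);                       // empty sequence: begin equals end
}
```

Note that the empty sequence needs no special case: the loop simply never executes.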
19.2.1 Iterator Operations [iter.oper]
Not every kind of iterator supports exactly the same set of operations. For example, reading
requires different operations from writing, and a vector allows convenient and efficient random
access in a way that would be prohibitively expensive to provide for a list or an istream. Consequently, we classify iterators into five categories according to the operations they are capable of
providing efficiently (that is, in constant time; §17.1):
Iterator Operations and Categories

Category:       output   input    forward   bidirectional   random-access
Abbreviation:   Out      In       For       Bi              Ran
Read:                    =*p      =*p       =*p             =*p
Access:                  ->       ->        ->              -> []
Write:          *p=               *p=       *p=             *p=
Iteration:      ++       ++       ++        ++ --           ++ -- + - += -=
Comparison:              == !=    == !=     == !=           == != < > >= <=

Both read and write are through the iterator dereferenced by *:
    *p = x;    // write x through p
    x = *p;    // read through p into x
To be an iterator type, a type must provide an appropriate set of operations. These operations must
have their conventional meanings. That is, each operation must have the same effect it has on an
ordinary pointer.
Independently of its category, an iterator can allow const or non-const access to the object it
points to. You cannot write to an element using an iterator to const – whatever its category. An
iterator provides a set of operators, but the type of the element pointed to is the final arbiter of what
can be done to that element.
Reads and writes copy objects, so element types must have the conventional copy semantics
(§17.1.4).
Only random-access iterators can have an integer added or subtracted for relative addressing.
However, except for output iterators, the distance between two iterators can always be found by
iterating through the elements, so a distance() function is provided:
    template<class In> typename iterator_traits<In>::difference_type distance(In first, In last)
    {
        typename iterator_traits<In>::difference_type d = 0;
        while (first++!=last) d++;
        return d;
    }
An iterator_traits<In>::difference_type is defined for every iterator In to hold distances between
elements (§19.2.2).
This function is called distance() rather than operator-() because it can be expensive and
the operators provided for an iterator all operate in constant time (§17.1). Counting elements one
by one is not the kind of operation I would like to invoke unwittingly for a large sequence. The
library also provides a far more efficient implementation of distance() for a random-access iterator.
Similarly, advance() is provided as a potentially slow +=:
    template <class In, class Dist> void advance(In& i, Dist n);    // i+=n
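A short sketch of how distance() and advance() are typically called may help here. The helper names list_length() and nth() are hypothetical, introduced only for illustration; the calls to std::distance() and std::advance() are the standard ones from <iterator>.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Number of elements in a list, found by counting them one by one (linear time).
std::ptrdiff_t list_length(const std::list<int>& lst)
{
    return std::distance(lst.begin(), lst.end());
}

// Value of element number n (counting from 0) of a list.
int nth(const std::list<int>& lst, int n)
{
    assert(0 <= n && n < static_cast<int>(lst.size()));
    std::list<int>::const_iterator p = lst.begin();
    std::advance(p, n);    // a list iterator has no +=, so advance() steps n times
    return *p;
}

// The same call for a vector; here the library can use += internally.
int nth(const std::vector<int>& v, int n)
{
    assert(0 <= n && n < static_cast<int>(v.size()));
    std::vector<int>::const_iterator p = v.begin();
    std::advance(p, n);
    return *p;
}
```

Because advance() modifies the iterator it is given, the iterator is passed as a named variable rather than as a temporary.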
19.2.2 Iterator Traits [iter.traits]
We use iterators to gain information about the objects they point to and the sequences they point
into. For example, we can dereference an iterator and manipulate the resulting object and we can
find the number of elements in a sequence, given the iterators that describe it. To express such
operations, we must be able to refer to types related to an iterator such as ‘‘the type of the object
referred to by an iterator’’ and ‘‘the type of the distance between two iterators.’’ The related types
of an iterator are described by a small set of declarations in an iterator_traits template class:
    template<class Iter> struct iterator_traits {
        typedef typename Iter::iterator_category iterator_category;    // §19.2.3
        typedef typename Iter::value_type value_type;                  // type of element
        typedef typename Iter::difference_type difference_type;
        typedef typename Iter::pointer pointer;                        // return type of operator->()
        typedef typename Iter::reference reference;                    // return type of operator*()
    };
The difference_type is the type used to represent the difference between two iterators, and the
iterator_category is a type indicating what operations the iterator supports. For ordinary pointers,
specializations (§13.5) for <T*> and <const T*> are provided. In particular:
    template<class T> struct iterator_traits<T*> {    // specialization for pointers
        typedef random_access_iterator_tag iterator_category;
        typedef T value_type;
        typedef ptrdiff_t difference_type;
        typedef T* pointer;
        typedef T& reference;
    };
That is, the difference between two pointers is represented by the standard library type ptrdiff_t
from <cstddef> (§6.2.1) and a pointer provides random access (§19.2.3). Given iterator_traits,
we can write code that depends on properties of an iterator parameter. The count() algorithm is
the classical example:
    template<class In, class T>
    typename iterator_traits<In>::difference_type count(In first, In last, const T& val)
    {
        typename iterator_traits<In>::difference_type res = 0;
        while (first != last) if (*first++ == val) ++res;
        return res;
    }
Here, the type of the result is expressed in terms of the iterator_traits of the input. This technique
is necessary because there is no language primitive for expressing an arbitrary type in terms of
another.
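The same technique applies to the element type. As a sketch, assume a hypothetical function largest() (not a library algorithm) that returns a copy of the largest element of a non-empty sequence; its return type is named through iterator_traits, so one template serves both class iterators and ordinary pointers.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical helper: the largest element of the non-empty sequence [first,last).
// The result type is expressed in terms of the iterator_traits of the input.
template<class In>
typename std::iterator_traits<In>::value_type largest(In first, In last)
{
    assert(first != last);    // precondition: at least one element
    typename std::iterator_traits<In>::value_type best = *first;
    while (++first != last)
        if (best < *first) best = *first;
    return best;
}

void largest_examples()
{
    double a[] = { 1.5, 4.25, 2.0 };
    assert(largest(a, a + 3) == 4.25);              // In is double*; value_type is double

    std::list<int> lst;
    lst.push_back(3);
    lst.push_back(9);
    lst.push_back(5);
    assert(largest(lst.begin(), lst.end()) == 9);   // value_type is int
}
```

For the array call, the pointer specialization of iterator_traits supplies value_type; for the list call, the general template picks it up from the iterator class.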
Instead of using iterator_traits, we might have specialized count() for pointers:
    template<class In, class T>
    typename In::difference_type count(In first, In last, const T& val);

    template<class T> ptrdiff_t count(T* first, T* last, const T& val);    // overloaded version for pointers
However, this would have solved the problem for count() only. Had we used this technique for a
dozen algorithms, the information about distance types would have been replicated a dozen times.
In general, it is better to represent a design decision in one place (§23.4.2). In that way, the decision can – if necessary – be changed in one place.
Because iterator_traits<Iterator> is defined for every iterator, we implicitly define an
iterator_traits whenever we design a new iterator type. If the default traits generated from the
general iterator_traits template are not right for our new iterator type, we provide a specialization
in a way similar to what the standard library does for pointer types. The iterator_traits that are
implicitly generated assume that the iterator is a class with the member types difference_type,
value_type, etc. In <iterator>, the library provides a base type that can be used to define those
member types:
    template<class Cat, class T, class Dist = ptrdiff_t, class Ptr = T*, class Ref = T&>
    struct iterator {
        typedef Cat iterator_category;    // §19.2.3
        typedef T value_type;             // type of element
        typedef Dist difference_type;     // type of iterator difference
        typedef Ptr pointer;              // return type for ->
        typedef Ref reference;            // return type for *
    };
Note that reference and pointer are not iterators. They are intended to be the return types of operator*() and operator->(), respectively, for some iterator.
The iterator_traits are the key to the simplicity of many interfaces that rely on iterators and to
the efficient implementation of many algorithms.
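To illustrate how a new iterator type implicitly gets its iterator_traits, here is a minimal sketch of a user-defined iterator. The class Int_iter and the functions span_length() and sum_range() are hypothetical names invented for this example. The member types are written out directly rather than inherited from the iterator base, an assumption made because later revisions of the standard deprecate that base class; the effect on iterator_traits is the same.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>

// Hypothetical iterator over the integers i, i+1, i+2, ...; not a library type.
class Int_iter {
    int cur;
public:
    // The member types that the general iterator_traits template looks for:
    typedef std::input_iterator_tag iterator_category;
    typedef int                     value_type;
    typedef std::ptrdiff_t          difference_type;
    typedef const int*              pointer;
    typedef const int&              reference;

    explicit Int_iter(int i) : cur(i) {}

    reference operator*() const { return cur; }
    Int_iter& operator++() { ++cur; return *this; }                      // prefix ++
    Int_iter operator++(int) { Int_iter t = *this; ++cur; return t; }    // postfix ++

    friend bool operator==(const Int_iter& a, const Int_iter& b) { return a.cur == b.cur; }
    friend bool operator!=(const Int_iter& a, const Int_iter& b) { return a.cur != b.cur; }
};

// Standard functions find the related types through iterator_traits<Int_iter>.
std::iterator_traits<Int_iter>::difference_type span_length(int lo, int hi)
{
    assert(lo <= hi);
    return std::distance(Int_iter(lo), Int_iter(hi));
}

int sum_range(int lo, int hi)    // lo + (lo+1) + ... + (hi-1)
{
    assert(lo <= hi);
    return std::accumulate(Int_iter(lo), Int_iter(hi), 0);
}
```

No specialization of iterator_traits was written; the general template was sufficient because Int_iter provides the expected member types.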
19.2.3 Iterator Categories [iter.cat]
The different kinds of iterators – usually referred to as iterator categories – fit into a hierarchical
ordering:
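[Figure: iterator category hierarchy. Random access provides the operations of Bidirectional, which provides those of Forward, which provides those of both Input and Output.]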
This is not a class inheritance diagram. An iterator category is a classification of a type based on
the operations it provides. Many otherwise unrelated types can belong to the same iterator category. For example, both ordinary pointers (§19.2.2) and Checked_iters (§19.3) are random-access
iterators.
As noted in Chapter 18, different algorithms require different kinds of iterators as arguments.
Also, the same algorithm can sometimes be implemented with different efficiencies for different
kinds of iterators. To support overload resolution based on iterator categories, the standard library
provides five classes representing the five iterator categories:
    struct input_iterator_tag {};
    struct output_iterator_tag {};
    struct forward_iterator_tag : public input_iterator_tag {};
    struct bidirectional_iterator_tag : public forward_iterator_tag {};
    struct random_access_iterator_tag : public bidirectional_iterator_tag {};
Looking at the operations supported by input and forward iterators (§19.2.1), we would expect
forward_iterator_tag to be derived from output_iterator_tag as well as from input_iterator_tag.
The reasons that it is not are obscure and probably invalid. However, I have yet to see an example
in which that derivation would have simplified real code.
The inheritance of tags is useful (only) to save us from defining separate versions of a function
where several – but not all – kinds of iterators can use the same algorithms. Consider how to
implement distance:
    template<class In>
    typename iterator_traits<In>::difference_type distance(In first, In last);
There are two obvious alternatives:
[1] If In is a random-access iterator, we can subtract first from last.
[2] Otherwise, we must increment an iterator from first to last and count the distance.
We can express these two alternatives as a pair of helper functions:
    template<class In>
    typename iterator_traits<In>::difference_type
    dist_helper(In first, In last, input_iterator_tag)
    {
        typename iterator_traits<In>::difference_type d = 0;
        while (first++!=last) d++;    // use increment only
        return d;
    }

    template<class Ran>
    typename iterator_traits<Ran>::difference_type
    dist_helper(Ran first, Ran last, random_access_iterator_tag)
    {
        return last-first;    // rely on random access
    }
The iterator category tag arguments make it explicit what kind of iterator is expected. The iterator
tag is used exclusively for overload resolution; the tag takes no part in the actual computation. It is
a purely compile-time selection mechanism. In addition to automatic selection of a helper function,
this technique provides immediate type checking (§13.2.5).
It is now trivial to define distance() by calling the appropriate helper function:
    template<class In>
    typename iterator_traits<In>::difference_type distance(In first, In last)
    {
        return dist_helper(first,last,typename iterator_traits<In>::iterator_category());
    }
For a dist_helper() to be called, the iterator_traits<In>::iterator_category used must be an
input_iterator_tag or a random_access_iterator_tag. However, there is no need for separate versions of dist_helper() for forward or bidirectional iterators. Thanks to tag inheritance, those cases
are handled by the dist_helper() which takes an input_iterator_tag. The absence of a version for
output_iterator_tag reflects the fact that distance() is not meaningful for output iterators:
    void f(vector<int>& vi,
           list<double>& ld,
           istream_iterator<string>& is1, istream_iterator<string>& is2,
           ostream_iterator<char>& os1, ostream_iterator<char>& os2)
    {
        distance(vi.begin(),vi.end());    // use subtraction algorithm
        distance(ld.begin(),ld.end());    // use increment algorithm
        distance(is1,is2);                // use increment algorithm
        distance(os1,os2);                // error: wrong iterator category, dist_helper() argument type mismatch
    }
Calling distance() for an istream_iterator probably doesn’t make much sense in a real program,
though. The effect would be to read the input, throw it away, and return the number of values
thrown away.
Using iterator_traits<T>::iterator_category allows a programmer to provide alternative
implementations so that a user who cares nothing about the implementation of algorithms automatically gets the most appropriate implementation for each data structure used. In other words, it
allows us to hide an implementation detail behind a convenient interface. Inlining can be used to
ensure that this elegance is not bought at the cost of run-time efficiency.
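The tag-based selection described in this section can be tried out on a second function. The sketch below applies it to a hypothetical step_forward() that moves an iterator n positions forward; step_helper() and step_forward() are names invented for this example, and each helper returns a code saying which version ran, purely so that the selection can be observed.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Version for any iterator that is at least an input iterator: increment only.
template<class In>
int step_helper(In& p, int n, std::input_iterator_tag)
{
    assert(n >= 0);
    while (n-- > 0) ++p;    // linear time
    return 1;               // code 1: the increment version was chosen
}

// Version for random-access iterators: rely on +=.
template<class Ran>
int step_helper(Ran& p, int n, std::random_access_iterator_tag)
{
    assert(n >= 0);
    p += n;                 // constant time
    return 2;               // code 2: the random-access version was chosen
}

// The interface: the tag object takes part in overload resolution only.
template<class Iter>
int step_forward(Iter& p, int n)
{
    return step_helper(p, n, typename std::iterator_traits<Iter>::iterator_category());
}

void step_examples()
{
    std::vector<int> v;
    for (int i = 0; i < 10; ++i) v.push_back(i);
    std::vector<int>::iterator vp = v.begin();
    assert(step_forward(vp, 4) == 2 && *vp == 4);    // random access

    std::list<int> lst(v.begin(), v.end());
    std::list<int>::iterator lp = lst.begin();
    assert(step_forward(lp, 4) == 1 && *lp == 4);    // bidirectional: handled via tag inheritance
}
```

The list call has no helper of its own; its bidirectional tag converts to input_iterator_tag through the inheritance of tags, so the increment version is selected.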
19.2.4 Inserters [iter.insert]
Producing output through an iterator into a container implies that elements following the one
pointed to by the iterator can be overwritten. This implies the possibility of overflow and consequent memory corruption. For example:
    void f(vector<int>& vi)
    {
        fill_n(vi.begin(),200,7);    // assign 7 to vi[0]..[199]
    }
If vi has fewer than 200 elements, we are in trouble.
In <iterator>, the standard library provides three iterator template classes to deal with this
problem, plus three functions to make it convenient to use those iterators:
    template <class Cont> back_insert_iterator<Cont> back_inserter(Cont& c);
    template <class Cont> front_insert_iterator<Cont> front_inserter(Cont& c);
    template <class Cont, class Out> insert_iterator<Cont> inserter(Cont& c, Out p);
The back_inserter() causes elements to be added to the end of the container, front_inserter()
causes elements to be added to the front, and ‘‘plain’’ inserter() causes elements to be added
before its iterator argument. For inserter(c,p), p must be a valid iterator for c. Naturally, a container grows each time a value is written to it through an insert iterator.
When written to, an inserter inserts a new element into a sequence using push_back(),
push_front(), or insert() (§16.3.6) rather than overwriting an existing element. For example:
    void g(vector<int>& vi)
    {
        fill_n(back_inserter(vi),200,7);    // add 200 7s to the end of vi
    }
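The three inserters differ in where the new elements end up. The following sketch compares them; the function names append(), prepend_reversed(), and insert_before() are hypothetical, chosen for this illustration, while back_inserter(), front_inserter(), and inserter() are the standard functions.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Add a copy of src to the end of dst; dst grows as needed.
void append(const std::vector<int>& src, std::vector<int>& dst)
{
    std::copy(src.begin(), src.end(), std::back_inserter(dst));
}

// Add a copy of src to the front of dst. Each element is pushed in front of
// the previous one, so the copied elements appear in reverse order.
void prepend_reversed(const std::vector<int>& src, std::list<int>& dst)
{
    std::copy(src.begin(), src.end(), std::front_inserter(dst));
}

// Add a copy of src before position p of dst; the order of src is preserved.
void insert_before(const std::vector<int>& src, std::list<int>& dst, std::list<int>::iterator p)
{
    std::copy(src.begin(), src.end(), std::inserter(dst, p));
}

void inserter_examples()
{
    std::vector<int> src;
    src.push_back(1);
    src.push_back(2);
    src.push_back(3);

    std::vector<int> v(1, 0);
    append(src, v);                                   // v: 0 1 2 3
    assert(v.size() == 4 && v.front() == 0 && v.back() == 3);

    std::list<int> lst(1, 9);
    prepend_reversed(src, lst);                       // lst: 3 2 1 9
    assert(lst.size() == 4 && lst.front() == 3 && lst.back() == 9);
}
```

A vector has no push_front(), which is why the front_inserter() example uses a list.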
Inserters are as simple and efficient as they are useful. For example:
    template <class Cont>
    class insert_iterator : public iterator<output_iterator_tag,void,void,void,void> {
    protected:
        Cont& container;                 // container to insert into
        typename Cont::iterator iter;    // points into the container
    public:
        explicit insert_iterator(Cont& x, typename Cont::iterator i)
            : container(x), iter(i) {}

        insert_iterator& operator=(const typename Cont::value_type& val)
        {
            iter = container.insert(iter,val);
            ++iter;
            return *this;
        }

        insert_iterator& operator*() { return *this; }
        insert_iterator& operator++() { return *this; }      // prefix ++
        insert_iterator operator++(int) { return *this; }    // postfix ++
    };
Clearly, inserters are output iterators.
An insert_iterator is a special case of an output sequence. In parallel to the iseq from §18.3.1,
we might define:
    template<class Cont>
    insert_iterator<Cont>
    oseq(Cont& c, typename Cont::iterator first, typename Cont::iterator last)
    {
        return insert_iterator<Cont>(c,c.erase(first,last));    // erase is explained in §16.3.6
    }
In other words, an output sequence removes its old elements and replaces them with the output.
For example:
    void f(list<int>& li,vector<int>& vi)    // replace second half of vi by a copy of li
    {
        copy(li.begin(),li.end(),oseq(vi,vi.begin()+vi.size()/2,vi.end()));
    }
The container needs to be an argument to an oseq because it is not possible to decrease the size of a
container, given only iterators into it (§18.6, §18.6.3).
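The oseq idea can be checked with concrete values. This is a sketch only: oseq() is the helper sketched in the text, not a standard library function, and replace_second_half() is a hypothetical name for the example function.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Erase [first,last) from c and return an inserter positioned where the
// erased elements used to be.
template<class Cont>
std::insert_iterator<Cont>
oseq(Cont& c, typename Cont::iterator first, typename Cont::iterator last)
{
    return std::insert_iterator<Cont>(c, c.erase(first, last));
}

// Replace the second half of vi by a copy of li.
void replace_second_half(const std::list<int>& li, std::vector<int>& vi)
{
    std::copy(li.begin(), li.end(), oseq(vi, vi.begin() + vi.size()/2, vi.end()));
}

void oseq_example()
{
    std::vector<int> vi;
    for (int i = 1; i <= 4; ++i) vi.push_back(i);    // vi: 1 2 3 4

    std::list<int> li;
    li.push_back(7);
    li.push_back(8);
    li.push_back(9);

    replace_second_half(li, vi);                     // vi: 1 2 7 8 9
    assert(vi.size() == 5 && vi[1] == 2 && vi[2] == 7 && vi[4] == 9);
}
```

The vector ends up longer than it started, which an overwriting output iterator could not have achieved safely.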
19.2.5 Reverse Iterators [iter.reverse]
The standard containers provide rbegin() and rend() for iterating through elements in reverse
order (§16.3.2). These member functions return reverse_iterators:
    template <class Iter>
    class reverse_iterator : public iterator<typename iterator_traits<Iter>::iterator_category,
                                             typename iterator_traits<Iter>::value_type,
                                             typename iterator_traits<Iter>::difference_type,
                                             typename iterator_traits<Iter>::pointer,
                                             typename iterator_traits<Iter>::reference> {
    protected:
        Iter current;    // current points to the element after the one *this refers to.
    public:
        typedef Iter iterator_type;

        reverse_iterator() : current() { }
        explicit reverse_iterator(Iter x) : current(x) { }
        template<class U> reverse_iterator(const reverse_iterator<U>& x) : current(x.base()) { }

        Iter base() const { return current; }    // current iterator value

        reference operator*() const { Iter tmp = current; return *--tmp; }
        pointer operator->() const;
        reference operator[](difference_type n) const;

        reverse_iterator& operator++() { --current; return *this; }    // note: not ++
        reverse_iterator operator++(int) { reverse_iterator t = *this; --current; return t; }
        reverse_iterator& operator--() { ++current; return *this; }    // note: not --
        reverse_iterator operator--(int) { reverse_iterator t = *this; ++current; return t; }

        reverse_iterator operator+(difference_type n) const;
        reverse_iterator& operator+=(difference_type n);
        reverse_iterator operator-(difference_type n) const;
        reverse_iterator& operator-=(difference_type n);
    };
A reverse_iterator is implemented using an iterator called current. That iterator can (only) point
to the elements of its sequence plus its one-past-the-end element. However, the reverse_iterator’s
one-past-the-end element is the original sequence’s (inaccessible) one-before-the-beginning element. Thus, to avoid access violations, current points to the element after the one the
reverse_iterator refers to. This implies that * returns the value *(current-1) and that ++ is
implemented using -- on current.
A reverse_iterator supports the operations that its initializer supports (only). For example:
    void f(vector<int>& v, list<char>& lst)
    {
        reverse_iterator(v.end())[3] = 7;              // ok: random-access iterator
        reverse_iterator(lst.end())[3] = '4';          // error: bidirectional iterator doesn’t support []
        *(++++++reverse_iterator(lst.end())) = '4';    // ok!
    }
In addition, the library provides ==, !=, <, <=, >, >=, + and - for reverse_iterators.
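The off-by-one relationship between a reverse_iterator and its underlying iterator matters whenever base() is used, for example when erasing. The sketch below uses hypothetical helper names (last_element(), reversed(), erase_at()); rbegin(), rend(), and base() are the standard operations.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Last element of a non-empty vector, reached through a reverse iterator.
int last_element(const std::vector<int>& v)
{
    assert(!v.empty());
    std::vector<int>::const_reverse_iterator r = v.rbegin();
    assert(r.base() == v.end());    // the underlying iterator is one past the element r refers to
    return *r;                      // same element as *(v.end()-1)
}

// A copy of v with the elements in reverse order.
std::vector<int> reversed(const std::vector<int>& v)
{
    return std::vector<int>(v.rbegin(), v.rend());
}

// Erase the element that the reverse iterator r refers to. erase() takes an
// ordinary iterator, and the element r refers to is the one just before r.base().
void erase_at(std::vector<int>& v, std::vector<int>::reverse_iterator r)
{
    std::vector<int>::iterator p = r.base();
    --p;
    v.erase(p);
}
```

Passing r.base() itself to erase() would remove the element after the intended one, or be an error when r is rbegin().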
19.2.6 Stream Iterators [iter.stream]
Ordinarily, I/O is done using the streams library (Chapter 21), a graphical user-interface system
(not covered by the C++ standard), or the C I/O functions (§21.8). These I/O interfaces are primarily aimed at reading and writing individual values of a variety of types. The standard library provides four iterator types to fit stream I/O into the general framework of containers and algorithms:
– ostream_iterator: for writing to an ostream (§3.4, §21.2.1).
– istream_iterator: for reading from an istream (§3.6, §21.3.1).
– ostreambuf_iterator: for writing to a stream buffer (§21.6.1).
– istreambuf_iterator: for reading from a stream buffer (§21.6.2).
The idea is simply to present input and output of collections as sequences:
    template <class T, class Ch = char, class Tr = char_traits<Ch> >
    class ostream_iterator : public iterator<output_iterator_tag,void,void,void,void> {
    public:
        typedef Ch char_type;
        typedef Tr traits_type;
        typedef basic_ostream<Ch,Tr> ostream_type;

        ostream_iterator(ostream_type& s);
        ostream_iterator(ostream_type& s, const Ch* delim);    // write delim after each output value
        ostream_iterator(const ostream_iterator&);
        ~ostream_iterator();

        ostream_iterator& operator=(const T& val);    // write val to output

        ostream_iterator& operator*();
        ostream_iterator& operator++();
        ostream_iterator& operator++(int);
    };
This iterator accepts the usual write and increment operations of an output iterator and converts
them into output operations on an ostream. For example:
    void f()
    {
        ostream_iterator<int> os(cout);    // write ints to cout through os
        *os = 7;                           // output 7
        ++os;                              // get ready for next output
        *os = 79;                          // output 79
    }
The ++ operation might trigger an actual output operation, or it might have no effect. Different
implementations will use different implementation strategies. Consequently, for code to be portable a ++ must occur between every two assignments to an ostream_iterator. Naturally, every
standard algorithm is written that way – or it would not work for a vector. This is why
ostream_iterator is defined this way.
An implementation of ostream_iterator is trivial and is left as an exercise (§19.6[4]). The standard I/O supports different character types; char_traits (§20.2) describes the aspects of a character
type that can be important for I/O and strings.
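In practice an ostream_iterator is most often handed to an algorithm rather than written to by hand. A minimal sketch, assuming the hypothetical helper names print_all() and to_text(), and using an ostringstream in place of cout so that the result can be inspected:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Write the elements of v to os, each followed by the delimiter delim.
void print_all(const std::vector<int>& v, std::ostream& os, const char* delim)
{
    std::copy(v.begin(), v.end(), std::ostream_iterator<int>(os, delim));
}

// The same output collected into a string.
std::string to_text(const std::vector<int>& v)
{
    std::ostringstream out;
    print_all(v, out, ", ");
    return out.str();
}

void ostream_iterator_example()
{
    std::vector<int> v;
    v.push_back(7);
    v.push_back(79);
    assert(to_text(v) == "7, 79, ");    // the delimiter follows every value, including the last
}
```

copy() performs the assignment and the ++ for each element, so the portability rule about incrementing between assignments is observed automatically.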
An input iterator for istreams is defined analogously:
    template <class T, class Ch = char, class Tr = char_traits<Ch>, class Dist = ptrdiff_t>
    class istream_iterator : public iterator<input_iterator_tag, T, Dist, const T*, const T&> {
    public:
        typedef Ch char_type;
        typedef Tr traits_type;
        typedef basic_istream<Ch,Tr> istream_type;

        istream_iterator();                   // end of input
        istream_iterator(istream_type& s);
        istream_iterator(const istream_iterator&);
        ~istream_iterator();

        const T& operator*() const;
        const T* operator->() const;
        istream_iterator& operator++();
        istream_iterator operator++(int);
    };
This iterator is specified so that what would be conventional use for a container triggers input from
an istream. For example:
    void f()
    {
        istream_iterator<int> is(cin);    // read ints from cin through is
        int i1 = *is;                     // read an int
        ++is;                             // get ready for next input
        int i2 = *is;                     // read an int
    }
The default istream_iterator represents the end of input so that we can specify an input sequence:
    void f(vector<int>& v)
    {
        copy(istream_iterator<int>(cin),istream_iterator<int>(),back_inserter(v));
    }
To make this work, the standard library supplies == and != for istream_iterators.
An implementation of istream_iterator is less trivial than an ostream_iterator implementation, but it is still simple. Implementing an istream_iterator is also left as an exercise (§19.6[5]).
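The input-sequence idiom can be exercised without a terminal by reading from an istringstream. In this sketch, read_ints() and parse_ints() are hypothetical helper names; the iterator pair is the standard one described above.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Read whitespace-separated ints from is until input ends or a read fails.
std::vector<int> read_ints(std::istream& is)
{
    std::vector<int> v;
    std::copy(std::istream_iterator<int>(is),    // start of the input sequence
              std::istream_iterator<int>(),      // default iterator: end of input
              std::back_inserter(v));
    return v;
}

// Convenience wrapper so that the input can come from a string.
std::vector<int> parse_ints(const std::string& text)
{
    std::istringstream in(text);
    return read_ints(in);
}

void istream_iterator_example()
{
    std::vector<int> v = parse_ints("3 1 4 1 5");
    assert(v.size() == 5 && v[0] == 3 && v[2] == 4 && v[4] == 5);
}
```

A read that fails, for example on a non-numeric token, makes the iterator compare equal to the end-of-input iterator, so the sequence simply ends there.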
19.2.6.1 Stream Buffers [iter.streambuf]
As described in §21.6, stream I/O is based on the idea of ostreams and istreams filling and emptying buffers from and to which the low-level physical I/O is done. It is possible to bypass the standard iostreams formatting and operate directly on the stream buffers (§21.6.4). That ability is also
provided to algorithms through the notion of istreambuf_iterators and ostreambuf_iterators:
template<class Ch, class Tr = char_traits<Ch> >
class istreambuf_iterator
    : public iterator<input_iterator_tag,Ch,typename Tr::off_type,Ch*,Ch&> {
public:
    typedef Ch char_type;
    typedef Tr traits_type;
    typedef typename Tr::int_type int_type;
    typedef basic_streambuf<Ch,Tr> streambuf_type;
    typedef basic_istream<Ch,Tr> istream_type;
    class proxy;    // helper type
    istreambuf_iterator() throw();                    // end of buffer
    istreambuf_iterator(istream_type& is) throw();    // read from is's streambuf
    istreambuf_iterator(streambuf_type*) throw();
    istreambuf_iterator(const proxy& p) throw();      // read from p's streambuf
    Ch operator*() const;
    istreambuf_iterator& operator++();    // prefix
    proxy operator++(int);                // postfix
    bool equal(istreambuf_iterator&);     // both or neither streambuf at eof
};
In addition, == and != are supplied.
Reading from a streambuf is a lower-level operation than reading from an istream. Consequently, the istreambuf_iterator interface is messier than the istream_iterator interface. However, once the istreambuf_iterator is properly initialized, *, ++, and = have their usual meanings when used in the usual way.
The proxy type is an implementation-defined helper type that allows the postfix ++ to be implemented without imposing constraints on the streambuf implementation. A proxy holds the result value while the iterator is incremented:
template<class Ch, class Tr = char_traits<Ch> >
class istreambuf_iterator<Ch,Tr>::proxy {
    Ch val;
    basic_streambuf<Ch,Tr>* buf;
    proxy(Ch v, basic_streambuf<Ch,Tr>* b) :val(v), buf(b) { }
public:
    Ch operator*() { return val; }
};
An ostreambuf_iterator is defined similarly:
template <class Ch, class Tr = char_traits<Ch> >
class ostreambuf_iterator : public iterator<output_iterator_tag,void,void,void,void>{
public:
    typedef Ch char_type;
    typedef Tr traits_type;
    typedef basic_streambuf<Ch,Tr> streambuf_type;
    typedef basic_ostream<Ch,Tr> ostream_type;
    ostreambuf_iterator(ostream_type& os) throw();    // write to os's streambuf
    ostreambuf_iterator(streambuf_type*) throw();
    ostreambuf_iterator& operator=(Ch);
    ostreambuf_iterator& operator*();
    ostreambuf_iterator& operator++();
    ostreambuf_iterator& operator++(int);
    bool failed() const throw();    // true if Tr::eof() seen
};
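As an illustration of the stream buffer iterators, the following sketch copies characters from one stream's buffer to another's without any formatting, so whitespace is preserved. The function name copy_raw is an assumption made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: copy every character, including whitespace,
// from in's stream buffer to out's stream buffer.
void copy_raw(std::istream& in, std::ostream& out)
{
    std::copy(std::istreambuf_iterator<char>(in),    // reads from in's streambuf
              std::istreambuf_iterator<char>(),      // end-of-buffer iterator
              std::ostreambuf_iterator<char>(out));  // writes to out's streambuf
}
```

An istream_iterator<char> would skip whitespace by default, which is one reason to prefer the streambuf iterators for character-level copying.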
19.3 Checked Iterators [iter.checked]
A programmer can provide iterators in addition to those provided by the standard library. This is often necessary when providing a new kind of container, and sometimes a new kind of iterator is a good way to support a different way of using existing containers. As an example, I here describe an iterator that range checks access to its container.
Using standard containers reduces the amount of explicit memory management. Using standard algorithms reduces the amount of explicit addressing of elements in containers. Using the standard library together with language facilities that maintain type safety dramatically reduces run-time errors compared to traditional C coding styles. However, the standard library still relies on the programmer to avoid access beyond the limits of a container. If by accident element x[x.size()+7] of some container x is accessed, then unpredictable – and usually bad – things happen. Using a range-checked vector, such as Vec (§3.7.1), helps in some cases. More cases can be handled by checking every access through an iterator.
To achieve this degree of checking without placing a serious notational burden on the programmer, we need checked iterators and a convenient way of attaching them to containers. To make a Checked_iter, we need a container and an iterator into that container. As for binders (§18.4.4.1), inserters (§19.2.4), etc., I provide functions for making a Checked_iter:
template<class Cont, class Iter> Checked_iter<Cont,Iter> make_checked(Cont& c, Iter i)
{
    return Checked_iter<Cont,Iter>(c,i);
}
template<class Cont> Checked_iter<Cont,typename Cont::iterator> make_checked(Cont& c)
{
    return Checked_iter<Cont,typename Cont::iterator>(c,c.begin());
}
These functions offer the notational convenience of deducing the types from arguments rather than stating those types explicitly. For example:
void f(vector<int>& v, const vector<int>& vc)
{
    typedef Checked_iter<vector<int>,vector<int>::iterator> CI;
    CI p1 = make_checked(v,v.begin()+3);
    CI p2 = make_checked(v);    // by default: point to first element
    typedef Checked_iter<const vector<int>,vector<int>::const_iterator> CIC;
    CIC p3 = make_checked(vc,vc.begin()+3);
    CIC p4 = make_checked(vc);
    const vector<int>& vv = v;
    CIC p5 = make_checked(v,vv.begin());
}
By default, const containers have const iterators, so their Checked_iters must also be constant iterators. The iterator p5 shows one way of getting a const iterator for a non-const iterator. This demonstrates why Checked_iter needs two template parameters: one for the container type and one to express the const/non-const distinction.
The names of these Checked_iter types become fairly long and unwieldy, but that doesn’t matter when iterators are used as arguments to a generic algorithm. For example:
template<class Iter> void mysort(Iter first, Iter last);
void f(vector<int>& c)
{
    try {
        mysort(make_checked(c), make_checked(c,c.end()));
    }
    catch (out_of_bounds) {
        cerr<<"oops: bug in mysort()\n";
        abort();
    }
}
An early version of such an algorithm is exactly where I would most suspect a range error, so using checked iterators there makes sense.
The representation of a Checked_iter is a pointer to a container plus an iterator pointing into that container:
template<class Cont, class Iter = typename Cont::iterator>
class Checked_iter : public iterator_traits<Iter> {
    Iter curr;    // iterator for current position
    Cont* c;      // pointer to current container
    // ...
};
Deriving from iterator_traits is one technique for defining the desired typedefs. The obvious alternative – deriving from iterator – would be verbose in this case (as it was for reverse_iterator; §19.2.5). Just as there is no requirement that an iterator should be a class, there is no requirement that iterators that are classes should be derived from iterator.
The Checked_iter operations are all fairly trivial:
template<class Cont, class Iter = typename Cont::iterator>
class Checked_iter : public iterator_traits<Iter> {
    // ...
public:
    // name the types inherited from iterator_traits<Iter> explicitly
    typedef typename iterator_traits<Iter>::reference reference;
    typedef typename iterator_traits<Iter>::pointer pointer;
    typedef typename iterator_traits<Iter>::difference_type difference_type;
    void valid(Iter p)
    {
        if (c->end() == p) return;
        for (Iter pp = c->begin(); pp!=c->end(); ++pp) if (pp == p) return;
        throw out_of_bounds();
    }
    friend bool operator==(const Checked_iter& i, const Checked_iter& j)
    {
        return i.c==j.c && i.curr==j.curr;
    }
    // no default initializer.
    // use default copy constructor and copy assignment.
    Checked_iter(Cont& x, Iter p) : curr(p), c(&x) { valid(p); }
    reference operator*()
    {
        if (curr==c->end()) throw out_of_bounds();
        return *curr;
    }
    pointer operator->()
    {
        return &**this;    // checked by *
    }
    Checked_iter operator+(difference_type d)    // for random-access iterators only
    {
        if (c->end()-curr<d) throw out_of_bounds();    // reaching end() is allowed
        return Checked_iter(*c,curr+d);
    }
    reference operator[](difference_type d)    // for random-access iterators only
    {
        if (c->end()-curr<=d) throw out_of_bounds();
        return curr[d];
    }
    Checked_iter& operator++()    // prefix ++
    {
        if (curr == c->end()) throw out_of_bounds();
        ++curr;
        return *this;
    }
    Checked_iter operator++(int)    // postfix ++
    {
        Checked_iter tmp = *this;
        ++*this;    // checked by prefix ++
        return tmp;
    }
    Checked_iter& operator--()    // prefix --
    {
        if (curr == c->begin()) throw out_of_bounds();
        --curr;
        return *this;
    }
    Checked_iter operator--(int)    // postfix --
    {
        Checked_iter tmp = *this;
        --*this;    // checked by prefix --
        return tmp;
    }
    difference_type index() { return curr-c->begin(); }    // random-access only
    Iter unchecked() { return curr; }
    // +, -, < , etc. (§19.6[6])
};
A Checked_iter can be initialized only for a particular iterator pointing into a particular container. In a full-blown implementation, a more efficient version of valid() should be provided for random-access iterators (§19.6[6]). Once a Checked_iter is initialized, every operation that changes its position is checked to make sure the iterator still points into the container. An attempt to make the iterator point outside the container causes an out_of_bounds exception to be thrown. For example:
void f(list<string>& ls)
{
    int count = 0;
    try {
        Checked_iter< list<string> > p(ls,ls.begin());
        while (true) {
            ++p;    // sooner or later this will reach the end
            ++count;
        }
    }
    catch(out_of_bounds) {
        cout << "overrun after " << count << " tries\n";
    }
}
A Checked_iter knows which container it is pointing into. This allows it to catch some, but not all, cases in which iterators into a container have been invalidated by an operation on it (§16.3.8). To
protect against all such cases, a different and more expensive iterator design would be needed (see §19.6[7]).
Note that postincrement (postfix ++) involves a temporary and preincrement does not. For this reason, it is best to prefer ++p over p++ for iterators.
Because a Checked_iter keeps a pointer to a container, it cannot be used for a built-in array directly. When necessary, a c_array (§17.5.4) can be used.
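The checking idea can be exercised in isolation with a much reduced iterator. The sketch below is not the Checked_iter of this section: the names Mini_checked and count_steps are assumptions made for illustration, only *, prefix ++, and prefix -- are provided, and std::out_of_range stands in for out_of_bounds.

```cpp
#include <cassert>
#include <list>
#include <stdexcept>
#include <string>

// Hypothetical reduced checked iterator over a container of type Cont.
template<class Cont>
class Mini_checked {
    typename Cont::iterator curr;   // current position
    Cont* c;                        // container being traversed
public:
    explicit Mini_checked(Cont& x) : curr(x.begin()), c(&x) { }

    typename Cont::reference operator*() const
    {
        if (curr == c->end()) throw std::out_of_range("dereference of end");
        return *curr;
    }
    Mini_checked& operator++()      // prefix ++
    {
        if (curr == c->end()) throw std::out_of_range("increment past end");
        ++curr;
        return *this;
    }
    Mini_checked& operator--()      // prefix --
    {
        if (curr == c->begin()) throw std::out_of_range("decrement before begin");
        --curr;
        return *this;
    }
};

// Count how many increments succeed before the range check fires.
template<class Cont>
int count_steps(Cont& x)
{
    int count = 0;
    try {
        Mini_checked<Cont> p(x);
        while (true) { ++p; ++count; }
    }
    catch (const std::out_of_range&) { }
    return count;
}
```

For a container of n elements, n increments succeed (the last one reaches end()) and the next one throws.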
To complete the notion of checked iterators, we must make them trivial to use. There are two basic approaches:
[1] Define a checked container type that behaves like other containers, except that it provides only a limited set of constructors and its begin(), end(), etc., supply Checked_iters rather than ordinary iterators.
[2] Define a handle that can be initialized by an arbitrary container and that provides checked access functions to its container (§19.6[8]).
The following template attaches checked iterators to a container:
template<class C> class Checked : public C {
public:
    explicit Checked(size_t n) :C(n) { }
    Checked() :C() { }
    typedef Checked_iter<C> iterator;
    typedef Checked_iter<C,typename C::const_iterator> const_iterator;
    iterator begin() { return iterator(*this,C::begin()); }
    iterator end() { return iterator(*this,C::end()); }
    const_iterator begin() const { return const_iterator(*this,C::begin()); }
    const_iterator end() const { return const_iterator(*this,C::end()); }
    typename C::reference operator[](size_t n) { return Checked_iter<C>(*this,C::begin())[n]; }
    C& base() { return static_cast<C&>(*this); }    // get hold of the base container
};
This allows us to write:
Checked< vector<int> > vec(10);
Checked< list<double> > lst;
void f()
{
    int v1 = vec[5];     // ok
    int v2 = vec[15];    // throws out_of_bounds
    // ...
    lst.push_back(v2);
    mysort(vec.begin(),vec.end());
    copy(vec.begin(),vec.end(),lst.begin());
}
If a container is resized, iterators – including Checked_iters – into it may become invalid. In that case, the Checked_iter can be re-initialized:
void g(vector<int>& vi)
{
    Checked_iter< vector<int> > p(vi,vi.begin());
    // ..
    int i = p.index();    // get current position
    vi.resize(100);       // p becomes invalid
    p = Checked_iter< vector<int> >(vi,vi.begin()+i);    // restore current position
}
The old – and invalid – current position is lost. I provided index() as a means of storing and restoring a Checked_iter. If necessary, a reference to the container used as the base of the Checked container can be extracted using base().
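The store-and-restore technique does not depend on Checked_iter; it can be sketched with plain vector iterators. The function name element_after_resize is an assumption made for this example, and the caller is assumed to pass a position that is valid both before and after the resize.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: save a position as an index, resize the vector,
// then rebuild an iterator from the saved index.
int element_after_resize(std::vector<int>& vi, std::size_t pos)
{
    std::vector<int>::iterator p = vi.begin() + pos;
    std::ptrdiff_t i = p - vi.begin();    // get current position as an index
    vi.resize(100);                       // p may now be invalid; do not use it
    p = vi.begin() + i;                   // restore current position
    return *p;
}
```

An index survives reallocation because it is relative to begin(), whereas an iterator typically refers to the old storage.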
19.3.1 Exceptions, Containers, and Algorithms [iter.except]
You could argue that using both standard algorithms and checked iterators is like wearing both belt
and suspenders: either should keep you safe. However, experience shows that for many people and
for many applications a dose of paranoia is reasonable – especially during times when a program
goes through frequent changes that involve several people.
One way of using run-time checks is to keep them in the code only while debugging. The
checks are then removed before the program is shipped. This practice has been compared to wearing a life jacket while paddling around close to the shore and then removing it before setting out
onto the open sea. However, some uses of run-time checks do impose significant time and space
overheads, so insisting on such checks at all times is not realistic. In any case, it is unwise to optimize without measurements, so before removing checks, do an experiment to see if worthwhile
improvements actually emerge from doing so. To do such an experiment, we must be able to
remove run-time checks easily (see §24.3.7.1). Once measurements have been done, we could
remove the run-time testing from the most run-time critical – and hopefully most thoroughly tested
– code and leave the rest of the code checked as a relatively cheap form of insurance.
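One possible way to make run-time checks easy to remove for such an experiment is to select them with a compile-time parameter. This is only a sketch of the idea under that assumption; the names element and detects_overrun are hypothetical, and std::out_of_range stands in for a range-error exception.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical accessor: Check selects, at compile time, whether the
// range test is performed at all.
template<bool Check>
int element(const std::vector<int>& v, std::size_t i)
{
    if (Check && i >= v.size()) throw std::out_of_range("element: bad index");
    return v[i];    // unchecked when Check is false
}

// Returns true if the checked version reports an index one past the end.
bool detects_overrun(const std::vector<int>& v)
{
    try {
        element<true>(v, v.size());
    }
    catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

With this arrangement, the checked and unchecked versions of a piece of code can be timed against each other by changing a single template argument.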
Using a Checked_iter allows us to detect many mistakes. It does not, however, make it easy to recover from these errors. People rarely write code that is 100% robust against every ++, --, *, [], ->, and = potentially throwing an exception. This leaves us with two obvious strategies:
[1] Catch exceptions close to the point from which they are thrown so that the writer of the
exception handler has a decent chance of knowing what went wrong and can take appropriate action.
[2] Catch the exception at a high level of a program, abandon a significant portion of a computation, and consider all data structures written to during the failed computation suspect
(maybe there are no such data structures or maybe they can be sanity checked).
It is irresponsible to catch an exception from some unknown part of a program and proceed under
the assumption that no data structure is left in an undesirable state, unless there is a further level of
error handling that will catch subsequent errors. A simple example of this is when a final check (by
computer or human) is done before the results are accepted. In such cases, it can be simpler and
cheaper to proceed blithely rather than to try to catch every error at a low level. This would be an
example of a simplification made possible by a multilevel error recovery scheme (§14.9).
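Strategy [2] can be sketched as follows, assuming that the computation writes only to data that the caller can discard. The names fill_squares and run_job are hypothetical, and the failure is simulated.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical computation that may fail part-way through writing to out.
void fill_squares(std::vector<int>& out, int n)
{
    for (int i = 0; i < n; ++i) {
        if (i == 3) throw std::out_of_range("simulated range error");
        out.push_back(i * i);
    }
}

// Catch at a high level and discard everything the failed computation
// wrote, rather than trusting a partially updated data structure.
bool run_job(std::vector<int>& result, int n)
{
    std::vector<int> scratch;    // all writes go to a private vector
    try {
        fill_squares(scratch, n);
    }
    catch (const std::out_of_range&) {
        return false;            // scratch is suspect: abandon it
    }
    result.swap(scratch);        // commit only after success
    return true;
}
```

The caller's data is touched only after the whole computation has succeeded, so a failure leaves it unchanged.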
19.4 Allocators [iter.alloc]
An allocator is used to insulate implementers of algorithms and containers that must allocate memory from the details of physical memory. An allocator provides standard ways of allocating and
deallocating memory and standard names of types used as pointers and references. Like an iterator,
an allocator is a pure abstraction. Any type that behaves like an allocator is an allocator.
The standard library provides a standard allocator intended to serve most users of a given implementation well. In addition, users can provide allocators that represent alternative views of memory. For example, we can write allocators that use shared memory, garbage-collected memory,
memory from preallocated pools of objects (§19.4.2), etc.
The standard containers and algorithms obtain and access memory through the facilities provided by an allocator. Thus, by providing a new allocator we provide the standard containers with
a way of using a new and different kind of memory.
19.4.1 The Standard Allocator [iter.alloc.std]
The standard allocator template from <memory> allocates memory using operator new() (§6.2.6) and is by default used by all standard containers:
template <class T> class allocator {
public:
    typedef T value_type;
    typedef size_t size_type;
    typedef ptrdiff_t difference_type;
    typedef T* pointer;
    typedef const T* const_pointer;
    typedef T& reference;
    typedef const T& const_reference;
    pointer address(reference r) const { return &r; }
    const_pointer address(const_reference r) const { return &r; }
    allocator() throw();
    template <class U> allocator(const allocator<U>&) throw();
    ~allocator() throw();
    pointer allocate(size_type n, allocator<void>::const_pointer hint = 0);    // space for n Ts
    void deallocate(pointer p, size_type n);    // deallocate n Ts, don't destroy
    void construct(pointer p, const T& val) { new(p) T(val); }    // initialize *p by val
    void destroy(pointer p) { p->~T(); }    // destroy *p but don't deallocate
    size_type max_size() const throw();
    template <class U>
    struct rebind { typedef allocator<U> other; };    // in effect: typedef allocator<U> other
};
template<class T> bool operator==(const allocator<T>&, const allocator<T>&) throw();
template<class T> bool operator!=(const allocator<T>&, const allocator<T>&) throw();
An allocate(n) operation allocates space for n objects that can be deallocated by a corresponding call of deallocate(p,n). Note that deallocate() also takes a number-of-elements argument n. This allows for close-to-optimal allocators that maintain only minimal information about allocated memory. On the other hand, such allocators require that the user always provide the right n when they deallocate().
The default allocator uses operator new(size_t) to obtain memory and operator delete(void*) to free it. This implies that the new_handler() might be called and bad_alloc might be thrown in case of memory exhaustion (§6.2.6.2).
Note that allocate() is not obliged to call a lower-level allocator each time. Often, a better strategy is for the allocator to maintain a free list of space ready to hand out with minimal time overhead (§19.4.2).
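The division of labor between allocate(), construct(), destroy(), and deallocate() can be sketched as below. The function name total_length is an assumption made for this example. Construction and destruction go through std::allocator_traits, a later (C++11) interface that is not described in this section; it is used here because recent standards removed construct() and destroy() from std::allocator itself. Exception safety is ignored to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper: allocate raw space for n strings, construct them,
// use them, then undo both steps in reverse.
std::size_t total_length(std::size_t n, const std::string& val)
{
    typedef std::allocator<std::string> A;
    typedef std::allocator_traits<A> Tr;    // C++11 access to construct()/destroy()
    A alloc;
    std::string* p = alloc.allocate(n);     // space for n strings, not yet initialized
    for (std::size_t i = 0; i < n; ++i) Tr::construct(alloc, p + i, val);
    std::size_t len = 0;
    for (std::size_t i = 0; i < n; ++i) len += p[i].size();
    for (std::size_t i = 0; i < n; ++i) Tr::destroy(alloc, p + i);    // destroy, don't deallocate
    alloc.deallocate(p, n);                 // must pass the same n as allocate()
    return len;
}
```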
The optional hint argument to allocate() is completely implementation-dependent. However, it is intended as a help to allocators for systems where locality is important. For example, an allocator might try to allocate space for related objects on the same page in a paging system. The type of the hint argument is the const_pointer from the ultra-simplified specialization:
template <> class allocator<void> {
public:
    typedef void* pointer;
    typedef const void* const_pointer;
    // note: no reference
    typedef void value_type;
    template <class U>
    struct rebind { typedef allocator<U> other; };    // in effect: typedef allocator<U> other
};
The allocator<void>::const_pointer type acts as a universal pointer type and is const void* for all standard allocators.
Unless the documentation for an allocator says otherwise, the user has two reasonable choices when calling allocate():
[1] Don’t give a hint.
[2] Use a pointer to an object that is frequently used together with the new object as the hint; for example, the previous element in a sequence.
Allocators are intended to save implementers of containers from having to deal with raw memory directly. As an example, consider how a vector implementation might use memory:
template <class T, class A = allocator<T> > class vector {
public:
    typedef typename A::pointer iterator;
    // ...
private:
    A alloc;       // allocator object
    iterator v;    // pointer to elements
    // ...
public:
    explicit vector(size_type n, const T& val = T(), const A& a = A())
        : alloc(a)
    {
        v = alloc.allocate(n);
        for(iterator p = v; p<v+n; ++p) alloc.construct(p,val);
        // ...
    }
    void reserve(size_type n)
    {
        if (n<=capacity()) return;
        iterator p = alloc.allocate(n);
        iterator q = v;
        while (q<v+size()) {    // copy existing elements
            alloc.construct(p++,*q);
            alloc.destroy(q++);
        }
        alloc.deallocate(v,capacity());    // free old space
        v = p-size();    // p was advanced past the copied elements
        // ...
    }
    // ...
};
The allocator operations are expressed in terms of pointer and reference typedefs to give the user a chance to supply alternative types for accessing memory. This is very hard to do in general. For example, it is not possible to define a perfect reference type within the C++ language. However, language and library implementers can use these typedefs to support types that couldn’t be provided by an ordinary user. An example would be an allocator that provided access to a persistent store. Another example would be a “long” pointer type for accessing main memory beyond what a default pointer (usually 32 bits) could address.
The ordinary user can supply an unusual pointer type to an allocator for specific uses. The equivalent cannot be done for references, but that may be an acceptable constraint for an experiment or a specialized system.
An allocator is designed to make it easy to handle objects of the type specified by its template parameter. However, most container implementations require objects of additional types. For example, the implementer of a list will need to allocate Link objects. Usually, such Links must be allocated using their list’s allocator.
The curious rebind type is provided to allow an allocator to allocate objects of arbitrary type. Consider:
typedef typename A::template rebind<Link>::other Link_alloc;
If A is an allocator, then rebind<Link>::other is typedef’d to mean allocator<Link>, so the previous typedef is an indirect way of saying:
typedef allocator<Link> Link_alloc;
The indirection frees us from having to mention allocator directly. It expresses the Link_alloc type in terms of a template parameter A. For example:
template <class T, class A = allocator<T> > class list {
private:
    class Link { /* ... */ };
    typedef typename A::template rebind<Link>::other Link_alloc;    // allocator<Link>
    Link_alloc a;    // link allocator
    A alloc;         // list allocator
    // ...
public:
    typedef typename A::pointer iterator;
    // ...
    iterator insert(iterator pos, const T& x )
    {
        typename Link_alloc::pointer p = a.allocate(1);    // get a Link
        // ...
    }
    // ...
};
Because Link is a member of list, it is parameterized by an allocator. Consequently, Links from lists with different allocators are of different types, just like the lists themselves (§17.3.3).
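The rebinding step can be tried outside a container. The sketch below uses std::allocator_traits<A>::rebind_alloc, a later (C++11) spelling of the same idea, because recent standards removed the rebind member from std::allocator; the names Link, Link_alloc_for, and roundtrip are assumptions made for this example.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical node type that a list-like container would allocate.
struct Link {
    Link* next;
    int val;
};

// Given an element allocator A, name the corresponding allocator for Links.
template<class A>
struct Link_alloc_for {
    typedef typename std::allocator_traits<A>::template rebind_alloc<Link> type;
};

// Allocate, use, and free one Link through an allocator rebound from A.
template<class A>
int roundtrip(const A& elem_alloc, int x)
{
    typedef typename Link_alloc_for<A>::type LA;
    typedef std::allocator_traits<LA> Tr;
    LA a(elem_alloc);                               // converting constructor from A
    typename Tr::pointer p = Tr::allocate(a, 1);    // get space for a Link
    Link init = { 0, x };
    Tr::construct(a, &*p, init);                    // copy-construct the Link in place
    int result = p->val;
    Tr::destroy(a, &*p);
    Tr::deallocate(a, p, 1);
    return result;
}
```

For A = std::allocator<int>, the rebound type is std::allocator<Link>, which is the relationship the typedef in the list example expresses.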
19.4.2 A User-Defined Allocator [iter.alloc.user]
Implementers of containers often allocate() and deallocate() objects one at a time. For a naive implementation of allocate(), this implies lots of calls of operator new, and not all implementations of operator new are efficient when used like that. As an example of a user-defined allocator, I present a scheme for using pools of fixed-sized pieces of memory from which the allocator can allocate() more efficiently than can a conventional and more general operator new().
I happen to have a pool allocator that does approximately the right thing, but it has the wrong interface (because it was designed years before allocators were invented). This Pool class implements the notion of a pool of fixed-sized elements from which a user can do fast allocations and deallocations. It is a low-level type that deals with memory directly and worries about alignment:
class Pool {
    struct Link { Link* next; };
    struct Chunk {
        enum { size = 8*1024-16 };
        Chunk* next;
        char mem[size];
    };
    Chunk* chunks;
    const unsigned int esize;
    Link* head;
    Pool(Pool&);              // copy protection
    void operator=(Pool&);    // copy protection
    void grow();              // make pool larger
public:
    Pool(unsigned int n);     // n is the size of elements
    ~Pool();
    void* alloc();            // allocate one element
    void free(void* b);       // put an element back into the pool
};
inline void* Pool::alloc()
{
    if (head==0) grow();
    Link* p = head;    // return first element
    head = p->next;
    return p;
}
inline void Pool::free(void* b)
{
    Link* p = static_cast<Link*>(b);
    p->next = head;    // put b back as first element
    head = p;
}
Pool::Pool(unsigned int sz)
    : esize(sz<sizeof(Link*)?sizeof(Link*):sz)
{
    head = 0;
    chunks = 0;
}
Pool::~Pool()    // free all chunks
{
    Chunk* n = chunks;
    while (n) {
        Chunk* p = n;
        n = n->next;
        delete p;
    }
}
void Pool::grow()    // allocate new ‘chunk,’ organize it as a linked list of elements of size ’esize’
{
    Chunk* n = new Chunk;
    n->next = chunks;
    chunks = n;
    const int nelem = Chunk::size/esize;
    char* start = n->mem;
    char* last = &start[(nelem-1)*esize];
ffoorr (cchhaarr* p = ssttaarrtt; pp<llaasstt; pp+=eessiizzee)
// assume sizeof(Link)<=esize
rreeiinntteerrpprreett__ccaasstt<L
Liinnkk*>(pp)->nneexxtt = rreeiinntteerrpprreett__ccaasstt<L
Liinnkk*>(pp+eessiizzee);
rreeiinntteerrpprreett__ccaasstt<L
Liinnkk*>(llaasstt)->nneexxtt = 00;
hheeaadd = rreeiinntteerrpprreett__ccaasstt<L
Liinnkk*>(ssttaarrtt);
}
To add a touch of realism, I’ll use Pool unchanged as part of the implementation of my allocator,
rather than rewrite it to give it the right interface. The pool allocator is intended for fast allocation
and deallocation of single elements and that is what my Pool class supports. Extending this
implementation to handle allocations of arbitrary numbers of objects and to objects of arbitrary size (as
required by rebind()) is left as an exercise (§19.6[9]).
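The free-list idea behind Pool can be seen in isolation in a much smaller class. The following is only a sketch: Tiny_pool and freed_slot_is_reused() are hypothetical names, the pool holds a single fixed block of pointer-sized slots, and it returns 0 when exhausted instead of calling grow().

```cpp
#include <cstddef>

// Simplified, hypothetical variant of Pool: one fixed block of
// pointer-sized slots and no grow(); it only illustrates the free list.
class Tiny_pool {
    struct Link { Link* next; };
    enum { nelem = 4 };
    Link slots[nelem];    // storage: one Link-sized slot per element
    Link* head;           // first free slot, or 0 if none is left

    Tiny_pool(const Tiny_pool&);         // copy protection
    void operator=(const Tiny_pool&);    // copy protection
public:
    Tiny_pool() : head(slots)
    {
        // chain all slots into the initial free list
        for (int i = 0; i<nelem-1; ++i) slots[i].next = &slots[i+1];
        slots[nelem-1].next = 0;
    }

    void* alloc()    // take the first free slot; 0 if the pool is exhausted
    {
        if (head==0) return 0;
        Link* p = head;
        head = p->next;
        return p;
    }

    void free(void* b)    // put b back as the first free slot
    {
        Link* p = static_cast<Link*>(b);
        p->next = head;
        head = p;
    }
};

// Returns true if the slot freed last is the one handed out next.
bool freed_slot_is_reused()
{
    Tiny_pool pool;
    void* a = pool.alloc();
    void* b = pool.alloc();
    pool.free(a);
    void* c = pool.alloc();    // expected to be the slot just freed
    pool.free(b);
    pool.free(c);
    return c==a && a!=b;
}
```

Both alloc() and free() are a couple of pointer assignments, which is the whole point of the technique.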
Given Pool, the definition of Pool_alloc is trivial:
    template <class T> class Pool_alloc {
    private:
        static Pool mem;    // pool of elements of sizeof(T)
    public:
        // like the standard allocator (§19.4.1)
    };

    template <class T> Pool Pool_alloc<T>::mem(sizeof(T));

    template <class T> Pool_alloc<T>::Pool_alloc() { }

    template <class T>
    T* Pool_alloc<T>::allocate(size_type n, void* = 0)
    {
        if (n == 1) return static_cast<T*>(mem.alloc());
        // ...
    }

    template <class T>
    void Pool_alloc<T>::deallocate(pointer p, size_type n)
    {
        if (n == 1) {
            mem.free(p);
            return;
        }
        // ...
    }
This allocator can now be used in the obvious way:
    vector<int,Pool_alloc<int> > v;
    map<string,number,less<string>,Pool_alloc<pair<const string,number> > > m;
    // use exactly as usual
    vector<int> v2 = v;    // error: different allocator parameters
I chose to make the Pool for a Pool_alloc static because of a restriction that the standard library
imposes on allocators used by the standard containers: the implementation of a standard container
is allowed to treat every object of its allocator type as equivalent. This can lead to significant
performance advantages. For example, because of this restriction, memory need not be set aside for
allocators in Link objects (which are typically parameterized by the allocator of the container for
which they are Links; §19.4.1), and operations that may access elements of two sequences (such as
swap()) need not check whether the objects manipulated all have the same allocator. However,
the restriction does imply that such allocators cannot use per-object data.
Before applying this kind of optimization, make sure that it is necessary. I expect that many
default allocators will implement exactly this kind of classic C++ optimization – thus saving you
the bother.
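To make the "static data only" rule concrete, here is a sketch of an allocator whose entire state is a static counter. Counting_alloc and allocations_balance() are hypothetical names, and the code assumes C++11 or later, where allocator_traits supplies the parts of the allocator interface that are not written out.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical allocator: all state is static (per allocator type), none per object.
template <class T> struct Counting_alloc {
    typedef T value_type;
    static std::size_t live;    // elements currently allocated through Counting_alloc<T>

    Counting_alloc() { }
    template <class U> Counting_alloc(const Counting_alloc<U>&) { }

    T* allocate(std::size_t n)
    {
        live += n;
        return static_cast<T*>(::operator new(n*sizeof(T)));
    }

    void deallocate(T* p, std::size_t n)
    {
        live -= n;
        ::operator delete(p);
    }
};

template <class T> std::size_t Counting_alloc<T>::live = 0;

// Every object of the allocator type is equivalent to every other one.
template <class T, class U>
bool operator==(const Counting_alloc<T>&, const Counting_alloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const Counting_alloc<T>&, const Counting_alloc<U>&) { return false; }

// Returns true if storage is counted while a vector lives and released afterwards.
bool allocations_balance()
{
    std::size_t during = 0;
    {
        std::vector<int, Counting_alloc<int> > v(10);
        during = Counting_alloc<int>::live;    // at least 10 while v is alive
    }
    return during>=10 && Counting_alloc<int>::live==0;
}
```

Because the counter is static, any Counting_alloc<int> object can release what any other one acquired, which is exactly the freedom a container implementation is allowed to rely on.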
19.4.3 Generalized Allocators [iter.general]
An allocator is a simplified and optimized variant of the idea of passing information to a container
through a template parameter (§13.4.1, §16.2.3). For example, it makes sense to require that every
element in a container is allocated by the container’s allocator. However, if two lists of the same
type were allowed to have different allocators, then splice() (§17.2.2.1) couldn’t be implemented
through relinking. Instead, splice() would have to be defined in terms of copying of elements to
protect against the rare cases in which we want to splice elements from a list with one allocator into
another with a different allocator of the same allocator type. Similarly, if allocators were allowed
to be perfectly general, the rebind mechanism that allows an allocator to allocate elements of
arbitrary types would have to be more elaborate. Consequently, a standard allocator is assumed to hold
no per-object data and an implementation of a standard container may take advantage of that.
Surprisingly, the apparently Draconian restriction against per-object information in allocators is
not particularly serious. Most allocators do not need per-object data and can be made to run faster
without such data. Allocators can still hold data on a per-allocator-type basis. If separate data is
needed, separate allocator types can be used. For example:
    template<class T, class D> class My_alloc {    // allocator for T implemented using D
        D d;    // data needed for My_alloc<T,D>
        // ...
    };

    typedef My_alloc<int,Persistent_info> Persistent;
    typedef My_alloc<int,Shared_info> Shared;
    typedef My_alloc<int,Default_info> Default;

    list<int,Persistent> lst1;
    list<int,Shared> lst2;
    list<int,Default> lst3;
The lists lst1, lst2, and lst3 are of different types. Therefore, we must use general algorithms
(Chapter 18) when operating on two of these lists rather than specialized list operations (§17.2.2.1).
This implies that copying rather than relinking is done, so having different allocators poses no
problems.
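The following sketch shows two lists with different allocator types being combined through a general algorithm. Tagged_alloc, Persistent_tag, and Shared_tag are hypothetical stand-ins for My_alloc and its info classes; the allocator is the bare minimum accepted by C++11 and later and simply forwards to operator new.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>
#include <new>

struct Persistent_tag { };    // hypothetical stand-ins for Persistent_info and Shared_info
struct Shared_tag { };

// Minimal allocator whose type, not its objects, carries the distinction.
template <class T, class Tag> struct Tagged_alloc {
    typedef T value_type;
    template <class U> struct rebind { typedef Tagged_alloc<U,Tag> other; };

    Tagged_alloc() { }
    template <class U> Tagged_alloc(const Tagged_alloc<U,Tag>&) { }

    T* allocate(std::size_t n) { return static_cast<T*>(::operator new(n*sizeof(T))); }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <class T, class U, class Tag>
bool operator==(const Tagged_alloc<T,Tag>&, const Tagged_alloc<U,Tag>&) { return true; }
template <class T, class U, class Tag>
bool operator!=(const Tagged_alloc<T,Tag>&, const Tagged_alloc<U,Tag>&) { return false; }

typedef std::list<int, Tagged_alloc<int,Persistent_tag> > Persistent_list;
typedef std::list<int, Tagged_alloc<int,Shared_tag> > Shared_list;

// Copies the elements of one list type into another and returns their sum.
int copy_between_list_types()
{
    Persistent_list lst1;
    lst1.push_back(1);
    lst1.push_back(2);
    lst1.push_back(3);

    Shared_list lst2;
    // lst2 = lst1;                        // error: different types
    // lst2.splice(lst2.begin(),lst1);     // error for the same reason
    std::copy(lst1.begin(), lst1.end(), std::back_inserter(lst2));    // element-by-element copy

    int sum = 0;
    for (Shared_list::const_iterator i = lst2.begin(); i!=lst2.end(); ++i) sum += *i;
    return sum;
}
```

Each element of lst2 is allocated by lst2's own allocator, so no link ever has to move between allocators.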
The restriction against per-object data in allocators is imposed because of the stringent demands
on the run-time and space efficiency of the standard library. For example, the space overhead of
allocator data for a list probably wouldn’t be significant. However, it could be serious if each link
of a list suffered overhead.
Consider how the allocator technique could be used when the efficiency constraints of the standard library don’t apply. This would be the case for a nonstandard library that wasn’t meant to
deliver high performance for essentially every data structure and every type in a program and for
some special-purpose implementations of the standard library. In such cases, an allocator can be
used to carry the kind of information that often inhabits universal base classes (§16.2.2). For example, an allocator could be designed to answer requests about where its objects are allocated, present
data representing object layout, and answer questions such as ‘‘is this element in this container?’’
It could also provide controls for a container that acts as a cache for memory in permanent storage,
provide association between the container and other objects, etc.
In this way, arbitrary services can be provided transparently to the ordinary container operations. However, it is best to distinguish between issues relating to storage of data and issues of the
use of data. The latter do not belong in a generalized allocator, but they could be provided through
a separate template argument.
19.4.4 Uninitialized Memory [iter.memory]
In addition to the standard allocator, the <memory> header provides a few functions for dealing
with uninitialized memory. They share the dangerous and occasionally essential property of using
a type name T to refer to space sufficient to hold an object of type T rather than to a properly
constructed object of type T.
The library provides three ways to copy values into uninitialized space:
    template <class In, class For>
    For uninitialized_copy(In first, In last, For res)    // copy into res
    {
        typedef typename iterator_traits<For>::value_type V;

        while (first != last)
            new (static_cast<void*>(&*res++)) V(*first++);    // construct in res (§10.4.11)
        return res;
    }

    template <class For, class T>
    void uninitialized_fill(For first, For last, const T& val)
    {
        typedef typename iterator_traits<For>::value_type V;

        while (first != last) new (static_cast<void*>(&*first++)) V(val);    // construct in first
    }

    template <class For, class Size, class T>
    void uninitialized_fill_n(For first, Size n, const T& val)
    {
        typedef typename iterator_traits<For>::value_type V;

        while (n--) new (static_cast<void*>(&*first++)) V(val);    // construct in first
    }
These functions are intended primarily for implementers of containers and algorithms. For example,
reserve() and resize() (§16.3.8) are most easily implemented using these functions
(§19.6[10]). It would clearly be most unfortunate if an uninitialized object escaped from the
internals of a container into the hands of general users.
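As an illustration of the division of labor (allocate, construct, destroy, deallocate), here is a small sketch that copies strings into raw storage with uninitialized_copy(). The function name copy_into_raw_storage() is hypothetical, the caller is assumed to pass pick<n, and no attempt is made at exception safety.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Copies n strings into raw storage, reads one back, then cleans up by hand.
// Precondition (assumed): pick < n.
std::string copy_into_raw_storage(const std::string* src, std::size_t n, std::size_t pick)
{
    typedef std::string Str;

    // allocate, don't initialize
    Str* buf = static_cast<Str*>(::operator new(n*sizeof(Str)));

    std::uninitialized_copy(src, src+n, buf);    // copy-construct each element in place

    Str result = buf[pick];

    for (std::size_t i = 0; i<n; ++i) buf[i].~Str();    // destroy explicitly ...
    ::operator delete(buf);                             // ... then release the storage
    return result;
}
```

Using plain copy() here would assign to objects that were never constructed, which is undefined for a type such as string.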
Algorithms often require temporary space to perform acceptably. Often, such temporary space
is best allocated in one operation but not initialized until a particular location is actually needed.
Consequently, the library provides a pair of functions for allocating and deallocating uninitialized
space:
    template <class T> pair<T*,ptrdiff_t> get_temporary_buffer(ptrdiff_t);    // allocate, don’t initialize
    template <class T> void return_temporary_buffer(T*);                      // deallocate, don’t destroy
A get_temporary_buffer<X>(n) operation tries to allocate space for n or more objects of type X.
If it succeeds in allocating some memory, it returns a pointer to the first uninitialized space and the
number of objects of type X that will fit into that space; otherwise, the second value of the pair is
zero. The idea is that a system may keep a number of fixed-sized buffers ready for fast allocation
so that requesting space for n objects may yield space for more than n. It may also yield less,
however, so one way of using get_temporary_buffer() is to optimistically ask for a lot and then use
what happens to be available.
A buffer obtained by get_temporary_buffer() must be freed for other use by a call of
return_temporary_buffer(). Just as get_temporary_buffer() allocates without constructing,
return_temporary_buffer() frees without destroying. Because get_temporary_buffer() is low-level
and likely to be optimized for managing temporary buffers, it should not be used as an
alternative to new or allocator::allocate() for obtaining longer-term storage.
The standard algorithms that write into a sequence assume that the elements of that sequence
have been previously initialized. That is, the algorithms use assignment rather than copy construction for writing. Consequently, we cannot use uninitialized memory as the immediate target of an
algorithm. This can be unfortunate because assignment can be significantly more expensive than
initialization. Besides, we are not interested in the values we are about to overwrite anyway (or we
wouldn’t be overwriting them). The solution is to use a raw_storage_iterator from <memory>
that initializes instead of assigns:
    template <class Out, class T>
    class raw_storage_iterator : public iterator<output_iterator_tag,void,void,void,void> {
        Out p;
    public:
        explicit raw_storage_iterator(Out pp) : p(pp) { }
        raw_storage_iterator& operator*() { return *this; }
        raw_storage_iterator& operator=(const T& val)
        {
            T* pp = &*p;
            new(pp) T(val);    // place val in pp (§10.4.11)
            return *this;
        }
        raw_storage_iterator& operator++() { ++p; return *this; }
        raw_storage_iterator operator++(int) { raw_storage_iterator t = *this; ++p; return t; }
    };
For example, we might write a template that copies the contents of a vector into a buffer:
    template<class T, class A> T* temporary_dup(vector<T,A>& v)
    {
        T* p = get_temporary_buffer<T>(v.size()).first;
        if (p == 0) return 0;
        copy(v.begin(),v.end(),raw_storage_iterator<T*,T>(p));
        return p;
    }
Had new been used instead of get_temporary_buffer(), initialization would have been done.
Once initialization is avoided, the raw_storage_iterator becomes necessary for dealing with the
uninitialized space. In this example, the caller of temporary_dup() is responsible for calling
return_temporary_buffer() for the pointer it received.
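The same technique can be sketched without the library facilities, which some newer standard library versions deprecate or omit. Raw_out below is a hypothetical, hand-written output iterator whose assignment constructs rather than assigns; join_via_raw_copy() is an illustrative caller that uses operator new for the raw storage.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <new>
#include <string>
#include <vector>

// Hypothetical stand-in for raw_storage_iterator: writing through it
// constructs a T in uninitialized space instead of assigning to a T.
template <class T> class Raw_out {
    T* p;
public:
    typedef std::output_iterator_tag iterator_category;
    typedef void value_type;
    typedef std::ptrdiff_t difference_type;
    typedef void pointer;
    typedef void reference;

    explicit Raw_out(T* pp) : p(pp) { }
    Raw_out& operator*() { return *this; }
    Raw_out& operator=(const T& val)
    {
        new(static_cast<void*>(p)) T(val);    // place val in the space p points to
        return *this;
    }
    Raw_out& operator++() { ++p; return *this; }
    Raw_out operator++(int) { Raw_out t = *this; ++p; return t; }
};

// Copies v into raw storage with copy(), concatenates the copies, and cleans up.
std::string join_via_raw_copy(const std::vector<std::string>& v)
{
    typedef std::string Str;
    Str* buf = static_cast<Str*>(::operator new(v.size()*sizeof(Str)));    // raw, unconstructed

    std::copy(v.begin(), v.end(), Raw_out<Str>(buf));    // constructs; never assigns

    Str result;
    for (std::size_t i = 0; i<v.size(); ++i) result += buf[i];

    for (std::size_t i = 0; i<v.size(); ++i) buf[i].~Str();    // destroy, then free
    ::operator delete(buf);
    return result;
}
```

Note that the caller still has to destroy the elements before releasing the storage; freeing raw memory never runs destructors.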
19.4.5 Dynamic Memory [iter.dynamic]
The functions used to implement the new and delete operators are declared in <new> together with
a few related facilities:
    class bad_alloc : public exception { /* ... */ };

    struct nothrow_t {};
    extern const nothrow_t nothrow;    // indicator for allocation that doesn’t throw exceptions

    typedef void (*new_handler)();
    new_handler set_new_handler(new_handler new_p) throw();

    void* operator new(size_t) throw(bad_alloc);
    void operator delete(void*) throw();

    void* operator new(size_t, const nothrow_t&) throw();
    void operator delete(void*, const nothrow_t&) throw();

    void* operator new[](size_t) throw(bad_alloc);
    void operator delete[](void*) throw();

    void* operator new[](size_t, const nothrow_t&) throw();
    void operator delete[](void*, const nothrow_t&) throw();

    void* operator new (size_t, void* p) throw() { return p; }    // placement (§10.4.11)
    void operator delete (void* p, void*) throw() { }

    void* operator new[](size_t, void* p) throw() { return p; }
    void operator delete[](void* p, void*) throw() { }
The nothrow versions of operator new() allocate as usual, but if allocation fails, they return 0
rather than throwing bad_alloc. For example:
    void f()
    {
        int* p = new int[100000];    // may throw bad_alloc

        if (int* q = new(nothrow) int[100000]) {    // will not throw exception
            // allocation succeeded
        }
        else {
            // allocation failed
        }
    }
This allows us to use pre-exception error-handling strategies for allocation.
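A minimal sketch of that style, packaged as a function that reports success instead of throwing; can_allocate() is a hypothetical name.

```cpp
#include <cstddef>
#include <new>

// Returns true if n ints could be allocated; bad_alloc never escapes.
bool can_allocate(std::size_t n)
{
    int* p = new(std::nothrow) int[n];
    if (p == 0) return false;    // pre-exception style: test the result
    delete[] p;                  // memory from new[] is released with delete[]
    return true;
}
```

A caller can then decide locally what to do about failure, for example by retrying with a smaller request.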
19.4.6 C-Style Allocation [iter.c]
From C, C++ inherited a functional interface to dynamic memory. It can be found in <cstdlib>:
    void* malloc(size_t s);              // allocate s bytes
    void* calloc(size_t n, size_t s);    // allocate n times s bytes initialized to 0
    void free(void* p);                  // free space allocated by malloc() or calloc()
    void* realloc(void* p, size_t s);    // change the size of the array pointed to by p to s;
                                         // if that cannot be done, allocate s bytes, copy
                                         // the array pointed to by p to it, and free p
These functions should be avoided in favor of new, delete, and standard containers. These
functions deal with uninitialized memory. In particular, free() does not invoke destructors for the
memory it frees. An implementation of new and delete may use these functions, but there is no
guarantee that it does. For example, allocating an object using new and deleting it using free() is
asking for trouble. If you feel the need to use realloc(), consider relying on a standard container
instead; doing that is usually simpler and just as efficient (§16.3.5).
The library also provides a set of functions intended for efficient manipulation of bytes.
Because C originally accessed untyped bytes through char* pointers, these functions are found in
<cstring>. The void* pointers are treated as if they were char* pointers within these functions:
    void* memcpy(void* p, const void* q, size_t n);     // copy non-overlapping areas
    void* memmove(void* p, const void* q, size_t n);    // copy potentially overlapping areas
Like strcpy() (§20.4.1), these functions copy n bytes from q to p and return p. The ranges copied
by memmove() may overlap. However, memcpy() assumes that the ranges do not overlap and is
usually optimized to take advantage of that assumption. Similarly:
    void* memchr(const void* p, int b, size_t n);          // like strchr() (§20.4.1): find b in p[0]..p[n-1]
    int memcmp(const void* p, const void* q, size_t n);    // like strcmp(): compare byte sequences
    void* memset(void* p, int b, size_t n);                // set n bytes to b, return p
Many implementations provide highly optimized versions of these functions.
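The difference between the overlapping and non-overlapping copies, and the behavior of the other byte functions, can be seen in a short sketch. The function names shift_right_by_two() and byte_helpers_agree() are made up for the illustration.

```cpp
#include <cstring>
#include <string>

// Shifts the first four characters two places to the right within one array.
std::string shift_right_by_two()
{
    char buf[] = "abcdef";
    std::memmove(buf+2, buf, 4);    // source and destination overlap: memmove() is required
    return buf;
}

// Exercises memset(), memcpy(), memchr(), and memcmp() on small arrays.
bool byte_helpers_agree()
{
    char a[8];
    char b[8];
    std::memset(a, 'x', sizeof(a));    // fill a with 'x'
    std::memcpy(b, a, sizeof(a));      // disjoint arrays: memcpy() is fine
    b[5] = 'y';

    const void* hit = std::memchr(b, 'y', sizeof(b));    // first byte equal to 'y'

    return std::memcmp(a, b, 5)==0            // the first five bytes match
        && std::memcmp(a, b, sizeof(a))<0     // 'x' precedes 'y' at index 5
        && hit==b+5;
}
```

These functions see bytes only; they are appropriate for built-in types and other types without user-defined copy operations, not for objects with constructors and destructors.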
19.5 Advice [iter.advice]
[1] When writing an algorithm, decide which kind of iterator is needed to provide acceptable efficiency and express the algorithm using the operators supported by that kind of iterator (only);
§19.2.1.
[2] Use overloading to provide more-efficient implementations of an algorithm when given as
arguments iterators that offer more than minimal support for the algorithm; §19.2.3.
[3] Use iterator_traits to express suitable algorithms for different iterator categories; §19.2.2.
[4] Remember to use ++ between accesses of istream_iterators and ostream_iterators; §19.2.6.
[5] Use inserters to avoid container overflow; §19.2.4.
[6] Use extra checking during debugging and remove checking later only where necessary;
§19.3.1.
[7] Prefer ++p to p++; §19.3.
[8] Use uninitialized memory to improve the performance of algorithms that expand data structures; §19.4.4.
[9] Use temporary buffers to improve the performance of algorithms that require temporary data
structures; §19.4.4.
[10] Think twice before writing your own allocator; §19.4.
[11] Avoid malloc(), free(), realloc(), etc.; §19.4.6.
[12] You can simulate a typedef of a template by the technique used for rebind; §19.4.1.
19.6 Exercises [iter.exercises]
1. (∗1.5) Implement reverse() from §18.6.7. Hint: See §19.2.3.
2. (∗1.5) Write an output iterator, Sink, that doesn’t actually write anywhere. When can Sink be
useful?
3. (∗2) Implement reverse_iterator (§19.2.5).
4. (∗1.5) Implement ostream_iterator (§19.2.6).
5. (∗2) Implement istream_iterator (§19.2.6).
6. (∗2.5) Complete Checked_iter (§19.3).
7. (∗2.5) Redesign Checked_iter to check for invalidated iterators.
8. (∗2) Design and implement a handle class that can act as a proxy for a container by providing a
complete container interface to its users. Its implementation should consist of a pointer to a
container plus implementations of container operations that do range checking.
9. (∗2.5) Complete or reimplement Pool_alloc (§19.4.2) so that it provides all of the facilities of
the standard library allocator (§19.4.1). Compare the performance of allocator and
Pool_alloc to see if there is any reason to use a Pool_alloc on your system.
10. (∗2.5) Implement vector using allocators rather than new and delete.
20
Strings
Prefer the standard to the offbeat.
– Strunk & White
Strings – characters – char_traits – basic_string – iterators – element access –
constructors – error handling – assignment – conversions – comparisons – insertion –
concatenation – find and replace – size and capacity – string I/O – C-style
strings – character classification – C library functions – advice – exercises.
20.1 Introduction [string.intro]
A string is a sequence of characters. The standard library string provides string manipulation
operations such as subscripting (§20.3.3), assignment (§20.3.6), comparison (§20.3.8), appending
(§20.3.9), concatenation (§20.3.10), and searching for substrings (§20.3.11). No general substring
facility is provided by the standard, so one is provided here as an example of standard string use
(§20.3.11). A standard string can be a string of essentially any kind of character (§20.2).
Experience shows that it is impossible to design the perfect string. People’s taste, expectations,
and needs differ too much for that. So, the standard library string isn’t ideal. I would have made
some design decisions differently, and so would you. However, it serves many needs well,
auxiliary functions to serve further needs are easily provided, and std::string is generally known and
available. In most cases, these factors are more important than any minor improvement we could
provide. Writing string classes has great educational value (§11.12, §13.2), but for code meant to
be widely used, the standard library string is the one to use.
From C, C++ inherited the notion of strings as zero-terminated arrays of char and a set of
functions for manipulating such C-style strings (§20.4.1).
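As a first taste, the operations listed above can be combined in a few lines. This is only a sketch; the function name describe() and the literal values are arbitrary.

```cpp
#include <string>

// Exercises assignment, appending, concatenation, comparison,
// subscripting, and substring search on std::string.
std::string describe()
{
    std::string s = "standard";     // initialization
    std::string t;
    t = s;                          // assignment
    t += " string";                 // appending
    std::string u = s + "/" + t;    // concatenation

    // comparison, subscripting, and searching for a substring
    if (s<t && u[0]=='s' && u.find("string")!=std::string::npos)
        return u;
    return "";
}
```

None of these operations requires the user to manage storage or a terminating zero, which is the main practical advantage over C-style strings.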
20.2 Characters [string.char]
‘‘Character’’ is itself an interesting concept. Consider the character C. The C that you see as a
curved line on the page (or screen), I typed into my computer many months ago. There, it lives as
the numeric value 67 in an 8-bit byte. It is the third letter in the Latin alphabet, the usual
abbreviation for the sixth atom (Carbon), and, incidentally, the name of a programming language (§1.6).
What matters in the context of programming with strings is that there is a correspondence between
squiggles with conventional meaning, called characters, and numeric values. To complicate matters, the same character can have different numeric values in different character sets, not every
character set has values for every character, and many different character sets are in common use.
A character set is a mapping between a character (some conventional symbol) and an integer value.
C++ programmers usually assume that the standard American character set (ASCII) is available,
but C++ makes allowances for the possibility that some characters may be missing in a
programmer’s environment. For example, in the absence of characters such as [ and {, keywords
and digraphs can be used (§C.3.1).
Character sets with characters not in ASCII offer a greater challenge. Languages such as Chinese, Danish, French, Icelandic, and Japanese cannot be written properly using ASCII only.
Worse, the character sets used for these languages can be mutually incompatible. For example, the
characters used for European languages using Latin alphabets almost fit into a 256-character character set. Unfortunately, different sets are still used for different languages and some different
characters have ended up with the same integer value. For example, French (using Latin1) doesn’t
coexist well with Icelandic (which therefore requires Latin2). Ambitious attempts to present every
character known to man in a single character set have helped a lot, but even 16-bit character sets –
such as Unicode – are not enough to satisfy everyone. The 32-bit character sets that could – as far
as I know – hold every character are not widely used.
Basically, the C++ approach is to allow a programmer to use any character set as the character
type in strings. An extended character set or a portable numeric encoding can be used (§C.3.3).
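The correspondence between a character and its numeric value can be observed directly. The sketch below assumes an ASCII-compatible execution character set; on a system using another character set the values differ. The names code_of() and char_of() are hypothetical.

```cpp
// Assumes an ASCII-compatible execution character set.

// The integer value that represents c in the character set in use.
int code_of(char c) { return static_cast<unsigned char>(c); }

// The character represented by the integer value i (for i in the range of char).
char char_of(int i) { return static_cast<char>(i); }
```

With ASCII, code_of('C') yields 67 and char_of(67) yields 'C'; the conversion through unsigned char keeps the result non-negative even where plain char is signed.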
20.2.1 Character Traits [string.traits]
As shown in §13.2, a string can, in principle, use any type with proper copy operations as its character type. However, efficiency can be improved and implementations can be simplified for types
that don’t have user-defined copy operations. Consequently, the standard string requires that a
type used as its character type does not have user-defined copy operations. This also helps to make
I/O of strings simple and efficient.
The properties of a character type are defined by its char_traits. A char_traits is a
specialization of the template:
    template<class Ch> struct char_traits { };
All char_traits are defined in std, and the standard ones are presented in <string>. The general
char_traits itself has no properties; only char_traits specializations for a particular character type
have. Consider char_traits<char>:
    template<> struct char_traits<char> {
        typedef char char_type;    // type of character

        static void assign(char_type&, const char_type&);    // = for char_type

        // integer representation of characters:

        typedef int int_type;    // type of integer value of character

        static char_type to_char_type(const int_type&);               // int to char conversion
        static int_type to_int_type(const char_type&);                // char to int conversion
        static bool eq_int_type(const int_type&, const int_type&);    // ==

        // char_type comparisons:

        static bool eq(const char_type&, const char_type&);    // ==
        static bool lt(const char_type&, const char_type&);    // <

        // operations on s[n] arrays:

        static char_type* move(char_type* s, const char_type* s2, size_t n);
        static char_type* copy(char_type* s, const char_type* s2, size_t n);
        static char_type* assign(char_type* s, size_t n, char_type a);

        static int compare(const char_type* s, const char_type* s2, size_t n);
        static size_t length(const char_type*);
        static const char_type* find(const char_type* s, int n, const char_type&);

        // I/O related:

        typedef streamoff off_type;      // offset in stream
        typedef streampos pos_type;      // position in stream
        typedef mbstate_t state_type;    // multi-byte stream state

        static int_type eof();                        // end-of-file
        static int_type not_eof(const int_type& i);   // i unless i equals eof(); if not any value!=eof()
        static state_type get_state(pos_type p);      // multibyte conversion state of character in p
    };
The implementation of the standard string template, basic_string (§20.3), relies on these types and
functions. A type used as a character type for basic_string must provide a char_traits
specialization that supplies them all.
For a type to be a char_type, it must be possible to obtain an integer value corresponding to
each character. The type of that integer is int_type, and the conversion between it and the
char_type is done by to_char_type() and to_int_type(). For a char, this conversion is trivial.
Both move(s,s2,n) and copy(s,s2,n) copy n characters from s2 to s using
assign(s[i],s2[i]). The difference is that move() works correctly even if s2 is in the [s,s+n[
range. Thus, copy() can be faster. This mirrors the standard C library functions memcpy() and
memmove() (§19.4.6). A call assign(s,n,x) assigns n copies of x into s using assign(s[i],x).
The compare() function uses lt() and eq() to compare characters. It returns an int, where 0
represents an exact match, a negative number means that its first argument comes lexicographically
before the second, and a positive number means that its first argument comes after its second. This
mirrors the standard C library function strcmp() (§20.4.1).
The I/O-related functions are used by the implementation of low-level I/O (§21.6.4).
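Although char_traits exists mainly for the benefit of basic_string and the I/O library, its static functions can be called directly, which makes their relation to the C library functions easy to check. The following sketch does so for char_traits<char>; the typedef Tr and the function name are made up for the example.

```cpp
#include <cstddef>
#include <string>

typedef std::char_traits<char> Tr;

// Calls a few char_traits<char> operations and checks that they behave
// like their C library counterparts.
bool traits_behave_like_c_functions()
{
    const char* a = "abc";
    const char* b = "abd";
    char buf[4] = { 0, 0, 0, 0 };

    Tr::copy(buf, a, 3);        // like memcpy(): buf now holds "abc"
    Tr::assign(buf[2], 'd');    // single-character assignment: buf now holds "abd"

    return Tr::length(a)==3                                          // like strlen()
        && Tr::compare(a, b, 3)<0                                    // "abc" precedes "abd"
        && Tr::compare(buf, b, 3)==0                                 // exact match
        && Tr::eq(Tr::to_char_type(Tr::to_int_type('x')), 'x')       // round trip through int_type
        && !Tr::eq_int_type(Tr::eof(), Tr::to_int_type('x'))         // eof() differs from any character
        && Tr::find(a, 3, 'c')==a+2;                                 // like memchr()
}
```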
A wide character – that is, an object of type wchar_t (§4.3) – is like a char, except that it takes
up two or more bytes. The properties of a wchar_t are described by char_traits<wchar_t>:
    template<> struct char_traits<wchar_t> {
        typedef wchar_t char_type;
        typedef wint_t int_type;
        typedef wstreamoff off_type;
        typedef wstreampos pos_type;

        // like char_traits<char>
    };
A wchar_t is typically used to hold characters of a 16-bit character set such as Unicode.
20.3 Basic_string [string.string]
The standard library string facilities are based on the template basic_string that provides member
types and operations similar to those provided by standard containers (§16.3):
    template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
    class std::basic_string {
    public:
        // ...
    };
This template and its associated facilities are defined in namespace std and presented by <string>.
Two typedefs provide conventional names for common string types:
    typedef basic_string<char> string;
    typedef basic_string<wchar_t> wstring;
The basic_string is similar to vector (§16.3), except that basic_string provides some typical string
operations, such as searching for substrings, instead of the complete set of operations offered by
vector. A string is unlikely to be implemented by a simple array or vector. Many common uses of
strings are better served by implementations that minimize copying, use no free store for short
strings, allow for simple modification of longer strings, etc. (see §20.6[12]). The number of string
functions reflects the importance of string manipulation and also the fact that some machines provide specialized hardware instructions for string manipulation. Such functions are most easily utilized by a library implementer if there is a standard library function with similar semantics.
Like other standard library types, a basic_string<T> is a concrete type (§2.5.3, §10.3) without
virtual functions. It can be used as a member when designing more sophisticated text manipulation
classes, but it is not intended to be a base for derived classes (§25.2.1; see also §20.6[10]).
20.3.1 Types [string.types]
Like vector, basic_string makes its related types available through a set of member type names:
    template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
    class basic_string {
    public:
        // types (much like vector, list, etc.: §16.3.1):

        typedef Tr traits_type;                                  // specific to basic_string

        typedef typename Tr::char_type value_type;
        typedef A allocator_type;
        typedef typename A::size_type size_type;
        typedef typename A::difference_type difference_type;

        typedef typename A::reference reference;
        typedef typename A::const_reference const_reference;
        typedef typename A::pointer pointer;
        typedef typename A::const_pointer const_pointer;

        typedef implementation_defined iterator;
        typedef implementation_defined const_iterator;
        typedef std::reverse_iterator<iterator> reverse_iterator;
        typedef std::reverse_iterator<const_iterator> const_reverse_iterator;
        // ...
    };
The basic_string notion supports strings of many kinds of characters in addition to the simple
basic_string<char> known as string. For example:
    typedef basic_string<unsigned char> Ustring;

    struct Jchar { /* ... */ };    // Japanese character type
    typedef basic_string<Jchar> Jstring;
Strings of such characters can be used just like strings of char as far as the semantics of the characters allows. For example:
    Ustring first_word(const Ustring& us)
    {
        Ustring::size_type pos = us.find(' ');    // see §20.3.11
        return Ustring(us,0,pos);                 // see §20.3.4
    }

    Jstring first_word(const Jstring& js)
    {
        Jstring::size_type pos = js.find(' ');    // see §20.3.11
        return Jstring(js,0,pos);                 // see §20.3.4
    }
Naturally, templates that take string arguments can also be used:
    template<class S> S first_word(const S& s)
    {
        typename S::size_type pos = s.find(' ');    // see §20.3.11
        return S(s,0,pos);                          // see §20.3.4
    }
A basic_string<Ch> can contain any character of the set Ch. In particular, string can contain a 0
(zero).
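The following sketch illustrates a string holding a 0 as an ordinary character. It is an illustration only; the names with_embedded_zero and embedded_zero_demo are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Build a three-character string whose middle character is 0.
// The (pointer, count) constructor is used because the one-argument
// constructor would stop at the first 0 of a C-style string.
std::string with_embedded_zero()
{
    const char raw[] = { 'a', '\0', 'b' };
    return std::string(raw, 3);
}

void embedded_zero_demo()
{
    std::string s = with_embedded_zero();
    assert(s.size() == 3);                 // the 0 counts as a character
    assert(s[1] == '\0');
    assert(s.find('b') == 2);              // searching continues past the 0
    assert(std::strlen(s.c_str()) == 1);   // a C function stops at the 0
}
```

The last assertion shows why such strings should be kept away from functions that expect zero-terminated arrays.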
20.3.2 Iterators [string.begin]
Like other containers, a string provides iterators for ordinary and reverse iteration:
    template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
    class basic_string {
    public:
        // ...
        // iterators (like vector, list, etc.: §16.3.2):

        iterator begin();
        const_iterator begin() const;
        iterator end();
        const_iterator end() const;

        reverse_iterator rbegin();
        const_reverse_iterator rbegin() const;
        reverse_iterator rend();
        const_reverse_iterator rend() const;
        // ...
    };
Because string has the required member types and the functions for obtaining iterators, strings can
be used together with the standard algorithms (Chapter 18). For example:
    void f(string& s)
    {
        string::iterator p = find(s.begin(),s.end(),'a');
        // ...
    }
The most common operations on strings are supplied directly by string. Hopefully, these versions
will be optimized for strings beyond what would be easy to do for general algorithms.
The standard algorithms (Chapter 18) are not as useful for strings as one might think. General
algorithms tend to assume that the elements of a container are meaningful in isolation. This is typically not the case for a string. The meaning of a string is encoded in its exact sequence of characters. Thus, sorting a string (that is, sorting the characters in a string) destroys its meaning, whereas
sorting a general container typically makes it more useful.
The string iterators are not range checked.
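As a small illustration of string iterators combined with general algorithms, consider the sketch below. The function names occurrences, reversed, and iterator_demo are hypothetical and not taken from the text.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Count occurrences of c using the general algorithm count().
std::string::difference_type occurrences(const std::string& s, char c)
{
    return std::count(s.begin(), s.end(), c);
}

// Return the characters of s in reverse order, using reverse iterators
// and the iterator-pair constructor.
std::string reversed(const std::string& s)
{
    return std::string(s.rbegin(), s.rend());
}

void iterator_demo()
{
    std::string s = "banana";
    assert(occurrences(s, 'a') == 3);
    assert(reversed(s) == "ananab");

    std::string::iterator p = std::find(s.begin(), s.end(), 'n');
    assert(p != s.end() && p - s.begin() == 2);

    // Sorting the characters is legal but destroys the meaning of the string.
    std::sort(s.begin(), s.end());
    assert(s == "aaabnn");
}
```

The final sort() call shows the point made above: the algorithm applies without complaint, yet the result is no longer a useful word.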
20.3.3 Element Access [string.elem]
Individual characters of a string can be accessed through subscripting:
    template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
    class basic_string {
    public:
        // ...
        // element access (like vector: §16.3.3):

        const_reference operator[](size_type n) const;    // unchecked access
        reference operator[](size_type n);

        const_reference at(size_type n) const;            // checked access
        reference at(size_type n);
        // ...
    };
Out-of-range access causes at() to throw an out_of_range.
Compared to vector, string lacks front() and back(). To refer to the first and the last character of a string, we must say s[0] and s[s.length()-1], respectively. The pointer/array equivalence (§5.3) doesn’t hold for strings. If s is a string, &s[0] is not the same as s.
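The difference between checked and unchecked access can be sketched as follows. The helpers at_throws, first_char, last_char, and access_demo are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Return true if s.at(n) throws out_of_range.
bool at_throws(const std::string& s, std::string::size_type n)
{
    try {
        char c = s.at(n);   // checked access
        (void)c;            // the value itself is not needed here
    }
    catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// First and last characters of a nonempty string via subscripting.
char first_char(const std::string& s) { return s[0]; }
char last_char(const std::string& s)  { return s[s.length() - 1]; }

void access_demo()
{
    std::string s = "Frodo";
    assert(first_char(s) == 'F');
    assert(last_char(s) == 'o');
    assert(!at_throws(s, 4));     // last valid position
    assert(at_throws(s, 5));      // one past the last character
}
```

Note that first_char() and last_char() assume a nonempty string; like [] itself, they perform no check.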
20.3.4 Constructors [string.ctor]
The set of initialization and copy operations for a string differs from what is provided for other
containers (§16.3.4) in many details:
    template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
    class basic_string {
    public:
        // ...
        // constructors, etc. (a bit like vector and list: §16.3.4):

        explicit basic_string(const A& a = A());
        basic_string(const basic_string& s,
                     size_type pos = 0, size_type n = npos, const A& a = A());
        basic_string(const Ch* p, size_type n, const A& a = A());
        basic_string(const Ch* p, const A& a = A());
        basic_string(size_type n, Ch c, const A& a = A());
        template<class In> basic_string(In first, In last, const A& a = A());

        ~basic_string();

        static const size_type npos;    // “all characters” marker
        // ...
    };
A string can be initialized by a C-style string, by another string, by part of a C-style string, by part
of a string, or from a sequence of characters. However, a string cannot be initialized by a character or an integer:
    void f(char* p,vector<char>& v)
    {
        string s0;                        // the empty string
        string s00 = "";                  // also the empty string

        string s1 = 'a';                  // error: no conversion from char to string
        string s2 = 7;                    // error: no conversion from int to string
        string s3(7);                     // error: no constructor taking one int argument

        string s4(7,'a');                 // 7 copies of 'a'; that is "aaaaaaa"

        string s5 = "Frodo";              // copy of "Frodo"
        string s6 = s5;                   // copy of s5

        string s7(s5,3,2);                // s5[3] and s5[4]; that is "do"
        string s8(p+7,3);                 // p[7], p[8], and p[9]
        string s9(p,7,3);                 // string(string(p),7,3), possibly expensive

        string s10(v.begin(),v.end());    // copy all characters from v
    }
Characters are numbered starting at position 0 so that a string is a sequence of characters numbered
0 to length()-1.
The length() of a string is simply a synonym for its size(); both functions return the number
of characters in the string. Note that they do not count a C-string-style, zero-terminator character
(§20.4.1). An implementation of basic_string stores its length rather than relying on a terminator.
Substrings are expressed as a character position plus a number of characters. The default value
npos is initialized to the largest possible value and used to mean “all of the elements.”
There is no constructor for creating a string of n unspecified characters. The closest we come to
that is the constructor that makes a string of n copies of a given character. The length of a string is
determined by the number of characters it holds at any given time. This allows the compiler to save
the programmer from silly mistakes such as the definitions of s2 and s3 in the previous example.
The copy constructor is the constructor taking four arguments. Three of those arguments have
defaults. For efficiency, that constructor could be implemented as two separate constructors. The
user wouldn’t be able to tell without actually looking at the generated code.
The constructor that is a template member is the most general. It allows a string to be initialized with values from an arbitrary sequence. In particular, it allows a string to be initialized with
elements of a different character type as long as a conversion exists. For example:
    void f(string s)
    {
        wstring ws(s.begin(),s.end());    // copy all characters from s
        // ...
    }
Each wchar_t in ws is initialized by its corresponding char from s.
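A few of the constructor forms discussed in this section can be checked with the sketch below. The names widen and constructor_demo are hypothetical; the sketch also assumes that every char involved is plain ASCII, so that the char-to-wchar_t conversion keeps its meaning.

```cpp
#include <cassert>
#include <string>

// Widen s character by character using the iterator-pair constructor.
// Assumption: s holds plain ASCII characters only.
std::wstring widen(const std::string& s)
{
    return std::wstring(s.begin(), s.end());
}

void constructor_demo()
{
    std::string s5 = "Frodo";

    std::string s7(s5, 3, 2);          // s5[3] and s5[4]
    assert(s7 == "do");

    std::string tail(s5, 1);           // from s5[1] to the end of s5
    assert(tail == "rodo");

    std::string s4(7, 'a');            // 7 copies of 'a'
    assert(s4 == "aaaaaaa");
    assert(s4.length() == s4.size());  // length() is a synonym for size()

    std::wstring ws = widen(s5);
    assert(ws == L"Frodo");
}
```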
20.3.5 Errors [string.error]
Often, strings are simply read, written, printed, stored, compared, copied, etc. This causes no problems, or, at worst, performance problems. However, once we start manipulating individual substrings and characters to compose new string values from existing ones, we sooner or later make
mistakes that could cause us to write beyond the end of a string.
For explicit access to individual characters, at() checks and throws out_of_range() if we try
to access beyond the end of the string; [] does not.
Most string operations take a character position plus a number of characters. A position larger
than the size of the string throws an out_of_range exception. A “too large” character count is
simply taken to be equivalent to “the rest” of the characters. For example:
    void f()
    {
        string s = "Snobol4";
        string s2(s,100,2);             // character position beyond end of string: throw out_of_range()
        string s3(s,2,100);             // character count too large: equivalent to s3(s,2,s.size()-2)
        string s4(s,2,string::npos);    // the characters starting from s[2]
    }
Thus, “too large” positions are to be avoided, but “too large” character counts are useful. In fact,
npos is really just the largest possible value for size_type.
We could try to give a negative position or character count:
    void g(string& s)
    {
        string s5(s,-2,3);    // large position!: throw out_of_range()
        string s6(s,3,-2);    // large character count!: ok
    }
However, the size_type used to represent positions and counts is an unsigned type, so a negative
number is simply a confusing way of specifying a large positive number (§16.3.4).
Note that the functions used to find substrings of a string (§20.3.11) return npos if they don’t
find anything. Thus, they don’t throw exceptions. However, later using npos as a character position does.
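The rules for positions and counts can be exercised with a small sketch. The helper substring_or and the function error_demo are hypothetical; they simply wrap the substring constructor so that the exception becomes observable as a return value.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Return the substring of s starting at pos with at most n characters,
// or fallback if pos is beyond the end of s.
std::string substring_or(const std::string& s, std::string::size_type pos,
                         std::string::size_type n, const std::string& fallback)
{
    try {
        return std::string(s, pos, n);
    }
    catch (const std::out_of_range&) {
        return fallback;
    }
}

void error_demo()
{
    typedef std::string::size_type size_type;
    std::string s = "Snobol4";

    assert(substring_or(s, 2, 100, "?") == "obol4");   // count too large: rest of s
    assert(substring_or(s, 100, 2, "?") == "?");       // position too large: throws

    // A negative value converted to size_type is a very large position.
    assert(substring_or(s, static_cast<size_type>(-2), 3, "?") == "?");

    size_type pos = s.find('x');
    assert(pos == std::string::npos);                  // not found: no exception
    assert(substring_or(s, pos, 1, "?") == "?");       // npos used as a position throws
}
```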
A pair of iterators is another way of specifying a substring. The first iterator identifies a position, and the difference between two iterators is a character count. As usual, iterators are not range
checked.
Where a C-style string is used, range checking is harder. When given a C-style string (a pointer
to char) as an argument, basic_string functions assume the pointer is not 0. When given character
positions for C-style strings, they assume that the C-style string is long enough for the position to
be valid. Be careful! In this case, being careful means being paranoid, except when using character
literals.
All strings have length()<npos. In a few cases, such as inserting one string into another
(§20.3.9), it is possible (although not likely) to construct a string that is too long to be represented.
In that case, a length_error is thrown. For example:
    string s(string::npos,'a');    // throw length_error()
20.3.6 Assignment [string.assign]
Naturally, assignment is provided for strings:
    template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
    class basic_string {
    public:
        // ...
        // assignment (a bit like vector and list: §16.3.4):

        basic_string& operator=(const basic_string& s);
        basic_string& operator=(const Ch* p);
        basic_string& operator=(Ch c);

        basic_string& assign(const basic_string&);
        basic_string& assign(const basic_string& s, size_type pos, size_type n);
        basic_string& assign(const Ch* p, size_type n);
        basic_string& assign(const Ch* p);
        basic_string& assign(size_type n, Ch c);
        template<class In> basic_string& assign(In first, In last);
        // ...
    };
Like other standard containers, strings have value semantics. That is, when one string is assigned
to another, the assigned string is copied and two separate strings with the same value exist after the
assignment. For example:
    void g()
    {
        string s1 = "Knold";
        string s2 = "Tot";

        s1 = s2;        // two copies of "Tot"
        s2[1] = 'u';    // s2 is "Tut", s1 is still "Tot"
    }
Assignment with a single character to a string is supported even though initialization by a single
character isn’t:
    void f()
    {
        string s = 'a';    // error: initialization by char
        s = 'a';           // ok: assignment
        s = "a";
        s = s;
    }
Being able to assign a char to a string isn’t much use and could even be considered error-prone.
However, appending a char using += is at times essential (§20.3.9), and it would be odd to be able
to say s+='c' but not s=s+'c'.
The name assign() is used for the assignments, which are the counterparts to multiple argument constructors (§16.3.4, §20.3.4).
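The correspondence between assign() and the multiple-argument constructors, together with value semantics, can be sketched as follows. The names middle and assign_demo are hypothetical.

```cpp
#include <cassert>
#include <string>

// Return n characters of s starting at pos, built with assign()
// rather than with the corresponding constructor string(s,pos,n).
std::string middle(const std::string& s, std::string::size_type pos,
                   std::string::size_type n)
{
    std::string r;
    r.assign(s, pos, n);
    return r;
}

void assign_demo()
{
    std::string s1 = "Knold";
    std::string s2 = "Tot";
    s1 = s2;                        // s1 gets its own copy of "Tot"
    s2[1] = 'u';
    assert(s1 == "Tot" && s2 == "Tut");

    assert(middle(s1, 1, 2) == std::string(s1, 1, 2));   // both are "ot"

    std::string s;
    s.assign("Frodo", 3);           // first three characters of a C-style string
    assert(s == "Fro");
    s.assign(3, 'x');               // three copies of 'x'
    assert(s == "xxx");
    s = 'a';                        // assignment of a single character
    assert(s == "a");
}
```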
As mentioned in §11.12, it is possible to optimize a string so that copying doesn’t actually take
place until two copies of a string are needed. The design of the standard string encourages implementations that minimize actual copying. This makes read-only uses of strings and passing of
strings as function arguments much cheaper than one could naively have assumed. However, it
would be equally naive for programmers not to check their implementations before writing code
that relied on string copy being optimized (§20.6[13]).
20.3.7 Conversion to C-Style Strings [string.conv]
As shown in §20.3.4, a string can be initialized by a C-style string and C-style strings can be
assigned to strings. Conversely, it is possible to place a copy of the characters of a string into an
array:
    template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
    class basic_string {
    public:
        // ...
        // conversion to C-style string:

        const Ch* c_str() const;
        const Ch* data() const;
        size_type copy(Ch* p, size_type n, size_type pos = 0) const;
        // ...
    };
The data() function writes the characters of the string into an array and returns a pointer to that
array. The array is owned by the string, and the user should not try to delete it. The user also cannot rely on its value after a subsequent call on a non-const function on the string. The c_str()
function is like data(), except that it adds a 0 (zero) at the end as a C-string-style terminator. For
example:
    void f()
    {
        string s = "equinox";          // s.length()==7
        const char* p1 = s.data();     // p1 points to seven characters
        printf("p1 = %s\n",p1);        // bad: missing terminator
        p1[2] = 'a';                   // error: p1 points to a const array
        s[2] = 'a';
        char c = p1[1];                // bad: access of s.data() after modification of s

        const char* p2 = s.c_str();    // p2 points to eight characters
        printf("p2 = %s\n",p2);        // ok: c_str() adds terminator
    }
In other words, data() produces an array of characters, whereas c_str() produces a C-style string.
These functions are primarily intended to allow simple use of functions that take C-style strings.
Consequently, c_str() tends to be more useful than data(). For example:
    void f(string s)
    {
        int i = atoi(s.c_str());    // get int value of digits in string (§20.4.1)
        // ...
    }
Typically, it is best to leave characters in a string until you need them. However, if you can’t use
the characters immediately, you can copy them into an array rather than leave them in the buffer
allocated by c_str() or data(). The copy() function is provided for that. For example:
    char* c_string(const string& s)
    {
        char* p = new char[s.length()+1];    // note: +1
        s.copy(p,string::npos);
        p[s.length()] = 0;                   // note: add terminator
        return p;
    }
A call s.copy(p,n,m) copies at most n characters to p starting with s[m]. If there are fewer
than n characters in s to copy, copy() simply copies all the characters there are.
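The behavior of copy() described above can be checked with the following sketch. The function c_string_buffer is a hypothetical variant of the c_string() idea that hands ownership to a vector<char>, so that no explicit delete is needed; copy_demo is likewise a name invented for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Copy s into a zero-terminated buffer owned by a vector.
std::vector<char> c_string_buffer(const std::string& s)
{
    std::vector<char> buf(s.length() + 1);   // note: +1 for the terminator
    s.copy(&buf[0], s.length());             // copy() adds no terminator
    buf[s.length()] = 0;                     // so add it explicitly
    return buf;
}

void copy_demo()
{
    std::string s = "equinox";

    char part[4] = { 0, 0, 0, 0 };
    std::string::size_type n = s.copy(part, 3, 2);   // at most 3 characters from s[2]
    assert(n == 3 && std::string(part) == "uin");

    char rest[8] = { 0 };
    n = s.copy(rest, 100, 4);                        // only 3 characters remain
    assert(n == 3 && std::string(rest) == "nox");

    std::vector<char> buf = c_string_buffer(s);
    assert(buf.size() == 8 && std::string(&buf[0]) == "equinox");
}
```

The return value of copy() is the number of characters actually copied, which is what makes the "at most n" rule easy to observe.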
Note that a string can contain the 0 character. Functions manipulating C-style strings will
interpret such a 0 as a terminator. Be careful to put 0s into a string only if you don’t apply C-style functions to it or if you put the 0 there exactly to be a terminator.
Conversion to a C-style string could have been provided by an operator const char*() rather
than c_str(). This would have provided the convenience of an implicit conversion at the cost of
surprises in cases in which such a conversion was unexpected.
If you find c_str() appearing in your program with great frequency, it is probably because you
rely heavily on C-style interfaces. Often, an interface that relies on strings rather than C-style
strings is available and can be used to eliminate the conversions. Alternatively, you can avoid most
of the explicit calls of c_str() by providing additional definitions of the functions that caused you
to write the c_str() calls:
    extern "C" int atoi(const char*);

    int atoi(const string& s)
    {
        return atoi(s.c_str());
    }
20.3.8 Comparisons [string.compare]
Strings can be compared to strings of their own type and to arrays of characters with the same character type:
    template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
    class basic_string {
    public:
        // ...
        int compare(const basic_string& s) const;    // combined > and ==
        int compare(const Ch* p) const;

        int compare(size_type pos, size_type n, const basic_string& s) const;
        int compare(size_type pos, size_type n,
                    const basic_string& s, size_type pos2, size_type n2) const;
        int compare(size_type pos, size_type n, const Ch* p, size_type n2 = npos) const;
        // ...
    };
When an argument n is supplied, only the n first characters will be compared. The comparison criterion used is char_traits<Ch>’s compare() (§20.2.1). Thus, s.compare(s2) returns 0 if the
strings have the same value, a negative number if s is lexicographically before s2, and a positive
number otherwise.
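The sign convention of compare() and its substring forms can be sketched as below. The helpers sign, starts_with, and compare_demo are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <string>

// Reduce the result of compare() to -1, 0, or 1.
int sign(int r) { return (r < 0) ? -1 : (r > 0) ? 1 : 0; }

// True if s starts with prefix, using the (pos,n,s) form of compare().
bool starts_with(const std::string& s, const std::string& prefix)
{
    return s.compare(0, prefix.size(), prefix) == 0;
}

void compare_demo()
{
    std::string s = "Asterix";

    assert(sign(s.compare("Asterix")) == 0);     // same value
    assert(sign(s.compare("Obelix")) == -1);     // 'A' comes before 'O'
    assert(sign(s.compare("Aster")) == 1);       // equal prefix, but s is longer

    assert(starts_with(s, "Aster"));
    assert(!starts_with(s, "Obel"));

    // Compare s[2..4] ("ter") with other[2..4] ("ter").
    std::string other = "xxterxx";
    assert(s.compare(2, 3, other, 2, 3) == 0);
}
```

Only the sign of the result is tested, since the text specifies no more than negative, zero, or positive.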
A user cannot supply a comparison criterion the way it was done in §13.4. When that degree of
flexibility is needed, we can use lexicographical_compare() (§18.9), define a function like the
one in §13.4, or write an explicit loop. For example, the toupper() function (§20.4.2) allows us to
write case-insensitive comparisons:
    int cmp_nocase(const string& s, const string& s2)
    {
        string::const_iterator p = s.begin();
        string::const_iterator p2 = s2.begin();

        while (p!=s.end() && p2!=s2.end()) {
            if (toupper(*p)!=toupper(*p2)) return (toupper(*p)<toupper(*p2)) ? -1 : 1;
            ++p;
            ++p2;
        }
        return (s2.size()==s.size()) ? 0 : (s.size()<s2.size()) ? -1 : 1;    // size is unsigned
    }

    void f(const string& s, const string& s2)
    {
        if (s == s2) {                   // case sensitive compare of s and s2
            // ...
        }
        if (cmp_nocase(s,s2)== 0) {      // case insensitive compare of s and s2
            // ...
        }
        // ...
    }
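The lexicographical_compare() alternative mentioned above might look as follows. This is a sketch under stated assumptions, not the approach used in the text: less_nocase, before_nocase, and equal_nocase are hypothetical names, and the cast to unsigned char is an added precaution because toupper() is only defined for values representable as unsigned char (and EOF).

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Case-insensitive "less than" for single characters.
bool less_nocase(char a, char b)
{
    return std::toupper(static_cast<unsigned char>(a))
         < std::toupper(static_cast<unsigned char>(b));
}

// True if s comes before s2 when case is ignored.
bool before_nocase(const std::string& s, const std::string& s2)
{
    return std::lexicographical_compare(s.begin(), s.end(),
                                        s2.begin(), s2.end(), less_nocase);
}

// Equality ignoring case: neither string comes before the other.
bool equal_nocase(const std::string& s, const std::string& s2)
{
    return !before_nocase(s, s2) && !before_nocase(s2, s);
}
```

Unlike a three-way function, lexicographical_compare() answers only "is the first sequence less?", so equality needs two calls here; an explicit loop remains the cheaper choice when a three-way result is wanted.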
The usual comparison operators ==, !=, >, <, >=, and <= are provided for basic_strings:
    template<class Ch, class Tr, class A>
    bool operator==(const basic_string<Ch,Tr,A>&, const basic_string<Ch,Tr,A>&);

    template<class Ch, class Tr, class A>
    bool operator==(const Ch*, const basic_string<Ch,Tr,A>&);

    template<class Ch, class Tr, class A>
    bool operator==(const basic_string<Ch,Tr,A>&, const Ch*);

    // similar declarations for !=, >, <, >=, and <=
Comparison operators are nonmember functions so that conversions can be applied in the same way
to both operands (§11.2.3). The versions taking C-style strings are provided to optimize comparisons against string literals. For example:
    void f(const string& name)
    {
        if (name =="Obelix" || "Asterix"==name) {    // use optimized ==
            // ...
        }
    }
20.3.9 Insert [string.insert]
Once created, a string can be manipulated in many ways. Of the operations that modify the value
of a string, one of the most common is appending to it – that is, adding characters to the end.
Insertion at other points of a string is rarer:
    template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
    class basic_string {
    public:
        // ...
        // add characters after (*this)[length()-1]:

        basic_string& operator+=(const basic_string& s);
        basic_string& operator+=(const Ch* p);
        basic_string& operator+=(Ch c);

        void push_back(Ch c);

        basic_string& append(const basic_string& s);
        basic_string& append(const basic_string& s, size_type pos, size_type n);
        basic_string& append(const Ch* p, size_type n);
        basic_string& append(const Ch* p);
        basic_string& append(size_type n, Ch c);
        template<class In> basic_string& append(In first, In last);

        // insert characters before (*this)[pos]:

        basic_string& insert(size_type pos, const basic_string& s);
        basic_string& insert(size_type pos, const basic_string& s, size_type pos2, size_type n);
        basic_string& insert(size_type pos, const Ch* p, size_type n);
        basic_string& insert(size_type pos, const Ch* p);
        basic_string& insert(size_type pos, size_type n, Ch c);

        // insert characters before p:

        iterator insert(iterator p, Ch c);
        void insert(iterator p, size_type n, Ch c);
        template<class In> void insert(iterator p, In first, In last);
        // ...
    };
Basically, the variety of operations provided for initializing a string and assigning to a string is also
available for appending and for inserting characters before some character position.
The += operator is provided as the conventional notation for the most common forms of
append. For example:
    string complete_name(const string& first_name, const string& family_name)
    {
        string s = first_name;
        s += ' ';
        s += family_name;
        return s;
    }
Appending to the end can be noticeably more efficient than inserting into other positions. For
example:
    string complete_name2(const string& first_name, const string& family_name)    // poor algorithm
    {
        string s = family_name;
        s.insert(s.begin(),' ');
        return s.insert(0,first_name);
    }
Insertion usually forces the string implementation to do extra memory management and to move
characters around.
Because string has a push_back() operation (§16.3.5), a back_inserter can be used for a
string exactly as for general containers.
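As an illustration of back_inserter with a string, and of the position-based and iterator-based insert() forms, consider the sketch below. The names without_spaces and insert_demo are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>

// Copy the characters of in that are not spaces to the end of a new
// string; back_inserter calls push_back() for each character written.
std::string without_spaces(const std::string& in)
{
    std::string out;
    std::remove_copy(in.begin(), in.end(), std::back_inserter(out), ' ');
    return out;
}

void insert_demo()
{
    assert(without_spaces("a b  c") == "abc");

    std::string s = "Stroustrup";
    s.insert(0, "Bjarne ");            // insert before s[0]
    assert(s == "Bjarne Stroustrup");

    s.insert(s.begin() + 6, ',');      // insert before the character at position 6
    assert(s == "Bjarne, Stroustrup");

    s.append(3, '!');                  // three copies of '!' at the end
    assert(s == "Bjarne, Stroustrup!!!");
}
```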
20.3.10 Concatenation [string.cat]
Appending is a special form of concatenation. Concatenation – constructing a string out of two
strings by placing one after the other – is provided by the + operator:
    template<class Ch, class Tr, class A>
    basic_string<Ch,Tr,A>
    operator+(const basic_string<Ch,Tr,A>&, const basic_string<Ch,Tr,A>&);

    template<class Ch, class Tr, class A>
    basic_string<Ch,Tr,A> operator+(const Ch*, const basic_string<Ch,Tr,A>&);

    template<class Ch, class Tr, class A>
    basic_string<Ch,Tr,A> operator+(Ch, const basic_string<Ch,Tr,A>&);

    template<class Ch, class Tr, class A>
    basic_string<Ch,Tr,A> operator+(const basic_string<Ch,Tr,A>&, const Ch*);

    template<class Ch, class Tr, class A>
    basic_string<Ch,Tr,A> operator+(const basic_string<Ch,Tr,A>&, Ch);
As usual, + is defined as a nonmember function. For templates with several template parameters,
this implies a notational disadvantage, since the template parameters are mentioned repeatedly.
On the other hand, use of concatenation is obvious and convenient. For example:
    string complete_name3(const string& first_name, const string& family_name)
    {
        return first_name + ' ' + family_name;
    }
This notational convenience may be bought at the cost of some run-time overhead compared to
complete_name(). One extra temporary (§11.3.2) is needed in complete_name3(). In my experience, this is rarely important, but it is worth remembering when writing an inner loop of a program where performance matters. In that case, we might even consider avoiding a function call by
making complete_name() inline and composing the result string in place using lower-level operations (§20.6[14]).
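One way to avoid the extra temporary can be sketched as follows: space is reserved once and the parts are appended in place. The name complete_name_in_place is hypothetical (it is not one of the complete_name() functions of §20.3.9), and whether this is measurably faster depends on the implementation, so measure before relying on it.

```cpp
#include <string>

// Hypothetical variant of complete_name3(): compose the result in place
// instead of building it from the temporaries that + creates.
inline std::string complete_name_in_place(const std::string& first_name,
                                          const std::string& family_name)
{
    std::string s;
    s.reserve(first_name.size() + 1 + family_name.size()); // at most one allocation
    s += first_name;
    s += ' ';
    s += family_name;
    return s;
}
```

The result is the same string that first_name + ' ' + family_name yields; only the way it is built differs.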
20.3.11 Find [string.find]
There is a bewildering variety of functions for finding substrings:
template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
class basic_string {
public:
    // ...
    // find subsequence (like search() §18.5.5):
    size_type find(const basic_string& s, size_type i = 0) const;
    size_type find(const Ch* p, size_type i, size_type n) const;
    size_type find(const Ch* p, size_type i = 0) const;
    size_type find(Ch c, size_type i = 0) const;

    // find subsequence searching backwards from the end (like find_end(), §18.5.5):
    size_type rfind(const basic_string& s, size_type i = npos) const;
    size_type rfind(const Ch* p, size_type i, size_type n) const;
    size_type rfind(const Ch* p, size_type i = npos) const;
    size_type rfind(Ch c, size_type i = npos) const;

    // find character (like find_first_of() in §18.5.2):
    size_type find_first_of(const basic_string& s, size_type i = 0) const;
    size_type find_first_of(const Ch* p, size_type i, size_type n) const;
    size_type find_first_of(const Ch* p, size_type i = 0) const;
    size_type find_first_of(Ch c, size_type i = 0) const;

    // find character from argument searching backwards from the end:
    size_type find_last_of(const basic_string& s, size_type i = npos) const;
    size_type find_last_of(const Ch* p, size_type i, size_type n) const;
    size_type find_last_of(const Ch* p, size_type i = npos) const;
    size_type find_last_of(Ch c, size_type i = npos) const;

    // find character not in argument:
    size_type find_first_not_of(const basic_string& s, size_type i = 0) const;
    size_type find_first_not_of(const Ch* p, size_type i, size_type n) const;
    size_type find_first_not_of(const Ch* p, size_type i = 0) const;
    size_type find_first_not_of(Ch c, size_type i = 0) const;

    // find character not in argument searching backwards from the end:
    size_type find_last_not_of(const basic_string& s, size_type i = npos) const;
    size_type find_last_not_of(const Ch* p, size_type i, size_type n) const;
    size_type find_last_not_of(const Ch* p, size_type i = npos) const;
    size_type find_last_not_of(Ch c, size_type i = npos) const;
    // ...
};
These are all const members. That is, they exist to locate a substring for some use, but they do not
change the value of the string to which they are applied.
The meaning of the basic_string::find functions can be understood from their general algorithm equivalents. Consider an example:
void f()
{
    string s = "accdcde";
    string::size_type i1 = s.find("cd");               // i1 = 2   s[2]=='c' && s[3]=='d'
    string::size_type i2 = s.rfind("cd");              // i2 = 4   s[4]=='c' && s[5]=='d'
    string::size_type i3 = s.find_first_of("cd");      // i3 = 1   s[1] == 'c'
    string::size_type i4 = s.find_last_of("cd");       // i4 = 5   s[5] == 'd'
    string::size_type i5 = s.find_first_not_of("cd");  // i5 = 0   s[0]!='c' && s[0]!='d'
    string::size_type i6 = s.find_last_not_of("cd");   // i6 = 6   s[6]!='c' && s[6]!='d'
}
If a find() function fails to find anything, it returns npos, which represents an illegal character
position. If npos is used as a character position, out_of_range will be thrown (§20.3.5).
Note that the result of a find() is an unsigned value.
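Because the result is unsigned, a failed search has to be detected by comparing with npos rather than by testing for a negative value. A minimal sketch, using the hypothetical helper names contains and count_occurrences:

```cpp
#include <string>

// Hypothetical helper: true if sub occurs somewhere in s.
bool contains(const std::string& s, const std::string& sub)
{
    return s.find(sub) != std::string::npos;   // compare with npos, never with <0
}

// Hypothetical helper: number of non-overlapping occurrences of sub in s.
std::string::size_type count_occurrences(const std::string& s, const std::string& sub)
{
    if (sub.empty()) return 0;                 // avoid looping forever on ""
    std::string::size_type count = 0;
    std::string::size_type pos = s.find(sub);
    while (pos != std::string::npos) {
        ++count;
        pos = s.find(sub, pos + sub.size());   // continue after this match
    }
    return count;
}
```

For the string "accdcde" used above, contains(s,"cd") is true and count_occurrences(s,"cd") is 2 (the matches at positions 2 and 4).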
20.3.12 Replace [string.replace]
Once a position in a string is identified, the value of individual character positions can be changed
using subscripting or whole substrings can be replaced with new characters using replace():
template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
class basic_string {
public:
    // ...
    // replace [ (*this)[i], (*this)[i+n] [ with other characters:
    basic_string& replace(size_type i, size_type n, const basic_string& s);
    basic_string& replace(size_type i, size_type n,
                          const basic_string& s, size_type i2, size_type n2);
    basic_string& replace(size_type i, size_type n, const Ch* p, size_type n2);
    basic_string& replace(size_type i, size_type n, const Ch* p);
    basic_string& replace(size_type i, size_type n, size_type n2, Ch c);

    basic_string& replace(iterator i, iterator i2, const basic_string& s);
    basic_string& replace(iterator i, iterator i2, const Ch* p, size_type n);
    basic_string& replace(iterator i, iterator i2, const Ch* p);
    basic_string& replace(iterator i, iterator i2, size_type n, Ch c);
    template<class In> basic_string& replace(iterator i, iterator i2, In j, In j2);

    // remove characters from string (‘‘replace with nothing’’):
    basic_string& erase(size_type i = 0, size_type n = npos);
    iterator erase(iterator i);
    iterator erase(iterator first, iterator last);
    // ...
};
Note that the number of new characters need not be the same as the number of characters previously in the string. The size of the string is changed to accommodate the new substring. In particular, erase() simply removes a substring and adjusts its size accordingly. For example:
void f()
{
    string s = "but I have heard it works even if you don't believe in it";
    s.erase(0,4);                        // erase initial "but "
    s.replace(s.find("even"),4,"only");
    s.replace(s.find("don't"),5,"");     // erase by replacing with ""
}
The simple call erase(), with no argument, makes the string into an empty string. This is the
operation that is called clear() for general containers (§16.3.6).
The variety of replace() functions matches that of assignment. After all, replace() is an
assignment to a substring.
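Combining find() and replace() gives a ‘‘replace every occurrence’’ operation, which the standard string does not provide as a member. The following is a minimal sketch; the name replace_all is an assumption, not a standard library function.

```cpp
#include <string>

// Hypothetical helper: return a copy of s in which every occurrence of
// from has been replaced by to. The replacement text is not rescanned.
std::string replace_all(std::string s, const std::string& from, const std::string& to)
{
    if (from.empty()) return s;                // nothing sensible to search for
    std::string::size_type pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos) {
        s.replace(pos, from.size(), to);       // the string grows or shrinks as needed
        pos += to.size();                      // skip past what was just inserted
    }
    return s;
}
```

Because the search resumes after the inserted text, a call such as replace_all(s,"-","--") terminates even though the replacement contains the pattern.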
20.3.13 Substrings [string.sub]
The substr() function lets you specify a substring as a position plus a length:
template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
class basic_string {
public:
    // ...
    // address substring:
    basic_string substr(size_type i = 0, size_type n = npos) const;
    // ...
};
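As a small illustration of the position-plus-length notation, the sketch below uses find() to compute the position and lets the length default to npos, meaning ‘‘the rest of the string.’’ The helper names first_word and after_first_space are hypothetical.

```cpp
#include <string>

// Hypothetical helper: the characters before the first space
// (the whole string if there is no space, because find() then yields npos).
std::string first_word(const std::string& s)
{
    return s.substr(0, s.find(' '));
}

// Hypothetical helper: the characters after the first space, or "" if there is none.
std::string after_first_space(const std::string& s)
{
    std::string::size_type i = s.find(' ');
    if (i == std::string::npos) return std::string();
    return s.substr(i + 1);                    // n defaults to npos: the rest of s
}
```

Note that substr() returns a copy; changing the result does not affect the original string.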
The substr() function is simply a way of reading a part of a string. On the other hand, replace()
lets you write to a substring. Both rely on the low-level position plus number of characters notation. However, find() lets us find substrings by value. Together, they allow us to define a substring that can be used for both reading and writing:
template<class Ch> class Basic_substring {
public:
    typedef typename basic_string<Ch>::size_type size_type;

    Basic_substring(basic_string<Ch>& s, size_type i, size_type n);     // s[i]..s[i+n-1]
    Basic_substring(basic_string<Ch>& s, const basic_string<Ch>& s2);   // s2 in s
    Basic_substring(basic_string<Ch>& s, const Ch* p);                  // *p in s

    Basic_substring& operator=(const basic_string<Ch>&);    // write through to *ps
    Basic_substring& operator=(const Basic_substring<Ch>&);
    Basic_substring& operator=(const Ch*);
    Basic_substring& operator=(Ch);

    operator basic_string<Ch>() const;                      // read from *ps
    operator Ch* () const;
private:
    basic_string<Ch>* ps;
    size_type pos;
    size_type n;
};
The implementation is largely trivial. For example:
template<class Ch>
Basic_substring<Ch>::Basic_substring(basic_string<Ch>& s, const basic_string<Ch>& s2)
    :ps(&s), n(s2.length())
{
    pos = s.find(s2);
}

template<class Ch>
Basic_substring<Ch>& Basic_substring<Ch>::operator=(const basic_string<Ch>& s)
{
    ps->replace(pos,n,s);    // write through to *ps
    return *this;
}

template<class Ch> Basic_substring<Ch>::operator basic_string<Ch>() const
{
    return basic_string<Ch>(*ps,pos,n);    // copy from *ps
}
If s2 isn’t found in s, pos will be npos. Attempts to read or write it will throw out_of_range
(§20.3.5).
This Basic_substring can be used like this:
typedef Basic_substring<char> Substring;

void f()
{
    string s = "Mary had a little lamb";
    Substring(s,"lamb") = "fun";
    Substring(s,"a little") = "no";
    string s2 = "Joe" + Substring(s,s.find(' '),string::npos);
}
Naturally, this would be much more interesting if Substring could do some pattern matching
(§20.6[7]).
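To make the write-through idea concrete, here is a deliberately cut-down, non-template sketch. The class Substring_sketch and the function rewrite_mary are hypothetical names; only one constructor and one assignment are provided, and a named str() member stands in for the conversion operators. It is not a completed Basic_substring (see §20.6[6]).

```cpp
#include <stdexcept>
#include <string>

// Hypothetical, simplified relative of Basic_substring<char>.
class Substring_sketch {
public:
    Substring_sketch(std::string& s, const std::string& s2)
        : ps(&s), pos(s.find(s2)), n(s2.length()) { }

    Substring_sketch& operator=(const std::string& s)
    {
        ps->replace(pos, n, s);   // write through; throws std::out_of_range if pos==npos
        n = s.length();           // keep designating the text just written
        return *this;
    }

    std::string str() const { return std::string(*ps, pos, n); }  // read from *ps
private:
    std::string* ps;
    std::string::size_type pos;
    std::string::size_type n;
};

// Applies the two assignments from the example above to a fresh string.
std::string rewrite_mary()
{
    std::string s = "Mary had a little lamb";
    Substring_sketch(s, "lamb") = "fun";
    Substring_sketch(s, "a little") = "no";
    return s;
}
```

After the two assignments the string reads "Mary had no fun". Assigning through a substring that was not found throws std::out_of_range from replace().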
20.3.14 Size and Capacity [string.capacity]
Memory-related issues are handled much as they are for vector (§16.3.8):
template<class Ch, class Tr = char_traits<Ch>, class A = allocator<Ch> >
class basic_string {
public:
    // ...
    // size, capacity, etc. (like §16.3.8):
    size_type size() const;                        // number of characters (§20.3.4)
    size_type max_size() const;                    // largest possible string
    size_type length() const { return size(); }
    bool empty() const { return size()==0; }

    void resize(size_type n, Ch c);
    void resize(size_type n) { resize(n,Ch()); }

    size_type capacity() const;                    // like vector: §16.3.8
    void reserve(size_type res_arg = 0);           // like vector: §16.3.8

    allocator_type get_allocator() const;
};
A call reserve(res_arg) throws length_error if res_arg>max_size().
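The difference between size and capacity can be illustrated by a short sketch: reserve() changes only how much room is available, whereas resize() changes the number of characters. The helper names repeat and pad_right are hypothetical.

```cpp
#include <string>

// Hypothetical helper: count copies of unit, with room reserved up front
// so that appending need not reallocate repeatedly.
std::string repeat(const std::string& unit, std::string::size_type count)
{
    std::string s;
    s.reserve(unit.size() * count);   // capacity grows; size() is still 0
    for (std::string::size_type i = 0; i < count; ++i) s += unit;
    return s;
}

// Hypothetical helper: extend s to at least width characters using fill.
std::string pad_right(std::string s, std::string::size_type width, char fill)
{
    if (s.size() < width) s.resize(width, fill);   // adds width-size() copies of fill
    return s;
}
```

For example, repeat("ab",3) yields "ababab" and pad_right("ab",4,'.') yields "ab..".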
20.3.15 I/O Operations [string.io]
One of the main uses of strings is as the target of input and as the source of output:
template<class Ch, class Tr, class A>
basic_istream<Ch,Tr>& operator>>(basic_istream<Ch,Tr>&, basic_string<Ch,Tr,A>&);

template<class Ch, class Tr, class A>
basic_ostream<Ch,Tr>& operator<<(basic_ostream<Ch,Tr>&, const basic_string<Ch,Tr,A>&);

template<class Ch, class Tr, class A>
basic_istream<Ch,Tr>& getline(basic_istream<Ch,Tr>&, basic_string<Ch,Tr,A>&, Ch eol);

template<class Ch, class Tr, class A>
basic_istream<Ch,Tr>& getline(basic_istream<Ch,Tr>&, basic_string<Ch,Tr,A>&);
The << operator writes a string to an ostream (§21.2.1). The >> operator reads a whitespace-terminated word (§3.6, §21.3.1) to its string, expanding the string as needed to hold the word. Initial whitespace is skipped, and the terminating whitespace character is not entered into the string.
The getline() function reads a line terminated by eol to its string, expanding the string as needed
to hold the line (§3.6). If no eol argument is provided, a newline '\n' is used as the delimiter. The
line terminator is removed from the stream but not entered into the string. Because a string
expands to hold the input, there is no reason to leave the terminator in the stream or to provide a
count of characters read in the way get() and getline() do for character arrays (§21.3.4).
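The difference between >> and getline() can be seen in a small sketch. The helper names read_words, read_lines, and words_of are hypothetical; words_of uses an istringstream simply as a convenient source of input.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read whitespace-terminated words until input fails.
std::vector<std::string> read_words(std::istream& in)
{
    std::vector<std::string> words;
    std::string w;
    while (in >> w) words.push_back(w);        // leading whitespace is skipped
    return words;
}

// Hypothetical helper: read lines terminated by eol; the terminator is
// removed from the stream but not stored in the string.
std::vector<std::string> read_lines(std::istream& in, char eol = '\n')
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line, eol)) lines.push_back(line);
    return lines;
}

// Hypothetical convenience: split text into words by reading it as a stream.
std::vector<std::string> words_of(const std::string& text)
{
    std::istringstream in(text);
    return read_words(in);
}
```

Given the input "  Mary had\na little  lamb\n", read_words() yields five words, whereas read_lines() yields the two lines with their internal spacing intact.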
20.3.16 Swap [string.swap]
As for vectors (§16.3.9), a swap() function for strings can be much more efficient than the general
algorithm, so a specific version is provided:
template<class Ch, class Tr, class A>
void swap(basic_string<Ch,Tr,A>&, basic_string<Ch,Tr,A>&);
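A brief sketch of swap() in use follows; the helper name order_by_length is hypothetical. An implementation is free to exchange the internal representations instead of copying characters, but how much that saves is implementation-dependent.

```cpp
#include <string>

// Hypothetical helper: after the call, longer holds the longer of the two strings.
void order_by_length(std::string& shorter, std::string& longer)
{
    if (longer.size() < shorter.size())
        std::swap(shorter, longer);    // selects the version provided for strings
}
```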
20.4 The C Standard Library [string.cstd]
The C++ standard library inherited the C-style string functions from the C standard library. This
section lists some of the most useful C string functions. The description is not meant to be exhaustive; for further information, check your reference manual. Beware that implementers often add
their own nonstandard functions to the standard header files, so it is easy to get confused about
which functions are guaranteed to be available on every implementation.
The headers presenting the standard C library facilities are listed in §16.1.2. Memory management functions can be found in §19.4.6, C I/O functions in §21.8, and the C math library in §22.3.
The functions concerned with startup and termination are described in §3.2 and §9.4.1.1, and the
facilities for reading unspecified function arguments are presented in §7.6. C-style functions for
wide character strings are found in <cwchar> and <wchar.h>.
20.4.1 C-Style Strings [string.c]
Functions for manipulating C-style strings are found in <string.h> and <cstring>:
char* strcpy(char* p, const char* q);            // copy from q into p (incl. terminator)
char* strcat(char* p, const char* q);            // append from q to p (incl. terminator)
char* strncpy(char* p, const char* q, int n);    // copy n char from q into p
char* strncat(char* p, const char* q, int n);    // append n char from q to p

size_t strlen(const char* p);                    // length of p (not counting the terminator)

int strcmp(const char* p, const char* q);            // compare: p and q
int strncmp(const char* p, const char* q, int n);    // compare first n char

char* strchr(char* p, int c);                    // find first c in p
const char* strchr(const char* p, int c);
char* strrchr(char* p, int c);                   // find last c in p
const char* strrchr(const char* p, int c);
char* strstr(char* p, const char* q);            // find first q in p
const char* strstr(const char* p, const char* q);
char* strpbrk(char* p, const char* q);           // find first char from q in p
const char* strpbrk(const char* p, const char* q);

size_t strspn(const char* p, const char* q);     // number of char in p before a char not in q
size_t strcspn(const char* p, const char* q);    // number of char in p before any char in q
A pointer is assumed to be nonzero, and the array of char that it points to is assumed to be terminated by 0. The strn-functions pad with 0 if there are not n characters to copy. String comparisons
return 0 if the strings are equal, a negative number if the first argument is lexicographically before
the second, and a positive number otherwise.
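A few of these functions in use, as a minimal sketch with hypothetical helper names. Only the sign of the value returned by strcmp() is meaningful, so compare_sign reduces it to -1, 0, or 1; strspn() counts leading characters that are in its second argument, and strcspn() counts leading characters that are not.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: -1, 0, or 1 according to the sign of strcmp(p,q).
int compare_sign(const char* p, const char* q)
{
    int r = std::strcmp(p, q);
    return (r > 0) - (r < 0);
}

// Hypothetical helper: number of leading decimal digits in p.
std::size_t leading_digits(const char* p)
{
    return std::strspn(p, "0123456789");
}

// Hypothetical helper: number of leading characters of p that are not in stops.
std::size_t length_before(const char* p, const char* stops)
{
    return std::strcspn(p, stops);
}
```

For example, leading_digits("123abc") is 3 and length_before("name: value",":") is 4.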
Naturally, C doesn’t provide the pairs of overloaded functions. However, they are needed in
C++ for const safety. For example:
void f(const char* pcc, char* pc)    // C++
{
    *strchr(pcc,'a') = 'b';    // error: cannot assign to const char
    *strchr(pc,'a') = 'b';     // ok, but sloppy: there might not be an 'a' in pc
}
The C++ strchr() does not allow you to write to a const. However, a C program may ‘‘take
advantage’’ of the weaker type checking in the C strchr():
char* strchr(const char* p, int c);    /* C standard library function, not C++ */

void g(const char* pcc, char* pc)      /* C, will not compile in C++ */
{
    *strchr(pcc,'a') = 'b';    /* converts const to non-const: ok in C, error in C++ */
    *strchr(pc,'a') = 'b';     /* ok in C and C++ */
}
Whenever possible, C-style strings are best avoided in favor of strings. C-style strings and their
associated standard functions can be used to produce very efficient code, but even experienced C
and C++ programmers are prone to make uncaught ‘‘silly errors’’ when using them. However, no
C++ programmer can avoid seeing some of these functions in old code. Here is a nonsense example illustrating the most common functions:
void f(char* p, char* q)
{
    if (p==q) return;            // pointers are equal
    if (strcmp(p,q)==0) {        // string values are equal
        int i = strlen(p);       // number of characters (not counting the terminator)
        // ...
    }
    char buf[200];
    strcpy(buf,p);               // copy p into buf (including the terminator)
                                 // sloppy: will overflow some day.
    strncpy(buf,p,200);          // copy 200 char from p into buf
                                 // sloppy: will fail to copy the terminator some day.
    // ...
}
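The two ‘‘sloppy’’ calls can be made less fragile by copying at most one character fewer than the buffer holds and writing the terminator explicitly. The following is a sketch under that assumption; copy_bounded is a hypothetical helper, not a standard library function.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy p into buf (which has room for buf_size chars),
// always leaving buf zero-terminated. Returns false if p had to be truncated.
bool copy_bounded(char* buf, std::size_t buf_size, const char* p)
{
    if (buf_size == 0) return false;           // no room even for the terminator
    std::strncpy(buf, p, buf_size - 1);        // never writes past buf[buf_size-2]
    buf[buf_size - 1] = 0;                     // strncpy() may not have terminated buf
    return std::strlen(p) < buf_size;
}
```

A string avoids the problem altogether because it expands to hold whatever is assigned to it.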
Input and output of C-style strings are usually done using the printf family of functions (§21.8).
In <stdlib.h> and <cstdlib>, the standard library provides useful functions for converting
strings representing numeric values into numeric values:
double atof(const char* p);    // convert p to double
int atoi(const char* p);       // convert p to int
long atol(const char* p);      // convert p to long
Leading whitespace is ignored. If the string doesn’t represent a number, zero is returned. For
example, the value of atoi("seven") is 0. If the string represents a number that cannot be represented in the intended result type, errno (§16.1.2, §22.3) is set to ERANGE and an appropriately
huge or tiny value is returned.
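Because a result of zero does not distinguish "0" from "seven", a caller that needs to detect bad input can use strtol(), also declared in <cstdlib>, which reports where conversion stopped. This is a different function from the ones listed above, and the wrapper parse_long is a hypothetical name used only for this sketch.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: convert the start of p to long. Returns false, leaving
// result unchanged, if p does not start with a number or the value is out of range.
// Characters following the number (as in "12abc") are ignored.
bool parse_long(const char* p, long& result)
{
    char* end = 0;
    errno = 0;                                 // so that a stale ERANGE is not misread
    long v = std::strtol(p, &end, 10);
    if (end == p) return false;                // no digits were converted
    if (errno == ERANGE) return false;         // value does not fit in a long
    result = v;
    return true;
}
```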
20.4.2 Character Classification [string.isalpha]
In <ctype.h> and <cctype>, the standard library provides a set of useful functions for dealing with
ASCII and similar character sets:
int isalpha(int);     // letter: 'a'..'z' 'A'..'Z' in C locale (§20.2.1, §21.7)
int isupper(int);     // upper case letter: 'A'..'Z' in C locale (§20.2.1, §21.7)
int islower(int);     // lower case letter: 'a'..'z' in C locale (§20.2.1, §21.7)
int isdigit(int);     // '0'..'9'
int isxdigit(int);    // '0'..'9' or letter
int isspace(int);     // ' ' '\t' '\v' return newline formfeed
int iscntrl(int);     // control character (ASCII 0..31 and 127)
int ispunct(int);     // punctuation: none of the above
int isalnum(int);     // isalpha() | isdigit()
int isprint(int);     // printable: ascii ' '..'~'
int isgraph(int);     // isalpha() | isdigit() | ispunct()

int toupper(int c);   // uppercase equivalent to c
int tolower(int c);   // lowercase equivalent to c
All are usually implemented by a simple lookup, using the character as an index into a table of
character attributes. This means that constructs such as:
if (('a'<=c && c<='z') || ('A'<=c && c<='Z')) {    // alphabetic
    // ...
}
are inefficient in addition to being tedious to write and error-prone (on a machine with the EBCDIC
character set, this will accept nonalphabetic characters).
These functions take int arguments, and the integer passed must be representable as an
unsigned char or EOF (which is most often -1). This can be a problem on systems where char is
signed (see §20.6[11]).
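One common way to stay within that requirement is to convert each char to unsigned char before passing it on. A minimal sketch with hypothetical helper names:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: copy of s with every character mapped through toupper().
std::string to_upper_copy(std::string s)
{
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        unsigned char uc = static_cast<unsigned char>(s[i]);   // never negative
        s[i] = static_cast<char>(std::toupper(uc));
    }
    return s;
}

// Hypothetical helper: number of characters in s for which isalpha() holds.
int count_letters(const std::string& s)
{
    int n = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (std::isalpha(static_cast<unsigned char>(s[i]))) ++n;
    return n;
}
```

The conversion costs nothing at run time on typical machines and makes the call well defined even where char is signed and the string holds characters outside the ASCII range.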
Equivalent functions for wide characters are found in <cwctype> and <wctype.h>.
20.5 Advice [string.advice]
[1] Prefer string operations to C-style string functions; §20.4.1.
[2] Use strings as variables and members, rather than as base classes; §20.3, §25.2.1.
[3] You can pass strings as value arguments and return them by value to let the system take care
of memory management; §20.3.6.
[4] Use at() rather than iterators or [] when you want range checking; §20.3.2, §20.3.5.
[5] Use iterators and [] rather than at() when you want to optimize speed; §20.3.2, §20.3.5.
[6] Directly or indirectly, use substr() to read substrings and replace() to write substrings;
§20.3.12, §20.3.13.
[7] Use the find() operations to localize values in a string (rather than writing an explicit loop);
§20.3.11.
[8] Append to a string when you need to add characters efficiently; §20.3.9.
[9] Use strings as targets of non-time-critical character input; §20.3.15.
[10] Use string::npos to indicate ‘‘the rest of the string;’’ §20.3.5.
[11] If necessary, implement heavily-used strings using low-level operations (rather than using
low-level data structures everywhere); §20.3.10.
[12] If you use strings, catch length_error and out_of_range somewhere; §20.3.5.
[13] Be careful not to pass a char* with the value 0 to a string function; §20.3.7.
[14] Use c_str() to produce a C-style string representation of a string only when you have to;
§20.3.7.
[15] Use isalpha(), isdigit(), etc., when you need to know the classification of a character rather
than writing your own tests on character values; §20.4.2.
20.6 Exercises [string.exercises]
The solutions to several exercises for this chapter can be found by looking at the source text of an
implementation of the standard library. Do yourself a favor: try to find your own solutions before
looking to see how your library implementer approached the problems.
1. (∗2) Write a function that takes two strings and returns a string that is the concatenation of the
strings with a dot in the middle. For example, given file and write, the function returns
file.write. Do the same exercise with C-style strings using only C facilities such as malloc()
and strlen(). Compare the two functions. What are reasonable criteria for a comparison?
2. (∗2) Make a list of differences between vector and basic_string. Which differences are important?
3. (∗2) The string facilities are not perfectly regular. For example, you can assign a char to a
string, but you cannot initialize a string with a char. Make a list of such irregularities. Which
could have been eliminated without complicating the use of strings? What other irregularities
would this introduce?
4. (∗1.5) Class basic_string has a lot of members. Which could be made nonmember functions
without loss of efficiency or notational convenience?
5. (∗1.5) Write a version of back_inserter() (§19.2.4) that works for basic_string.
6. (∗2) Complete Basic_substring from §20.3.13 and integrate it with a String type that overloads
() to mean ‘‘substring of’’ and otherwise acts like string.
7. (∗2.5) Write a find() function that finds the first match for a simple regular expression in a
string. Use ? to mean ‘‘any character,’’ * to mean any number of characters not matching the
next part of the regular expression, and [abc] to mean any character from the set specified
between the square brackets (here a, b, and c). Other characters match themselves. For example,
find(s,"name:") returns a pointer to the first occurrence of name: in s;
find(s,"[nN]ame:") returns a pointer to the first occurrence of name: or Name: in s; and
find(s,"[nN]ame(*)") returns a pointer to the first occurrence of Name or name followed
by a (possibly empty) parenthesized sequence of characters in s.
8. (∗2.5) What operations do you find missing from the simple regular expression function from
§20.6[7]? Specify and add them. Compare the expressiveness of your regular expression
matcher to that of a widely distributed one. Compare the performance of your regular expression matcher to that of a widely distributed one.
9. (∗2.5) Use a regular expression library to implement pattern-matching operations on a String
class that has an associated Substring class.
10. (∗2.5) Consider writing an ‘‘ideal’’ class for general text processing. Call it Text. What facilities should it have? What implementation constraints and overheads are imposed by your set of
‘‘ideal’’ facilities?
11. (∗1.5) Define a set of overloaded versions for isalpha(), isdigit(), etc., so that these functions
work correctly for char, unsigned char, and signed char.
12. (∗2.5) Write a String class optimized for strings having no more than eight characters. Compare its performance to that of the String from §11.12 and your implementation’s version of the
standard library string. Is it possible to design a string that combines the advantages of a string
optimized for very short strings with the advantages of a perfectly general string?
13. (∗2) Measure the performance of copying of strings. Does your implementation’s implementation of string adequately optimize copying?
14. (∗2.5) Compare the performance of the three complete_name() functions from §20.3.9 and
§20.3.10. Try to write a version of complete_name() that runs as fast as possible. Keep a
record of mistakes found during its implementation and testing.
15. (∗2.5) Imagine that reading medium-long strings (most are 5 to 25 characters long) from cin is
the bottleneck in your system. Write an input function that reads such strings as fast as you can
think of. You can choose the interface to that function to optimize for speed rather than for convenience. Compare the result to your implementation’s >> for strings.
16. (∗1.5) Write a function itos(int) that returns a string representing its int argument.
21
Streams
What you see is all you get.
– Brian Kernighan
Input and output – ostreams – output of built-in types – output of user-defined types
– virtual output functions – istreams – input of built-in types – unformatted input
– stream state – input of user-defined types – I/O exceptions – tying of streams –
sentries – formatting integer and floating-point output – fields and adjustments –
manipulators – standard manipulators – user-defined manipulators – file streams –
closing streams – string streams – stream buffers – locale – stream callbacks –
printf() – advice – exercises.
21.1 Introduction [io.intro]
Designing and implementing a general input/output facility for a programming language is notoriously difficult. Traditionally, I/O facilities have been designed exclusively to handle a few built-in
data types. However, a nontrivial C++ program uses many user-defined types, and the input and
output of values of those types must be handled. An I/O facility should be easy, convenient, and
safe to use; efficient and flexible; and, above all, complete. Nobody has come up with a solution
that pleases everyone. It should therefore be possible for a user to provide alternative I/O facilities
and to extend the standard I/O facilities to cope with special applications.
C++ was designed to enable a user to define new types that are as efficient and convenient to
use as built-in types. It is therefore a reasonable requirement that an I/O facility for C++ should be
provided in C++ using only facilities available to every programmer. The stream I/O facilities presented here are the result of an effort to meet this challenge:
§21.2 Output: What the application programmer thinks of as output is really the conversion of
objects of types, such as int, char*, and Employee_record, into sequences of characters. The facilities for writing built-in and user-defined types to output are described.
§21.3 Input: The facilities for requesting input of characters, strings, and values of other built-in and user-defined types are presented.
§21.4 Formatting: There are often specific requirements for the layout of the output. For
example, ints may have to be printed in decimal and pointers in hexadecimal or
floating-point numbers must appear with exactly specified precision. Formatting controls and the programming techniques used to provide them are discussed.
§21.5 Files and Streams: By default, every C++ program can use standard streams, such as
standard output (cout), standard input (cin), and error output (cerr). To use other
devices or files, streams must be created and attached to those files or devices. The
mechanisms for opening and closing files and for attaching streams to files and strings
are described.
§21.6 Buffering: To make I/O efficient, we must use a buffering strategy that is suitable for
both the data written (read) and the destination it is written to (read from). The basic
techniques for buffering streams are presented.
§21.7 Locale: A locale is an object that specifies how numbers are printed, what characters are
considered letters, etc. It encapsulates many cultural differences. Locales are implicitly
used by the I/O system and are only briefly described here.
§21.8 C I/O: The printf() function from the C <stdio.h> library and the C library’s relation
to the C++ <iostream> library are discussed.
Knowledge of the techniques used to implement the stream library is not needed to use the library.
Also, the techniques used for different implementations will differ. However, implementing I/O is
a challenging task. An implementation contains examples of techniques that can be applied to
many other programming and design tasks. Therefore, the techniques used to implement I/O are
worthy of study.
This chapter discusses the stream I/O system to the point where you should be able to appreciate its structure, to use it for most common kinds of I/O, and to extend it to handle new user-defined types. If you need to implement the standard streams, provide a new kind of stream, or
provide a new locale, you need a copy of the standard, a good systems manual, and/or examples of
working code in addition to what is presented here.
The key components of the stream I/O systems can be represented graphically like this:
[Figure: key components of the stream I/O system]
    ios_base: locale independent format state
    basic_ios<>: locale dependent format state; stream state
    basic_iostream<>: formatting (<<, >>, etc.); setup/cleanup
    basic_streambuf<>: buffering
    locale: format information
    character buffer
    real destination/source
The dotted arrow from basic_iostream<> indicates that basic_ios<> is a virtual base class; the
solid arrows represent pointers. The classes marked with <> are templates parameterized by a
character type and containing a locale.
The streams concept and the general notation it provides can be applied to a large class of communication problems. Streams have been used for transmitting objects between machines
(§25.4.1), for encrypting message streams (§21.10[22]), for data compression, for persistent storage
of objects, and much more. However, the discussion here is restricted to simple character-oriented
input and output.
Declarations of stream I/O classes and templates (sufficient to refer to them but not to apply
operations to them) and standard typedefs are presented in <iosfwd>. This header is occasionally
needed when you want to include some but not all of the I/O headers.
21.2 Output [io.out]
Type-safe and uniform treatment of both built-in and user-defined types can be achieved by using a
single overloaded function name for a set of output functions. For example:
    put(cerr,"x = ");    // cerr is the error output stream
    put(cerr,x);
    put(cerr,'\n');
The type of the argument determines which put function will be invoked for each argument. This
solution is used in several languages. However, it is repetitive. Overloading the operator << to
mean ‘‘put to’’ gives a better notation and lets the programmer output a sequence of objects in a
single statement. For example:
    cerr << "x = " << x << '\n';
If x is an int with the value 123, this statement would print
    x = 123
followed by a newline onto the standard error output stream, cerr. Similarly, if x is of type complex (§22.5) with the value (1,2.4), the statement will print
    x = (1,2.4)
on cerr. This style can be used as long as x is of a type for which operator << is defined, and a user
can trivially define operator << for a new type.
An output operator is needed to avoid the verbosity that would have resulted from using an output function. But why <<? It is not possible to invent a new lexical token (§11.2). The assignment operator was a candidate for both input and output, but most people seemed to prefer to use
different operators for input and output. Furthermore, = binds the wrong way; that is, cout=a=b
means cout=(a=b) rather than (cout=a)=b (§6.2). I tried the operators < and >, but the meanings ‘‘less than’’ and ‘‘greater than’’ were so firmly implanted in people’s minds that the new I/O
statements were for all practical purposes unreadable.
The operators << and >> are not used frequently enough for built-in types to cause that problem. They are symmetric in a way that can be used to suggest ‘‘to’’ and ‘‘from.’’ When they are
used for I/O, I refer to << as put to and to >> as get from. People who prefer more technical-sounding names call them inserters and extractors, respectively. The precedence of << is low
enough to allow arithmetic expressions as operands without using parentheses. For example:
    cout << "a*b+c=" << a*b+c << '\n';
Parentheses must be used to write expressions containing operators with precedence lower than
<<’s. For example:
    cout << "a^b|c=" << (a^b|c) << '\n';
The left shift operator (§6.2.4) can be used in an output statement, but of course it, too, must appear
within parentheses:
    cout << "a<<b=" << (a<<b) << '\n';
21.2.1 Output Streams [io.ostream]
An ostream is a mechanism for converting values of various types into sequences of characters.
Usually, these characters are then output using lower-level output operations. There are many
kinds of characters (§20.2) that can be characterized by char_traits (§20.2.1). Consequently, an
ostream is a specialization for a particular kind of character of a general basic_ostream template:
    template <class Ch, class Tr = char_traits<Ch> >
    class std::basic_ostream : virtual public basic_ios<Ch,Tr> {
    public:
        virtual ~basic_ostream();
        // ...
    };
This template and its associated output operations are defined in namespace std and presented by
<ostream>, which contains the output-related parts of <iostream>.
The basic_ostream template parameters control the type of characters that is used by the implementation; they do not affect the types of values that can be output. Streams implemented using
ordinary chars and streams implemented using wide characters are directly supported by every
implementation:
    typedef basic_ostream<char> ostream;
    typedef basic_ostream<wchar_t> wostream;
On many systems, it is possible to optimize writing of wide characters through wostream to an
extent that is hard to match for streams using bytes as the unit of output.
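As a hedged illustration of the point that the template parameters select the character type and not the set of values that can be output, the following sketch writes the same int values to a narrow and to a wide string stream (§21.5). The names print_sum, narrow_sum, and wide_sum are invented for this example; they are not part of the library.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes "a+b=sum" to a stream of any character type.
template<class Ch, class Tr>
std::basic_ostream<Ch,Tr>& print_sum(std::basic_ostream<Ch,Tr>& os, int a, int b)
{
    // widen() converts a char to the stream's own character type
    return os << a << os.widen('+') << b << os.widen('=') << a+b;
}

std::string narrow_sum(int a, int b)
{
    std::ostringstream os;     // a stream of char
    print_sum(os,a,b);
    return os.str();
}

std::wstring wide_sum(int a, int b)
{
    std::wostringstream os;    // a stream of wchar_t
    print_sum(os,a,b);
    return os.str();
}
```

The same << for int is used in both cases; only the characters produced differ in type.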
It is possible to define streams for which the physical I/O is not done in terms of characters.
However, such streams are beyond the scope of the C++ standard and beyond the scope of this book
(§21.10[15]).
The basic_ios base class is presented in <ios>. It controls formatting (§21.4), locale (§21.7),
and access to buffers (§21.6). It also defines a few types for notational convenience:
    template <class Ch, class Tr = char_traits<Ch> >
    class std::basic_ios : public ios_base {
    public:
        typedef Ch char_type;
        typedef Tr traits_type;
        typedef typename Tr::int_type int_type;    // type of integer value of character
        typedef typename Tr::pos_type pos_type;    // position in buffer
        typedef typename Tr::off_type off_type;    // offset in buffer
        // ... see also §21.3.3, §21.3.7, §21.4.4, §21.6.3, and §21.7.1 ...
    };
The ios_base base class contains information and operations that are independent of the character
type used, such as the precision used for floating-point output. It therefore doesn’t need to be a
template.
In addition to the typedefs in ios_base, the stream I/O library uses a signed integral type
streamsize to represent the number of characters transferred in an I/O operation and the size of I/O
buffers. Similarly, a typedef called streamoff is supplied for expressing offsets in streams and
buffers.
Several standard streams are declared in <iostream>:
    ostream cout;      // standard output stream of char
    ostream cerr;      // standard unbuffered output stream for error messages
    ostream clog;      // standard output stream for error messages

    wostream wcout;    // wide stream corresponding to cout
    wostream wcerr;    // wide stream corresponding to cerr
    wostream wclog;    // wide stream corresponding to clog
The cerr and clog streams refer to the same output destination; they simply differ in the buffering
they provide. The cout writes to the same destination as C’s stdout (§21.8), while cerr and clog
write to the same destination as C’s stderr. The programmer can create more streams as needed
(see §21.5).
21.2.2 Output of Built-In Types [io.out.builtin]
The class ostream is defined with the operator << (‘‘put to’’) to handle output of the built-in types:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_ostream : virtual public basic_ios<Ch,Tr> {
    public:
        // ...
        basic_ostream& operator<<(short n);
        basic_ostream& operator<<(int n);
        basic_ostream& operator<<(long n);
        basic_ostream& operator<<(unsigned short n);
        basic_ostream& operator<<(unsigned int n);
        basic_ostream& operator<<(unsigned long n);
        basic_ostream& operator<<(float f);
        basic_ostream& operator<<(double f);
        basic_ostream& operator<<(long double f);
        basic_ostream& operator<<(bool n);
        basic_ostream& operator<<(const void* p);    // write pointer value
        basic_ostream& put(Ch c);    // write c
        basic_ostream& write(const Ch* p, streamsize n);    // p[0]..p[n-1]
        // ...
    };
An operator<<() returns a reference to the ostream for which it was called so that another operator<<() can be applied to it. For example,
    cerr << "x = " << x;
where x is an int, will be interpreted as:
    (cerr.operator<<("x = ")).operator<<(x);
In particular, this implies that when several items are printed by a single output statement, they will
be printed in the expected order: left to right. For example:
    void val(char c)
    {
        cout << "int('" << c << "') = " << int(c) << '\n';
    }
    int main()
    {
        val('A');
        val('Z');
    }
On an implementation using ASCII characters, this will print:
    int('A') = 65
    int('Z') = 90
Note that a character literal has type char (§4.3.1) so that cout<<'Z' will print the letter Z and not
the integer value 90.
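The left-to-right chaining and the difference between writing a char and writing its integer value can be checked without a terminal by writing into a string stream (§21.5). The following is only a sketch; val_string is a name made up for the example.

```cpp
#include <sstream>
#include <string>

// The same output statement as in val(), but written to a string so that
// the result can be inspected.
std::string val_string(char c)
{
    std::ostringstream os;
    os << "int('" << c << "') = " << int(c);   // c is written as a character, int(c) as a number
    return os.str();
}
```

On an implementation using ASCII characters, val_string('A') yields the text int('A') = 65.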
A bool value will be output as 0 or 1 by default. If you don’t like that, you can set the formatting flag boolalpha from <ios> (§21.4.6.2) and get true or false. For example:
    int main()
    {
        cout << true << ' ' << false << '\n';
        cout << boolalpha;    // use symbolic representation for true and false
        cout << true << ' ' << false << '\n';
    }
This prints:
    1 0
    true false
More precisely, boolalpha ensures that we get a locale-dependent representation of bool values.
By setting my locale (§21.7) just right, I can get:
    1 0
    sandt falsk
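A minimal sketch of the same behavior, written to a string stream so that the result can be examined. It assumes the classic "C" locale, which is imbued explicitly so that the symbolic names are true and false; show_bools is a hypothetical name.

```cpp
#include <ios>
#include <locale>
#include <sstream>
#include <string>

std::string show_bools()
{
    std::ostringstream os;
    os.imbue(std::locale::classic());                       // assumption: "C" locale names
    os << true << ' ' << false << ' ';                      // default: numeric form
    os << std::boolalpha << true << ' ' << false << ' ';    // symbolic form
    os << std::noboolalpha << true;                         // the flag stays set until cleared
    return os.str();
}
```

The flag is part of the stream's format state, so it affects every later bool written to that stream until it is reset.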
Formatting floating-point numbers, the base used for integers, etc., are discussed in §21.4.
The function ostream::operator<<(const void*) prints a pointer value in a form appropriate
to the architecture of the machine used. For example,
    int main()
    {
        int i = 0;
        int* p = new int;
        cout << "local " << &i << ", free store " << p << '\n';
    }
printed
    local 0x7fffead0, free store 0x500c
on my machine. Other systems have different conventions for printing pointer values.
The put() and write() functions simply write characters. Consequently, the << for outputting
characters need not be a member. The operator<<() functions that take a character operand can
be implemented as nonmembers using put():
    template<class Ch, class Tr>
        basic_ostream<Ch,Tr>& operator<<(basic_ostream<Ch,Tr>&, Ch);
    template<class Ch, class Tr>
        basic_ostream<Ch,Tr>& operator<<(basic_ostream<Ch,Tr>&, char);
    template<class Tr>
        basic_ostream<char,Tr>& operator<<(basic_ostream<char,Tr>&, char);
    template<class Tr>
        basic_ostream<char,Tr>& operator<<(basic_ostream<char,Tr>&, signed char);
    template<class Tr>
        basic_ostream<char,Tr>& operator<<(basic_ostream<char,Tr>&, unsigned char);
Similarly, << is provided for writing out zero-terminated character arrays:
    template<class Ch, class Tr>
        basic_ostream<Ch,Tr>& operator<<(basic_ostream<Ch,Tr>&, const Ch*);
    template<class Ch, class Tr>
        basic_ostream<Ch,Tr>& operator<<(basic_ostream<Ch,Tr>&, const char*);
    template<class Tr>
        basic_ostream<char,Tr>& operator<<(basic_ostream<char,Tr>&, const char*);
    template<class Tr>
        basic_ostream<char,Tr>& operator<<(basic_ostream<char,Tr>&, const signed char*);
    template<class Tr>
        basic_ostream<char,Tr>& operator<<(basic_ostream<char,Tr>&, const unsigned char*);
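The difference between the unformatted put() and write() functions and the << for zero-terminated arrays can be sketched as follows. The function name raw_output and the sample data are assumptions made for the example.

```cpp
#include <sstream>
#include <string>

std::string raw_output()
{
    std::ostringstream os;
    os.put('x');                               // write one character
    const char data[] = { 'a', '\0', 'b' };    // three characters with an embedded zero
    os.write(data, 3);                         // writes data[0]..data[2], zero included
    os << data;                                // stops at the first zero: writes only 'a'
    return os.str();
}
```

The result holds five characters: x, a, a zero, b, and a. A count given to write() is taken literally, whereas << on a character pointer relies on the terminating zero.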
21.2.3 Output of User-Defined Types [io.out.udt]
Consider a user-defined type complex (§11.3):
    class complex {
    public:
        double real() const { return re; }
        double imag() const { return im; }
        // ...
    };
Operator << can be defined for the new type complex like this:
    ostream& operator<<(ostream& s, complex z)
    {
        return s << '(' << z.real() << ',' << z.imag() << ')';
    }
This << can then be used exactly like << for a built-in type. For example,
    int main()
    {
        complex x(1,2);
        cout << "x = " << x << '\n';
    }
produces
    x = (1,2)
Defining an output operation for a user-defined type does not require modification of the declaration of class ostream. This is fortunate because ostream is defined in <iostream>, which users
cannot and should not modify. Not allowing additions to ostream also provides protection against
accidental corruption of that data structure and makes it possible to change the implementation of
an ostream without affecting user programs.
21.2.3.1 Virtual Output Functions [io.virtual]
The ostream members are not virtual. The output operations that a programmer can add are not
members, so they cannot be virtual either. One reason for this is to achieve close to optimal performance for simple operations such as putting a character into a buffer. This is a place where run-time efficiency is crucial and where inlining is a must. Virtual functions are used to achieve flexibility for the operations dealing with buffer overflow and underflow only (§21.6.4).
However, a programmer sometimes wants to output an object for which only a base class is
known. Since the exact type isn’t known, correct output cannot be achieved simply by defining a
<< for each new type. Instead, a virtual output function can be provided in the abstract base:
    class My_base {
    public:
        // ...
        virtual ostream& put(ostream& s) const = 0;    // write *this to s
    };
    ostream& operator<<(ostream& s, const My_base& r)
    {
        return r.put(s);    // use the right put()
    }
That is, put() is a virtual function that ensures that the right output operation is used in <<.
Given that, we can write:
    class Sometype : public My_base {
    public:
        // ...
        ostream& put(ostream& s) const;    // the real output function: override My_base::put()
    };
    void f(const My_base& r, Sometype& s)
    {
        cout << r << s;    // use << which calls the right put()
    }
This integrates the virtual put() into the framework provided by ostream and <<. The technique
is generally useful to provide operations that act like virtual functions, but with the run-time selection based on their second argument.
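A complete, if small, instance of the technique might look like the following. The class names Shape, Circle, and Square and the function print_both are hypothetical and serve only to make the sketch self-contained.

```cpp
#include <ostream>
#include <sstream>
#include <string>

class Shape {    // hypothetical abstract base
public:
    virtual ~Shape() {}
    virtual std::ostream& put(std::ostream& s) const = 0;    // write *this to s
};

std::ostream& operator<<(std::ostream& s, const Shape& r)
{
    return r.put(s);    // selected by the dynamic type of r
}

class Circle : public Shape {
public:
    explicit Circle(int r) : radius(r) {}
    std::ostream& put(std::ostream& s) const { return s << "Circle(" << radius << ')'; }
private:
    int radius;
};

class Square : public Shape {
public:
    explicit Square(int s) : side(s) {}
    std::ostream& put(std::ostream& s) const { return s << "Square(" << side << ')'; }
private:
    int side;
};

// Only the base class is known here, yet each object writes itself correctly.
std::string print_both(const Shape& a, const Shape& b)
{
    std::ostringstream os;
    os << a << ' ' << b;
    return os.str();
}
```

A single nonmember << serves every class derived from the base; a new derived class needs only its own put().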
21.3 Input [io.in]
Input is handled similarly to output. There is a class istream that provides an input operator >>
(‘‘get from’’) for a small set of standard types. An operator>>() can then be defined for a user-defined type.
21.3.1 Input Streams [io.istream]
In parallel to basic_ostream (§21.2.1), basic_istream is defined in <istream>, which contains the
input-related parts of <iostream>, like this:
    template <class Ch, class Tr = char_traits<Ch> >
    class std::basic_istream : virtual public basic_ios<Ch,Tr> {
    public:
        virtual ~basic_istream();
        // ...
    };
The base class basic_ios is described in §21.2.1.
Two standard input streams cin and wcin are provided in <iostream>:
    typedef basic_istream<char> istream;
    typedef basic_istream<wchar_t> wistream;
    istream cin;      // standard input stream of char
    wistream wcin;    // standard input stream of wchar_t
The cin stream reads from the same source as C’s stdin (§21.8).
21.3.2 Input of Built-In Types [io.in.builtin]
An istream provides operator >> for the built-in types:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_istream : virtual public basic_ios<Ch,Tr> {
    public:
        // ...
        // formatted input:
        basic_istream& operator>>(short& n);    // read into n
        basic_istream& operator>>(int& n);
        basic_istream& operator>>(long& n);
        basic_istream& operator>>(unsigned short& u);    // read into u
        basic_istream& operator>>(unsigned int& u);
        basic_istream& operator>>(unsigned long& u);
        basic_istream& operator>>(float& f);    // read into f
        basic_istream& operator>>(double& f);
        basic_istream& operator>>(long double& f);
        basic_istream& operator>>(bool& b);    // read into b
        basic_istream& operator>>(void*& p);    // read pointer value into p
        // ...
    };
The operator>>() input functions are defined in this style:
    istream& istream::operator>>(T& tvar)    // T is a type for which istream::operator>> is declared
    {
        // skip whitespace, then somehow read a T into ‘tvar’
        return *this;
    }
Because >> skips whitespace, you can read a sequence of whitespace-separated integers like this:
    int read_ints(vector<int>& v)    // fill v, return number of ints read
    {
        int i = 0;
        while (i<v.size() && cin>>v[i]) i++;
        return i;
    }
A non-int on the input will cause the input operation to fail and thus terminate the input loop. For
example, the input:
    1 2 3 4 5.6 7 8.
will have read_ints() read in the five integers
    1 2 3 4 5
and leave the dot as the next character to be read from input. Whitespace is defined as the standard
C whitespace (blank, tab, newline, formfeed, and carriage return) by a call to isspace() as defined
in <cctype> (§20.4.2).
The most common mistake when using istreams is to fail to notice that input didn’t happen as
expected because the input wasn’t of the expected format. One should either check the state of an
input stream (§21.3.3) before relying on values supposedly read in or use exceptions (§21.3.6).
The format expected for input is specified by the current locale (§21.7). By default, the bool
values true and false are represented by 1 and 0, respectively. Integers must be decimal and
floating-point numbers of the form used to write them in a C++ program. By setting basefield
(§21.4.2), it is possible to read 0123 as an octal number with the decimal value 83 and 0xff as a
hexadecimal number with the decimal value 255. The format used to read pointers is completely
implementation-dependent (have a look to see what your implementation does).
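One way to obtain the prefix-sensitive reading of integers with a standard-conforming library is to clear all of the base flags in the basefield group, so that a leading 0 or 0x selects the base. The sketch below assumes that interpretation; read_any_base is a made-up name.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Reads one integer from text, letting a 0 or 0x prefix choose the base.
int read_any_base(const std::string& text)
{
    std::istringstream is(text);
    is.unsetf(std::ios_base::basefield);    // no base flag set: the prefix decides
    int n = 0;
    is >> n;
    return n;
}
```

With the default setting (dec), the same inputs would be read as decimal digits only: 0123 would give 123, and 0xff would give 0 with xff left unread.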
Surprisingly, there is no member >> for reading a character. The reason is simply that >> for
characters can be implemented using the get() character input operations (§21.3.4), so it doesn’t
need to be a member. From a stream, we can read a character into the stream’s character type. If
that character type is char, we can also read into a signed char and unsigned char:
    template<class Ch, class Tr>
        basic_istream<Ch,Tr>& operator>>(basic_istream<Ch,Tr>&, Ch&);
    template<class Tr>
        basic_istream<char,Tr>& operator>>(basic_istream<char,Tr>&, unsigned char&);
    template<class Tr>
        basic_istream<char,Tr>& operator>>(basic_istream<char,Tr>&, signed char&);
From a user’s point of view, it does not matter whether a >> is a member.
Like the other >> operators, these functions first skip whitespace. For example:
    void f()
    {
        char c;
        cin >> c;
        // ...
    }
This places the first non-whitespace character from cin into c.
In addition, we can read into an array of characters:
    template<class Ch, class Tr>
        basic_istream<Ch,Tr>& operator>>(basic_istream<Ch,Tr>&, Ch*);
    template<class Tr>
        basic_istream<char,Tr>& operator>>(basic_istream<char,Tr>&, unsigned char*);
    template<class Tr>
        basic_istream<char,Tr>& operator>>(basic_istream<char,Tr>&, signed char*);
These operations first skip whitespace. Then they read into their array operand until they encounter
a whitespace character or end-of-file. Finally, they terminate the string with a 0. Clearly, this
offers ample opportunity for overflow, so reading into a string (§20.3.15) is usually better. However, you can specify a maximum for the number of characters to be read by >>: is.width(n)
specifies that the next >> on is will read at most n-1 characters into an array. For example:
    void g()
    {
        char v[4];
        cin.width(4);
        cin >> v;
        cout << "v = " << v << endl;
    }
This will read at most three characters into v and add a terminating 0.
Setting width() for an istream affects only the immediately following >> into an array and
does not affect reading into other types of variables.
21.3.3 Stream State [io.state]
Every stream (istream or ostream) has a state associated with it. Errors and nonstandard conditions are handled by setting and testing this state appropriately.
The stream state is found in basic_istream’s base basic_ios from <ios>:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_ios : public ios_base {
    public:
        // ...
        bool good() const;    // next operation might succeed
        bool eof() const;     // end of input seen
        bool fail() const;    // next operation will fail
        bool bad() const;     // stream is corrupted
        iostate rdstate() const;    // get io state flags
        void clear(iostate f = goodbit);    // set io state flags
        void setstate(iostate f) { clear(rdstate()|f); }    // add f to io state flags
        operator void*() const;    // nonzero if !fail()
        bool operator!() const { return fail(); }
        // ...
    };
If the state is good() the previous input operation succeeded. If the state is good(), the next input
operation might succeed; otherwise, it will fail. Applying an input operation to a stream that is not
in the good() state is a null operation. If we try to read into a variable v and the operation fails,
the value of v should be unchanged (it is unchanged if v is a variable of one of the types handled by
istream or ostream member functions). The difference between the states fail() and bad() is
subtle. When the state is fail() but not also bad(), it is assumed that the stream is uncorrupted
and that no characters have been lost. When the state is bad(), all bets are off.
The state of a stream is represented as a set of flags. Like most constants used to express the
behavior of streams, these flags are defined in basic_ios’ base ios_base:
    class ios_base {
    public:
        // ...
        typedef implementation_defined2 iostate;
        static const iostate badbit,     // stream is corrupted
                             eofbit,     // end-of-file seen
                             failbit,    // next operation will fail
                             goodbit;    // goodbit==0
        // ...
    };
The I/O state flags can be directly manipulated. For example:
    void f()
    {
        ios_base::iostate s = cin.rdstate();    // returns a set of iostate bits
        if (s & ios_base::badbit) {
            // cin characters possibly lost
        }
        // ...
        cin.setstate(ios_base::failbit);
        // ...
    }
When a stream is used as a condition, the state of the stream is tested by operator void*() or
operator!(). The test succeeds if the state is not fail(), that is, if neither failbit nor badbit is
set. For example, a general copy function can be written like this:
    template<class T> void iocopy(istream& is, ostream& os)
    {
        T buf;
        while (is>>buf) os << buf << '\n';
    }
The is>>buf returns a reference to is, which is tested by a call of is::operator void*(). For
example:
    void f(istream& i1, istream& i2, istream& i3, istream& i4)
    {
        iocopy<complex>(i1,cout);   // copy complex numbers
        iocopy<double>(i2,cout);    // copy doubles
        iocopy<char>(i3,cout);      // copy chars
        iocopy<string>(i4,cout);    // copy whitespace-separated words
    }
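The copy function can be exercised without real files by using string streams. This is a sketch under the assumption of a current standard library; copy_text() is a hypothetical wrapper added only for the demonstration. In C++11 and later the conversion used by the loop condition is an explicit operator bool() rather than operator void*(), but the loop is written the same way.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// The copy function from the text, with std:: qualification.
template<class T>
void iocopy(std::istream& is, std::ostream& os)
{
    T buf;
    while (is >> buf) os << buf << '\n';   // stops at the first failed read
}

// Hypothetical wrapper: copy the values of type T found in text.
template<class T>
std::string copy_text(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream out;
    iocopy<T>(in, out);
    return out.str();
}
```

For example, copy_text<double>("1.5 2 3") yields the three numbers one per line, and copy_text<std::string>("to be") yields the two words one per line.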
21.3.4 Input of Characters [io.in.unformatted]
The >> operator is intended for formatted input; that is, reading objects of an expected type and
format. Where this is not desirable and we want to read characters as characters and then examine
them, we use the get() functions:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_istream : virtual public basic_ios<Ch,Tr> {
    public:
        // ...
        // unformatted input:
        streamsize gcount() const;                      // number of char read by last get()
        int_type get();                                 // read one Ch (or Tr::eof())
        basic_istream& get(Ch& c);                      // read one Ch into c
        basic_istream& get(Ch* p, streamsize n);        // newline is terminator
        basic_istream& get(Ch* p, streamsize n, Ch term);
        basic_istream& getline(Ch* p, streamsize n);    // newline is terminator
        basic_istream& getline(Ch* p, streamsize n, Ch term);
        basic_istream& ignore(streamsize n = 1, int_type t = Tr::eof());
        basic_istream& read(Ch* p, streamsize n);       // read at most n char
        // ...
    };
The get() and getline() functions treat whitespace characters exactly like other characters. They
are intended for input operations where one doesn’t make assumptions about the meanings of the
characters read.
The function istream::get(char&) reads a single character into its argument. For example, a
character-by-character copy program can be written like this:
    int main()
    {
        char c;
        while(cin.get(c)) cout.put(c);
    }
The three-argument s.get(p,n,term) reads at most n-1 characters into p[0]..p[n-2]. A
call of get() will always place a 0 at the end of the characters (if any) it placed in the buffer, so p
must point to an array of at least n characters. The third argument, term, specifies a terminator. A
typical use of the three-argument get() is to read a ‘‘line’’ into a fixed-sized buffer for further
analysis. For example:
    void f()
    {
        char buf[100];
        cin >> buf;                 // suspect: will overflow some day
        cin.get(buf,100,'\n');      // safe
        // ...
    }
If the terminator is found, it is left as the first unread character on the stream. Never call get()
twice without removing the terminator. For example:
    void subtle_infinite_loop()
    {
        char buf[256];
        while (cin) {
            cin.get(buf,256);   // read a line
            cout << buf;        // print a line. Oops: forgot to remove '\n' from cin
        }
    }
This example is a good reason to prefer getline() over get(). A getline() behaves like its
corresponding get(), except that it removes its terminator from the istream. For example:
    void f()
    {
        char word[MAX][100];    // MAX lines of at most 99 characters each
        int i = 0;
        while(cin.getline(word[i++],100,'\n') && i<MAX);
        // ...
    }
When efficiency isn’t paramount, it is better to read into a string (§3.6, §20.3.15). In that way, the
most common allocation and overflow problems cannot occur. However, the get(), getline(),
and read() functions are needed to implement such higher-level facilities. The relatively messy
interface is the price we pay for speed, for not having to re-scan the input to figure out what
terminated the input operation, for being able to reliably limit the number of characters read, etc.
A call read(p,n) reads at most n characters into p[0]..p[n-1]. The read function does not
rely on a terminator, and it doesn’t put a terminating 0 into its target. Consequently, it really can
read n characters (rather than just n-1). In other words, it simply reads characters and doesn’t try
to make its target into a C-style string.
The ignore() function reads characters like read(), but it doesn’t store them anywhere. Like
read(), it really can read n characters (rather than n-1). The default number of characters read by
ignore() is 1, so a call of ignore() without an argument means ‘‘throw the next character away.’’
Like getline(), it optionally takes a terminator and removes that terminator from the input stream
if it gets to it. Note that ignore()’s default terminator is end-of-file.
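The differences among get(), getline(), and read() are easy to observe on a string stream. The sketch below is illustrative only: the function names are hypothetical, and it assumes every input line is shorter than the 16-character buffers.

```cpp
#include <ios>
#include <sstream>
#include <string>

// get() leaves the terminator in the stream, so it must be removed by hand.
std::string second_line_with_get(const std::string& text)
{
    std::istringstream in(text);
    char buf[16];
    in.get(buf, 16);        // reads up to, but not including, '\n'
    in.ignore();            // throw the '\n' away
    in.get(buf, 16);        // now the second line can be read
    return buf;
}

// getline() removes the terminator itself.
std::string second_line_with_getline(const std::string& text)
{
    std::istringstream in(text);
    char buf[16];
    in.getline(buf, 16);    // reads the first line and removes '\n'
    in.getline(buf, 16);
    return buf;
}

// read() copies raw characters: no terminator, no trailing 0.
std::streamsize raw_read(const std::string& text, char* p, std::streamsize n)
{
    std::istringstream in(text);
    in.read(p, n);
    return in.gcount();     // number of characters actually read
}
```

Given "ab\ncd\n", both second_line functions return "cd"; raw_read("ab\ncd", p, 4) stores the four characters 'a', 'b', '\n', 'c' and returns 4.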
For all of these functions, it is not immediately obvious what terminated the read – and it can
be hard even to remember which function has what termination criterion. However, we can always
inquire whether we reached end-of-file (§21.3.3). Also, gcount() gives the number of characters
read from the stream by the most recent unformatted input function call. For example:
    void read_a_line(int max)
    {
        // ...
        if (cin.fail()) {           // Oops: bad input format
            cin.clear();            // clear the input flags (§21.3.3)
            cin.ignore(max,';');    // skip to semicolon
            if (cin.eof()) {        // ignore() sets only eofbit at end-of-file, so test eof() rather than !cin
                // oops: we reached the end of the stream
            }
            else if (cin.gcount()==max) {
                // oops: read max characters
            }
            else {
                // found and discarded the semicolon
            }
        }
    }
Unfortunately, if the maximum number of characters is read, there is no way of knowing whether
the terminator was found (as the last character).
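The three outcomes of a bounded ignore() can be told apart as sketched below. The enumeration Skip_result and the functions skip_to() and skip_in() are hypothetical names introduced for illustration; end-of-file is tested with eof() because ignore() sets eofbit without setting failbit.

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <string>

enum Skip_result { found_terminator, hit_limit, hit_eof };   // hypothetical

// Skip at most max characters, stopping after the first term.
Skip_result skip_to(std::istream& is, std::streamsize max, char term)
{
    is.ignore(max, term);
    if (is.eof()) return hit_eof;               // ran out of input first
    if (is.gcount() == max) return hit_limit;   // terminator may or may not be the last character skipped
    return found_terminator;                    // terminator found and discarded
}

// Hypothetical convenience wrapper working on a string, skipping to ';'.
Skip_result skip_in(const std::string& text, std::streamsize max)
{
    std::istringstream in(text);
    return skip_to(in, max, ';');
}
```

For instance, skip_in("abc;def", 10) finds the semicolon, skip_in("abcdef", 3) stops at the limit, and skip_in("abc", 10) reaches the end of the input.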
The get() that doesn’t take an argument is the <iostream> version of the <cstdio> getchar()
(§21.8). It simply reads a character and returns the character’s numeric value. In that way, it
avoids making assumptions about the character type used. If there is no input character to return,
get() returns a suitable ‘‘end-of-file’’ marker (that is, the stream’s traits_type::eof()) and sets
the istream into eof-state (§21.3.3). For example:
    void f(unsigned char* p)
    {
        int i;
        while((i = cin.get()) && i!=EOF) {
            *p++ = i;
            // ...
        }
    }
EOF is the value of eof() from the usual char_traits for char. EOF is presented in <iostream>.
Thus, this loop could have been written read(p,MAX_INT), but presumably we wrote an explicit
loop because we wanted to look at each character as it came in. It has been said that C’s greatest
strength is its ability to read a character and decide to do nothing with it – and to do this fast. It is
indeed an important and underrated strength, and one that C++ aims to preserve.
The standard header <cctype> defines several functions that can be useful when processing
input (§20.4.2). For example, an eatwhite() function that reads whitespace characters from a
stream could be defined like this:
    istream& eatwhite(istream& is)
    {
        char c;
        while (is.get(c)) {
            if (!isspace(c)) {      // is c a whitespace character?
                is.putback(c);      // put c back into the input buffer
                break;
            }
        }
        return is;
    }
The call is.putback(c) makes c be the next character read from the stream is (§21.6.4).
21.3.5 Input of User-Defined Types [io.in.udt]
An input operation can be defined for a user-defined type exactly as an output operation was.
However, for an input operation, it is essential that the second argument be of a non-const reference
type. For example:
    istream& operator>>(istream& s, complex& a)
    /*
        input formats for a complex ("f" indicates a floating-point number):
            f
            (f)
            (f,f)
    */
    {
        double re = 0, im = 0;
        char c = 0;
        s >> c;
        if (c == '(') {
            s >> re >> c;
            if (c == ',') s >> im >> c;
            if (c != ')') s.clear(ios_base::badbit);    // set state
        }
        else {
            s.putback(c);
            s >> re;
        }
        if (s) a = complex(re,im);
        return s;
    }
Despite the scarcity of error-handling code, this will actually handle most kinds of errors. The local
variable c is initialized to avoid having its value accidentally be '(' after a failed first >>
operation. The final check of the stream state ensures that the value of the argument a is changed
only if everything went well.
The operation for setting a stream state is called clear() because its most common use is to
reset the state of a stream to good(); ios_base::goodbit is the default argument value for
basic_ios::clear() (§21.3.3).
21.3.6 Exceptions [io.except]
It is not convenient to test for errors after each I/O operation, so a common cause of error is failing
to do so where it matters. In particular, output operations are typically unchecked, but they do
occasionally fail.
The only function that directly changes the state of a stream is clear(). Thus, an obvious way
of getting notified of a state change is to ask clear() to throw an exception. The basic_ios
member exceptions() does just that:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_ios : public ios_base {
    public:
        // ...
        class failure;                      // exception class (see §14.10)
        iostate exceptions() const;         // get exception state
        void exceptions(iostate except);    // set exception state
        // ...
    };
For example,
    cout.exceptions(ios_base::badbit|ios_base::failbit|ios_base::eofbit);
requests that clear() should throw an ios_base::failure exception if cout goes into states bad,
fail, or eof – in other words, if any output operation on cout doesn’t perform flawlessly. Similarly,
    cin.exceptions(ios_base::badbit|ios_base::failbit);
allows us to catch the not-too-uncommon case in which the input is not in the format we expected,
so an input operation didn’t return a value from the stream.
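The effect of an exception mask can be observed on a string stream. This is a minimal sketch; throws_on_bad_int() is a hypothetical name, and std::istringstream stands in for cin.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Returns true if reading an int from text made the stream throw.
bool throws_on_bad_int(const std::string& text)
{
    std::istringstream in(text);
    in.exceptions(std::ios_base::badbit | std::ios_base::failbit);
    try {
        int n = 0;
        in >> n;            // a format error sets failbit, which now throws
        return false;
    }
    catch (const std::ios_base::failure&) {
        return true;        // the input was not in the expected format
    }
}
```

Reading "seven" throws; reading "7" does not, even though the read reaches the end of the input, because eofbit is not part of the requested mask.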
A call of exceptions() with no arguments returns the set of I/O state flags that triggers an
exception. For example:
    void print_exceptions(const ios& str)   // exceptions() is a member of basic_ios, not of ios_base
    {
        ios_base::iostate s = str.exceptions();
        if (s&ios_base::badbit) cout << "throws for bad";
        if (s&ios_base::failbit) cout << "throws for fail";
        if (s&ios_base::eofbit) cout << "throws for eof";
        if (s == 0) cout << "doesn't throw";
    }
The primary use of I/O exceptions is to catch unlikely – and therefore often forgotten – errors.
Another is to control I/O. For example:
    void readints(vector<int>& s)   // not my favorite style!
    {
        ios_base::iostate old_state = cin.exceptions();     // save exception state
        cin.exceptions(ios_base::eofbit);                   // throw for eof
        for (;;)
            try {
                int i;
                cin>>i;
                s.push_back(i);
            }
            catch(ios_base::failure) {
                // ok: end of file reached
                break;
            }
        cin.exceptions(old_state);                          // reset exception state
    }
The question to ask about this use of exceptions is, ‘‘Is that an error?’’ or ‘‘Is that really exceptional?’’ (§14.5). Usually, I find that the answer to either question is no. Consequently, I prefer to
deal with the stream state directly. What can be handled with local control structures within a function is rarely improved by the use of exceptions.
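Dealing with the stream state directly might look like the sketch below. It is one possible formulation, not code from the text; the stream parameter and the helper ints_in() are assumptions made so that the function can be tried on a string.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read ints until the input ends or something that is not an int is met.
// Returns true if the whole input was consumed.
bool readints(std::istream& is, std::vector<int>& v)
{
    int i;
    while (is >> i) v.push_back(i);   // plain loop: no exceptions involved
    return is.eof();                  // false: stopped at a format error
}

// Hypothetical helper: collect the leading ints of a string.
std::vector<int> ints_in(const std::string& text)
{
    std::istringstream in(text);
    std::vector<int> v;
    readints(in, v);
    return v;
}
```

The caller learns from the return value whether reading stopped at end-of-file or at bad input, and the exception state of the stream is never touched.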
21.3.7 Tying of Streams [io.tie]
The basic_ios function tie() is used to set up and break connections between an istream and an
ostream:
    template <class Ch, class Tr = char_traits<Ch> >
    class std::basic_ios : public ios_base {
        // ...
        basic_ostream<Ch,Tr>* tie() const;                  // get pointer to tied stream
        basic_ostream<Ch,Tr>* tie(basic_ostream<Ch,Tr>* s); // tie *this to s
        // ...
    };
Consider:
    string get_passwd()
    {
        string s;
        cout << "Password: ";
        cin >> s;
        // ...
    }
How can we be sure that Password: appears on the screen before the read operation is executed?
The output on cout is buffered, so if cin and cout had been independent, Password: would not
have appeared on the screen until the output buffer was full. The answer is that cout is tied to cin
by the operation cin.tie(&cout).
When an ostream is tied to an istream, the ostream is flushed whenever an input operation on
the istream causes underflow; that is, whenever new characters are needed from the ultimate input
source to complete the input operation. Thus,
    cout << "Password: ";
    cin >> s;
is equivalent to:
    cout << "Password: ";
    cout.flush();
    cin >> s;
A stream can have at most one ostream at a time tied to it. A call s.tie(0) unties the stream s
from the stream it was tied to, if any. Like most other stream functions that set a value, tie(s)
returns the previous value; that is, it returns the previously tied stream or 0. A call without an
argument, tie(), returns the current value without changing it.
Of the standard streams, cout is tied to cin and wcout is tied to wcin. The cerr streams need
not be tied because they are unbuffered, while the clog streams are not meant for user interaction.
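Because tie(s) returns the previous value, a tie can be set up temporarily and then restored. The following is a sketch; prompt() is a hypothetical function, and in the usage shown string streams stand in for cin and cout.

```cpp
#include <iostream>
#include <sstream>   // only needed to try prompt() on string streams
#include <string>

// Write text to out and read a word from in, with out tied to in
// for the duration of the call.
std::string prompt(std::istream& in, std::ostream& out, const std::string& text)
{
    std::ostream* old = in.tie(&out);   // tie out to in; remember the previous tie
    out << text;
    std::string reply;
    in >> reply;                        // out is flushed first if in needs more characters
    in.tie(old);                        // restore the previous tie (possibly 0)
    return reply;
}
```

A freshly constructed stream is not tied to anything, so after calling prompt() on a pair of string streams the input stream's tie() is again 0.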
21.3.8 Sentries [io.sentry]
When I wrote operators << and >> for complex, I did not worry about tied streams (§21.3.7) or
whether changing stream state would cause exceptions (§21.3.6). I assumed – correctly – that the
library-provided functions would take care of that for me. But how? There are a couple of dozen
such functions. If we had to write intricate code to handle tied streams, locales (§21.7), exceptions,
etc., in each, then the code could get rather messy.
The approach taken is to provide the common code through a sentry class. Code that needs to
be executed first (the ‘‘prefix code’’) – such as flushing a tied stream – is provided as the sentry’s
constructor. Code that needs to be executed last (the ‘‘suffix code’’) – such as throwing exceptions
caused by state changes – is provided as the sentry’s destructor:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_ostream : virtual public basic_ios<Ch,Tr> {
        // ...
        class sentry;
        // ...
    };
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_ostream<Ch,Tr>::sentry {
    public:
        explicit sentry(basic_ostream<Ch,Tr>& s);
        ~sentry();
        operator bool();
        // ...
    };
Thus, common code is factored out and an individual function can be written like this:
    template <class Ch, class Tr = char_traits<Ch> >
    basic_ostream<Ch,Tr>& basic_ostream<Ch,Tr>::operator<<(int i)
    {
        sentry s(*this);
        if (!s) {   // check whether all is well for output to start
            setstate(failbit);
            return *this;
        }
        // output the int
        return *this;
    }
This technique of using constructors and destructors to provide common prefix and suffix code
through a class is useful in many contexts.
Naturally, basic_istream has a similar sentry member class.
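A user-defined output operation that writes directly to the stream buffer can use the same sentry. The sketch below is an illustration under assumptions: the type Tag and the helper to_text() are hypothetical, and writing through rdbuf() is chosen here only to show why the sentry is needed.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

struct Tag { std::string name; };    // hypothetical user-defined type

std::ostream& operator<<(std::ostream& os, const Tag& t)
{
    std::ostream::sentry guard(os);  // prefix code: check state, flush tied stream
    if (guard) {
        const std::string text = "<" + t.name + ">";
        const std::streamsize n = static_cast<std::streamsize>(text.size());
        if (os.rdbuf()->sputn(text.data(), n) != n)
            os.setstate(std::ios_base::badbit);   // the buffer refused characters
    }
    return os;                       // suffix code runs in guard's destructor
}

// Hypothetical helper: the textual form of a Tag.
std::string to_text(const Tag& t)
{
    std::ostringstream out;
    out << t;
    return out.str();
}
```

If the stream is not in the good() state when the operator is called, the sentry converts to false and nothing is written.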
21.4 Formatting [io.format]
The examples in §21.2 were all of what is commonly called unformatted output. That is, an object
was turned into a sequence of characters according to default rules. Often, the programmer needs
more detailed control. For example, we need to be able to control the amount of space used for an
output operation and the format used for output of numbers. Similarly, some aspects of input can
be explicitly controlled.
Control of I/O formatting resides in class basic_ios and its base ios_base. For example, class
basic_ios holds the information about the base (octal, decimal, or hexadecimal) to be used when
integers are written or read, the precision of floating-point numbers written or read, etc. It also
holds the functions to set and examine these per-stream control variables.
Class basic_ios is a base of basic_istream and basic_ostream, so format control is on a
per-stream basis.
21.4.1 Format State [io.format.state]
Formatting of I/O is controlled by a set of flags and integer values in the stream’s ios_base:
    class ios_base {
    public:
        // ...
        // names of format flags:
        typedef implementation_defined1 fmtflags;
        static const fmtflags
            skipws,         // skip whitespace on input
            left,           // field adjustment: pad after value
            right,          // pad before value
            internal,       // pad between sign and value
            boolalpha,      // use symbolic representation of true and false
            dec,            // integer base: base 10 output (decimal)
            hex,            // base 16 output (hexadecimal)
            oct,            // base 8 output (octal)
            scientific,     // floating-point notation: d.ddddddEdd
            fixed,          // dddd.dd
            showbase,       // on output prefix oct by 0 and hex by 0x
            showpoint,      // print trailing zeros
            showpos,        // explicit '+' for positive ints
            uppercase,      // 'E', 'X' rather than 'e', 'x'
            adjustfield,    // flags related to field adjustment (§21.4.5)
            basefield,      // flags related to integer base (§21.4.2)
            floatfield;     // flags related to floating-point output (§21.4.3)
        fmtflags unitbuf;   // flush output after each output operation
        fmtflags flags() const;         // read flags
        fmtflags flags(fmtflags f);     // set flags
        fmtflags setf(fmtflags f) { return flags(flags()|f); }      // add flag
        // clear the flags in mask, then add the flags of f that are in mask:
        fmtflags setf(fmtflags f, fmtflags mask) { return flags((flags()&~mask)|(f&mask)); }
        void unsetf(fmtflags mask) { flags(flags()&~mask); }        // clear flags
        // ...
    };
The values of the flags are implementation-defined. Use the symbolic names exclusively, rather
than specific numeric values, even if those values happen to be correct on your implementation
today.
Defining an interface as a set of flags, and providing operations for setting and clearing those
flags is a time-honored if somewhat old-fashioned technique. Its main virtue is that a user can
compose a set of options. For example:
    const ios_base::fmtflags my_opt = ios_base::left|ios_base::oct|ios_base::fixed;
This allows us to pass options around and install them where needed. For example:
    void your_function(ios_base::fmtflags opt)
    {
        ios_base::fmtflags old_options = cout.flags(opt);   // save old_options and set new ones
        // ...
        cout.flags(old_options);                            // reset options
    }
    void my_function()
    {
        your_function(my_opt);
        // ...
    }
The flags() function returns the old option set.
Being able to read and set all options allows us to set an individual flag. For example:
    myostream.flags(myostream.flags()|ios_base::showpos);
This makes myostream display an explicit + in front of positive numbers without affecting other
options. The old options are read, and showpos is set by or-ing it into the set. The function setf()
does exactly that, so the example could equivalently have been written:
    myostream.setf(ios_base::showpos);
Once set, a flag retains its value until it is unset.
Controlling I/O options by explicitly setting and clearing flags is crude and error-prone. For
simple cases, manipulators (§21.4.6) provide a cleaner interface. Using flags to control stream state
is a better study in implementation technique than in interface design.
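Saving, replacing, and restoring a complete option set can be tried on a string stream. This is a sketch only; print_with() is a hypothetical function.

```cpp
#include <ios>
#include <ostream>
#include <sstream>   // only needed to try print_with() on a string stream

// Print value using exactly the options in opt, then put back
// whatever options os had before the call.
void print_with(std::ostream& os, std::ios_base::fmtflags opt, int value)
{
    const std::ios_base::fmtflags old_options = os.flags(opt);   // save old, set new
    os << value;
    os.flags(old_options);                                       // reset options
}
```

Calling print_with(out, std::ios_base::hex|std::ios_base::showbase, 255) writes 0xff, and a following out << 255 is again written in decimal because the old options were restored.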
21.4.1.1 Copying Format State [io.copyfmt]
The complete format state of a stream can be copied by copyfmt():
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_ios : public ios_base {
    public:
        // ...
        basic_ios& copyfmt(const basic_ios& f);
        // ...
    };
The stream’s buffer (§21.6) and the state of that buffer aren’t copied by copyfmt(). However, all of
the rest of the state is, including the requested exceptions (§21.3.6) and any user-supplied additions
to that state (§21.7.1).
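As an illustration, one stream can serve as a model for the format of another. The sketch below uses a hypothetical helper, format_like(), and string streams.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Format value the way model would, without writing to model.
std::string format_like(const std::ios& model, double value)
{
    std::ostringstream out;
    out.copyfmt(model);     // copies flags, precision, fill, etc., but not the buffer
    out << value;
    return out.str();
}
```

If model has been set to fixed notation with precision 2, format_like(model, 3.14159) returns "3.14", and whatever was already written to model stays in model's own buffer.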
21.4.2 Integer Output [io.out.int]
The technique of or-ing in a new option with flags() or setf() works only when a single bit
controls a feature. This is not the case for options such as the base used for printing integers and the
style of floating-point output. For such options, the value that specifies a style is not necessarily
represented by a single bit or as a set of independent single bits.
The solution adopted in <iostream> is to provide a version of setf() that takes a second
‘‘pseudo argument’’ that indicates which kind of option we want to set in addition to the new value.
For example,
    cout.setf(ios_base::oct,ios_base::basefield);   // octal
    cout.setf(ios_base::dec,ios_base::basefield);   // decimal
    cout.setf(ios_base::hex,ios_base::basefield);   // hexadecimal
sets the base of integers without side effects on other parts of the stream state. Once set, a base is
used until reset. For example,
    cout << 1234 << ' ' << 1234 << ' ';             // default: decimal
    cout.setf(ios_base::oct,ios_base::basefield);   // octal
    cout << 1234 << ' ' << 1234 << ' ';
    cout.setf(ios_base::hex,ios_base::basefield);   // hexadecimal
    cout << 1234 << ' ' << 1234 << ' ';
produces 1234 1234 2322 2322 4d2 4d2.
If we need to be able to tell which base was used for each number, we can set showbase. Thus,
adding
    cout.setf(ios_base::showbase);
before the previous operations, we get 1234 1234 02322 02322 0x4d2 0x4d2. The standard
manipulators (§21.4.6.2) provide a more elegant way of specifying the base of integer output.
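The base settings can be checked on a string stream. The function in_three_bases() below is a hypothetical helper written for this sketch.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Write value in decimal, octal, and hexadecimal, optionally with base prefixes.
std::string in_three_bases(int value, bool show_base)
{
    std::ostringstream out;
    if (show_base) out.setf(std::ios_base::showbase);
    out << value << ' ';                                        // default: decimal
    out.setf(std::ios_base::oct, std::ios_base::basefield);     // octal
    out << value << ' ';
    out.setf(std::ios_base::hex, std::ios_base::basefield);     // hexadecimal
    out << value;
    return out.str();
}
```

For 1234 this gives 1234 2322 4d2 without showbase and 1234 02322 0x4d2 with it.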
21.4.3 Floating-Point Output [io.out.float]
Floating-point output is controlled by a format and a precision:
– The general format lets the implementation choose a format that presents a value in the style
that best preserves the value in the space available. The precision specifies the maximum
number of digits. It corresponds to printf()’s %g (§21.8).
– The scientific format presents a value with one digit before a decimal point and an exponent.
The precision specifies the maximum number of digits after the decimal point. It corresponds
to printf()’s %e.
– The fixed format presents a value as an integer part followed by a decimal point and a
fractional part. The precision specifies the maximum number of digits after the decimal point.
It corresponds to printf()’s %f.
We control the floating-point output format through the state manipulation functions. In particular,
we can set the notation used for printing floating-point values without side effects on other parts of
the stream state. For example,
    cout << "default:\t" << 1234.56789 << '\n';
    cout.setf(ios_base::scientific,ios_base::floatfield);   // use scientific format
    cout << "scientific:\t" << 1234.56789 << '\n';
    cout.setf(ios_base::fixed,ios_base::floatfield);        // use fixed-point format
    cout << "fixed:\t" << 1234.56789 << '\n';
    cout.setf(ios_base::fmtflags(0),ios_base::floatfield);  // reset to default (that is, general format)
    cout << "default:\t" << 1234.56789 << '\n';
produces
    default:        1234.57
    scientific:     1.234568e+03
    fixed:          1234.567890
    default:        1234.57
The default precision (for all formats) is 6. The precision is controlled by an ios_base member
function:
    class ios_base {
    public:
        // ...
        streamsize precision() const;           // get precision
        streamsize precision(streamsize n);     // set precision (and get old precision)
        // ...
    };
A call of precision() affects all floating-point I/O operations for a stream up until the next call of
precision(). Thus,
    cout.precision(8);
    cout << 1234.56789 << ' ' << 1234.56789 << ' ' << 123456 << '\n';
    cout.precision(4);
    cout << 1234.56789 << ' ' << 1234.56789 << ' ' << 123456 << '\n';
produces
    1234.5679 1234.5679 123456
    1235 1235 123456
Note that floating-point values are rounded rather than just truncated and that precision() doesn’t
affect integer output.
The uppercase flag (§21.4.1) determines whether e or E is used to indicate the exponents in the
scientific format.
Manipulators provide a more elegant way of specifying output format for floating-point output
(§21.4.6.2).
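Notation and precision can be combined in one small helper and checked against the outputs shown above. The function formatted() is hypothetical; passing an empty flag set as the notation selects the general format.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Format d using the given notation (scientific, fixed, or an empty
// flag set for the general format) and precision.
std::string formatted(double d, std::ios_base::fmtflags notation, std::streamsize precision)
{
    std::ostringstream out;
    out.setf(notation, std::ios_base::floatfield);  // clears floatfield, then sets notation
    out.precision(precision);
    out << d;
    return out.str();
}
```

For example, formatted(1234.56789, std::ios_base::scientific, 6) gives 1.234568e+03, and formatted(1234.56789, std::ios_base::fmtflags(), 4) gives 1235.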
21.4.4 Output Fields [io.fields]
Often, we want to fill a specific space on an output line with text. We want to use exactly n characters and not fewer (and more only if the text does not fit). To do this, we specify a field width and a
character to be used if padding is needed:
    class ios_base {
    public:
        // ...
        streamsize width() const;           // get field width
        streamsize width(streamsize wide);  // set field width
        // ...
    };
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_ios : public ios_base {
    public:
        // ...
        Ch fill() const;    // get filler character
        Ch fill(Ch ch);     // set filler character
        // ...
    };
The width() function specifies the minimum number of characters to be used for the next standard
library << output operation of a numeric value, bool, C-style string, character, pointer (§21.2.1),
string (§20.3.15), and bitset (§17.5.3.3). For example,
    cout.width(4);
    cout << 12;
will print 12 preceded by two spaces.
The ‘‘padding’’ or ‘‘filler’’ character can be specified by the fill() function. For example,
    cout.width(4);
    cout.fill('#');
    cout << "ab";
gives the output ##ab.
The default fill character is the space character and the default field size is 0, meaning ‘‘as many
characters as needed.’’ The field size can be reset to its default value like this:
    cout.width(0);  // ‘‘as many characters as needed’’
A call width(n) sets the minimum number of characters to n. If more characters are provided,
they will all be printed. For example,
    cout.width(4);
    cout << "abcdef";
produces abcdef rather than just abcd. It is usually better to get the right output looking ugly than
to get the wrong output looking just fine (see also §21.10[21]).
A width(n) call affects only the immediately following << output operation:
    cout.width(4);
    cout.fill('#');
    cout << 12 << ':' << 13;
This produces ##12:13, rather than ##12###:##13, as would have been the case had width(4)
applied to subsequent operations. Had all subsequent output operations been affected by width(),
we would have had to explicitly specify width() for essentially all values.
The standard manipulators (§21.4.6.2) provide a more elegant way of specifying the width of an
output field.
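That width() applies to a single output operation can be confirmed with a string stream. Both functions below are hypothetical helpers for this sketch: the first sets the width once, the second sets it before every operation.

```cpp
#include <sstream>
#include <string>

// width(4) is set once, so only the first value is padded.
std::string padded_once(int a, int b)
{
    std::ostringstream out;
    out.fill('#');
    out.width(4);
    out << a << ':' << b;
    return out.str();
}

// width(4) is set before each operation, so every field is padded.
std::string padded_each(int a, int b)
{
    std::ostringstream out;
    out.fill('#');
    out.width(4); out << a;
    out.width(4); out << ':';
    out.width(4); out << b;
    return out.str();
}
```

With the arguments 12 and 13, the first function produces ##12:13 and the second produces ##12###:##13.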
21.4.5 Field Adjustment [io.field.adjust]
The adjustment of characters within a field can be controlled by setf() calls:
    cout.setf(ios_base::left,ios_base::adjustfield);        // left
    cout.setf(ios_base::right,ios_base::adjustfield);       // right
    cout.setf(ios_base::internal,ios_base::adjustfield);    // internal
This sets the adjustment of output within an output field defined by ios_base::width() without
side effects on other parts of the stream state.
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
Adjustment can be specified like this:
    cout.fill('#');
    cout << '(';
    cout.width(4);
    cout << -12 << "),(";
    cout.width(4);
    cout.setf(ios_base::left,ios_base::adjustfield);
    cout << -12 << "),(";
    cout.width(4);
    cout.setf(ios_base::internal,ios_base::adjustfield);
    cout << -12 << ")";
This produces: (#-12),(-12#),(-#12). Internal adjustment places fill characters between the
sign and the value. As shown, right adjustment is the default.
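The three adjustments can also be compared one at a time. The following is a minimal sketch; the helper name adjust_demo is an assumption of this example, and an ostringstream stands in for cout so that each result is returned as a string.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Formats -12 in a field of width 4 filled with '#', using the
// adjustment flag passed in (left, right, or internal).
std::string adjust_demo(std::ios_base::fmtflags adjust)
{
    std::ostringstream os;
    os.fill('#');
    os.setf(adjust, std::ios_base::adjustfield);
    os << '(';
    os.width(4);             // applies to the next << only
    os << -12 << ')';
    return os.str();
}
```

With std::ios_base::right the result is (#-12), with left it is (-12#), and with internal it is (-#12).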
21.4.6 Manipulators [io.manipulators]
To save the programmer from having to deal with the state of a stream in terms of flags, the standard library provides a set of functions for manipulating that state. The key idea is to insert an
operation that modifies the state in between the objects being read or written. For example, we can
explicitly request that an output buffer be flushed:
    cout << x << flush << y << flush;
Here, cout.flush() is called at the appropriate times. This is done by a version of << that takes a
pointer to function argument and invokes it:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_ostream : virtual public basic_ios<Ch,Tr> {
    public:
        // ...
        basic_ostream& operator<<(basic_ostream& (*f)(basic_ostream&)) { return f(*this); }
        basic_ostream& operator<<(ios_base& (*f)(ios_base&));
        basic_ostream& operator<<(basic_ios<Ch,Tr>& (*f)(basic_ios<Ch,Tr>&));
        // ...
    };
For this to work, a function must be a nonmember or static-member function with the right type. In
particular, flush() is defined like this:
    template <class Ch, class Tr>
    basic_ostream<Ch,Tr>& flush(basic_ostream<Ch,Tr>& s)
    {
        return s.flush(); // call ostream's member flush()
    }
These declarations ensure that
    cout << flush;
is resolved as
    cout.operator<<(flush);
which calls
    flush(cout);
which then invokes
    cout.flush();
The whole rigmarole is done (at compile time) to allow basic_ostream::flush() to be called
using the cout<<flush notation.
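The same mechanism works for a manipulator written by a user. As a minimal sketch, the hypothetical manipulator comma below (not part of the standard library) is an ordinary nonmember function of the right type, so the << that takes a pointer to function picks it up.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical manipulator: writes a comma followed by a space.
std::ostream& comma(std::ostream& os)
{
    return os << ", ";
}

// Uses the manipulator between values; each "<< comma" is resolved as
// os.operator<<(comma), which in turn calls comma(os).
std::string comma_demo()
{
    std::ostringstream os;
    os << 1 << comma << 2 << comma << 3;
    return os.str();
}
```

Here comma_demo() returns the string 1, 2, 3.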
There is a wide variety of operations we might like to perform just before or just after an input
or output operation. For example:
    cout << x;
    cout.flush();
    cout << y;
    cin.noskipws(); // don’t skip whitespace
    cin >> x;
When the operations are written as separate statements, the logical connections between the operations are not obvious. Once the logical connection is lost, the code gets harder to understand. The
notion of manipulators allows operations such as flush() and noskipws() to be inserted directly
in the list of input or output operations. For example:
    cout << x << flush << y << flush;
    cin >> noskipws >> x;
Naturally, class basic_istream provides >> operators for invoking manipulators in a way similar to
class basic_ostream:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_istream : virtual public basic_ios<Ch,Tr> {
    public:
        // ...
        basic_istream& operator>>(basic_istream& (*pf)(basic_istream&));
        basic_istream& operator>>(basic_ios<Ch,Tr>& (*pf)(basic_ios<Ch,Tr>&));
        basic_istream& operator>>(ios_base& (*pf)(ios_base&));
        // ...
    };
21.4.6.1 Manipulators Taking Arguments [io.manip.arg]
Manipulators that take arguments can also be useful. For example, we might want to write
    cout << setprecision(4) << angle;
to print the value of the floating-point variable angle with four digits.
To do this, setprecision must return an object that is initialized by 4 and that calls
cout.precision(4) when invoked. Such a manipulator is a function object that is invoked by
<< rather than by (). The exact type of that function object is implementation-defined, but it
might be defined like this:
    struct smanip {
        ios_base& (*f)(ios_base&,int); // function to be called
        int i;
        smanip(ios_base& (*ff)(ios_base&,int), int ii) : f(ff), i(ii) { }
    };
    template<class Ch, class Tr>
    basic_ostream<Ch,Tr>& operator<<(basic_ostream<Ch,Tr>& os, const smanip& m)
    {
        m.f(os,m.i);
        return os;
    }
The smanip constructor stores its arguments in f and i, and operator<< calls f(os,i). We can now
define setprecision() like this:
    ios_base& set_precision(ios_base& s, int n) // helper
    {
        s.precision(n); // call the member function
        return s;
    }
    inline smanip setprecision(int n)
    {
        return smanip(set_precision,n); // make the function object
    }
We can now write:
    cout << setprecision(4) << angle;
A programmer can define new manipulators in the style of smanip as needed (§21.10[22]). Doing
this does not require modification of the definitions of standard library templates and classes such
as basic_istream, basic_ostream, basic_ios, and ios_base.
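A self-contained variant of the idea can be sketched as follows. The names Prec_manip and my_precision are hypothetical and chosen to avoid clashing with the standard setprecision; the sketch uses one dedicated function-object type rather than a general smanip.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical function object: remembers the requested precision.
struct Prec_manip {
    int n;
    explicit Prec_manip(int nn) : n(nn) { }
};

// "Invoking" the object with << applies the precision to the stream.
std::ostream& operator<<(std::ostream& os, const Prec_manip& m)
{
    os.precision(m.n);       // ios_base::precision(), the member function
    return os;
}

// The manipulator itself only builds the function object.
inline Prec_manip my_precision(int n) { return Prec_manip(n); }

std::string precision_demo()
{
    std::ostringstream os;
    os << my_precision(4) << 3.14159;   // general format, 4 digits
    return os.str();
}
```

Here precision_demo() returns 3.142. Nothing in the stream classes had to be changed to add the manipulator.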
21.4.6.2 Standard I/O Manipulators [io.std.manipulators]
The standard library provides manipulators corresponding to the various format states and state
changes. The standard manipulators are defined in namespace std. Manipulators taking ios_base,
istream, and ostream arguments are presented in <ios>, <istream>, and <ostream>, respectively.
The rest of the standard manipulators are presented in <iomanip>.
    ios_base& boolalpha(ios_base&);      // symbolic representation of true and false (input and output)
    ios_base& noboolalpha(ios_base& s);  // s.unsetf(ios_base::boolalpha)

    ios_base& showbase(ios_base&);       // on output prefix oct by 0 and hex by 0x
    ios_base& noshowbase(ios_base& s);   // s.unsetf(ios_base::showbase)

    ios_base& showpoint(ios_base&);
    ios_base& noshowpoint(ios_base& s);  // s.unsetf(ios_base::showpoint)

    ios_base& showpos(ios_base&);
    ios_base& noshowpos(ios_base& s);    // s.unsetf(ios_base::showpos)

    ios_base& skipws(ios_base&);         // skip whitespace
    ios_base& noskipws(ios_base& s);     // s.unsetf(ios_base::skipws)

    ios_base& uppercase(ios_base&);      // X and E rather than x and e
    ios_base& nouppercase(ios_base&);    // x and e rather than X and E

    ios_base& internal(ios_base&);       // adjust §21.4.5
    ios_base& left(ios_base&);           // pad after value
    ios_base& right(ios_base&);          // pad before value

    ios_base& dec(ios_base&);            // integer base is 10 (§21.4.2)
    ios_base& hex(ios_base&);            // integer base is 16
    ios_base& oct(ios_base&);            // integer base is 8

    ios_base& fixed(ios_base&);          // floating-point format dddd.dd (§21.4.3)
    ios_base& scientific(ios_base&);     // scientific format d.ddddEdd

    template <class Ch, class Tr>
    basic_ostream<Ch,Tr>& endl(basic_ostream<Ch,Tr>&);  // put '\n' and flush
    template <class Ch, class Tr>
    basic_ostream<Ch,Tr>& ends(basic_ostream<Ch,Tr>&);  // put '\0'
    template <class Ch, class Tr>
    basic_ostream<Ch,Tr>& flush(basic_ostream<Ch,Tr>&); // flush stream

    template <class Ch, class Tr>
    basic_istream<Ch,Tr>& ws(basic_istream<Ch,Tr>&);    // eat whitespace

    smanip resetiosflags(ios_base::fmtflags f);  // clear flags (§21.4)
    smanip setiosflags(ios_base::fmtflags f);    // set flags (§21.4)
    smanip setbase(int b);                       // output integers in base b
    smanip setfill(int c);                       // make c the fill character
    smanip setprecision(int n);                  // n digits after decimal point
    smanip setw(int n);                          // next field is n char
For example,
    cout << 1234 << ',' << hex << 1234 << ',' << oct << 1234 << endl;
produces 1234,4d2,2322 and
    cout << '(' << setw(4) << setfill('#') << 12 << ") (" << 12 << ")\n";
produces (##12) (12).
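Which manipulators have a lasting effect and which apply to one operation only is easy to get wrong, so a short sketch may help. It assumes an ostringstream in place of cout; the function name sticky_demo is made up for this example.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// setfill() and hex stay in effect until changed; setw() is consumed
// by the next formatted output operation.
std::string sticky_demo()
{
    std::ostringstream os;
    os << std::setfill('#') << std::hex;
    os << std::setw(4) << 26      // padded: ##1a
       << ' ' << 26               // not padded, still hex: 1a
       << ' ' << std::setw(4) << 26;
    return os.str();
}
```

Here sticky_demo() returns ##1a 1a ##1a.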
When using manipulators that do not take arguments, do not add parentheses. When using standard manipulators that take arguments, remember to #include <iomanip>. For example:
    #include <iostream>
    int main()
    {
        std::cout << setprecision(4) // error: setprecision undefined (forgot <iomanip>)
                  << scientific()    // error: ostream<<ostream& (spurious parentheses)
                  << d << endl;
    }
21.4.6.3 User-Defined Manipulators [io.ud.manipulators]
A programmer can add manipulators in the style of the standard ones. Here, I present an additional
style that I have found useful for formatting floating-point numbers.
The precision used persists for all output operations, but a width() operation applies to the
next numeric output operation only. What I want is something that makes it simple to output a
floating-point number in a predefined format without affecting future output operations on the
stream. The basic idea is to define a class that represents formats, another that represents a format
plus a value to be formatted, and then an operator << that outputs the value to an ostream
according to the format. For example:
    Form gen4(4); // general format, precision is 4
    void f(double d)
    {
        Form sci8 = gen4;
        sci8.scientific().precision(8); // scientific format, precision 8
        cout << d << ' ' << gen4(d) << ' ' << sci8(d) << ' ' << d << '\n';
    }
A call f(1234.56789) writes
    1234.57 1235 1.23456789e+03 1234.57
Note how the use of a Form doesn’t affect the state of the stream so that the last output of d has the
same default format as the first.
Here is a simplified implementation:
    class Bound_form; // Form plus value
    class Form {
        friend ostream& operator<<(ostream&, const Bound_form&);
        int prc; // precision
        int wdt; // width, 0 means as wide as necessary
        int fmt; // general, scientific, or fixed (§21.4.3)
        // ...
    public:
        explicit Form(int p = 6) : prc(p) // default precision is 6
        {
            fmt = 0; // general format (§21.4.3)
            wdt = 0; // as wide as necessary
        }
        Bound_form operator()(double d) const; // make a Bound_form for *this and d
        Form& scientific() { fmt = ios_base::scientific; return *this; }
        Form& fixed() { fmt = ios_base::fixed; return *this; }
        Form& general() { fmt = 0; return *this; }
        Form& uppercase();
        Form& lowercase();
        Form& precision(int p) { prc = p; return *this; }
        Form& width(int w) { wdt = w; return *this; } // applies to all types
        Form& fill(char);
        Form& plus(bool b = true);           // explicit plus
        Form& trailing_zeros(bool b = true); // print trailing zeros
        // ...
    };
The idea is that a Form holds all the information needed to format one data item. The default is
chosen to be reasonable for many uses, and the various member functions can be used to reset individual aspects of formatting. The () operator is used to bind a value with the format to be used to
output it. A Bound_form can then be output to a given stream by a suitable << function:
    struct Bound_form {
        const Form& f;
        double val;
        Bound_form(const Form& ff, double v) : f(ff), val(v) { }
    };
    Bound_form Form::operator()(double d) const { return Bound_form(*this,d); }
    ostream& operator<<(ostream& os, const Bound_form& bf)
    {
        ostringstream s; // string streams are described in §21.5.3
        s.precision(bf.f.prc);
        s.setf(bf.f.fmt,ios_base::floatfield);
        s << bf.val; // compose string in s
        return os << s.str(); // output s to os
    }
Writing a less simplistic implementation of << is left as an exercise (§21.10[21]). The Form and
Bound_form classes are easily extended for formatting integers, strings, etc. (see §21.10[20]).
Note that these declarations make the combination of << and () into a ternary operator;
cout<<sci4(d) collects the ostream, the format, and the value into a single function before doing
any real computation.
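Put together, the pieces might look like the following compilable sketch. It is a cut-down version under stated assumptions, not the definitive implementation: only precision, width, and the floating-point format are kept, the format is stored as an ios_base::fmtflags rather than an int, and form_demo is a hypothetical test function that writes to an ostringstream instead of cout.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

class Form;

// A Form plus the value to be formatted.
struct Bound_form {
    const Form& f;
    double val;
    Bound_form(const Form& ff, double v) : f(ff), val(v) { }
};

class Form {
    friend std::ostream& operator<<(std::ostream&, const Bound_form&);
    int prc;                         // precision
    int wdt;                         // width, 0 means as wide as necessary
    std::ios_base::fmtflags fmt;     // empty (general), scientific, or fixed
public:
    explicit Form(int p = 6) : prc(p), wdt(0), fmt(std::ios_base::fmtflags()) { }
    Bound_form operator()(double d) const { return Bound_form(*this, d); }
    Form& scientific() { fmt = std::ios_base::scientific; return *this; }
    Form& fixed() { fmt = std::ios_base::fixed; return *this; }
    Form& general() { fmt = std::ios_base::fmtflags(); return *this; }
    Form& precision(int p) { prc = p; return *this; }
    Form& width(int w) { wdt = w; return *this; }
};

std::ostream& operator<<(std::ostream& os, const Bound_form& bf)
{
    std::ostringstream s;            // format into a private stream
    s.precision(bf.f.prc);
    s.setf(bf.f.fmt, std::ios_base::floatfield);
    s.width(bf.f.wdt);
    s << bf.val;
    return os << s.str();            // the format state of os is not modified
}

// Mirrors the call f(1234.56789) from the text, but returns the output.
std::string form_demo(double d)
{
    Form gen4(4);                    // general format, precision 4
    Form sci8 = gen4;
    sci8.scientific().precision(8);  // scientific format, precision 8
    std::ostringstream os;
    os << d << ' ' << gen4(d) << ' ' << sci8(d) << ' ' << d;
    return os.str();
}
```

The first and last values come out in the stream's default format because all format changes are made on the private ostringstream inside operator<<.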
21.5 File Streams and String Streams [io.files]
When a C++ program starts, cout, cerr, clog, cin, and their wide-character equivalents (§21.2.1) are
available for use. These streams are set up by default and their correspondence with I/O devices or
files is determined by ‘‘the system.’’ In addition, you can create your own streams. In this case,
you must specify to what the streams are attached. Attaching a stream to a file or to a string is
common enough so as to be supported directly by the standard library. Here is the hierarchy of
standard stream classes:
    ios_base
        ios<>
            istream<>        (ios<> is a virtual base)
                istringstream<>
                ifstream<>
            ostream<>        (ios<> is a virtual base)
                ofstream<>
                ostringstream<>
            iostream<>       (derived from both istream<> and ostream<>)
                fstream<>
                stringstream<>
The classes suffixed by <> are templates parameterized on the character type, and their names have
a basic_ prefix. The derivations of istream<> and ostream<> from ios<> use a virtual base class (§15.2.4).
Files and strings are examples of containers that you can both read from and write to. Consequently, you can have a stream that supports both << and >>. Such a stream is called an iostream,
which is defined in namespace std and presented in <iostream>:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_iostream : public basic_istream<Ch,Tr>, public basic_ostream<Ch,Tr> {
    public:
        explicit basic_iostream(basic_streambuf<Ch,Tr>* sb);
        virtual ~basic_iostream();
    };
    typedef basic_iostream<char> iostream;
    typedef basic_iostream<wchar_t> wiostream;
Reading and writing from an iostream is controlled through the put-buffer and get-buffer operations on the iostream’s streambuf (§21.6.4).
21.5.1 File Streams [io.filestream]
Here is a complete program that copies one file to another. The file names are taken as command-line arguments:
    #include <fstream>
    #include <iostream>
    #include <cstdlib>
    void error(const char* p, const char* p2 = "")
    {
        std::cerr << p << ' ' << p2 << '\n';
        std::exit(1);
    }
    int main(int argc, char* argv[])
    {
        if (argc != 3) error("wrong number of arguments");
        std::ifstream from(argv[1]); // open input file stream
        if (!from) error("cannot open input file",argv[1]);
        std::ofstream to(argv[2]); // open output file stream
        if (!to) error("cannot open output file",argv[2]);
        char ch;
        while (from.get(ch)) to.put(ch);
        if (!from.eof() || !to) error("something strange happened");
    }
A file is opened for input by creating an object of class ifstream (input file stream) with the file
name as the argument. Similarly, a file is opened for output by creating an object of class ofstream
(output file stream) with the file name as the argument. In both cases, we test the state of the created object to see if the file was successfully opened.
A basic_ofstream is declared like this in <fstream>:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_ofstream : public basic_ostream<Ch,Tr> {
    public:
        basic_ofstream();
        explicit basic_ofstream(const char* p, openmode m = out);
        basic_filebuf<Ch,Tr>* rdbuf() const;
        bool is_open() const;
        void open(const char* p, openmode m = out);
        void close();
    };
As usual, typedefs are available for the most common types:
    typedef basic_ifstream<char> ifstream;
    typedef basic_ofstream<char> ofstream;
    typedef basic_fstream<char> fstream;
    typedef basic_ifstream<wchar_t> wifstream;
    typedef basic_ofstream<wchar_t> wofstream;
    typedef basic_fstream<wchar_t> wfstream;
An ifstream is like an ofstream, except that it is derived from istream and is by default opened for
reading. In addition, the standard library offers an fstream, which is like an ofstream, except that it
is derived from iostream and by default can be both read from and written to.
File stream constructors take a second argument specifying alternative modes of opening:
    class ios_base {
    public:
        // ...
        typedef implementation_defined3 openmode;
        static openmode app,    // append
                        ate,    // open and seek to end of file (pronounced ‘‘at end’’)
                        binary, // I/O to be done in binary mode (rather than text mode)
                        in,     // open for reading
                        out,    // open for writing
                        trunc;  // truncate file to 0-length
        // ...
    };
The actual values of openmodes and their meanings are implementation-defined. Please consult
your systems and library manual for details – and do experiment. The comments should give some
idea of the intended meaning of the modes. For example, we can open a file so that anything written to it is appended to the end:
    ofstream mystream(name.c_str(),ios_base::app);
It is also possible to open a file for both input and output. For example:
    fstream dictionary("concordance",ios_base::in|ios_base::out);
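The difference between the default output mode and ios_base::app can be seen in a small experiment. This is a sketch under assumptions: the function name append_demo is hypothetical, the caller supplies the name of a scratch file in a writable directory, and the file is removed again at the end.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Writes one line with the default mode, appends a second line with
// ios_base::app, and returns the resulting file contents.
std::string append_demo(const char* name)
{
    {
        std::ofstream first(name);                       // default mode: existing contents are discarded
        first << "one\n";
    }                                                    // file closed when first goes out of scope
    {
        std::ofstream second(name, std::ios_base::app);  // writes go to the end
        second << "two\n";
    }
    std::ifstream in(name);
    std::string contents, line;
    while (std::getline(in, line)) contents += line + '\n';
    in.close();
    std::remove(name);                                   // tidy up the scratch file
    return contents;
}
```

Opening the second stream without ios_base::app would leave only the line two in the file.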
21.5.2 Closing of Streams [io.close]
A file can be explicitly closed by calling close() on its stream:
    void f(ofstream& mystream)
    {
        // ...
        mystream.close();
    }
However, this is implicitly done by the stream’s destructor. So an explicit call of close() is
needed only if the file must be closed before reaching the end of the scope in which its stream was
declared.
This raises the question of how an implementation can ensure that the predefined streams cout,
cin, cerr, and clog are created before their first use and closed (only) after their last use. Naturally,
different implementations of the <iostream> stream library can use different techniques to achieve
this. After all, exactly how it is done is an implementation detail that should not be visible to the
user. Here, I present just one technique that is general enough to be used to ensure proper order of
construction and destruction of global objects of a variety of types. An implementation may be
able to do better by taking advantage of special features of a compiler or linker.
The fundamental idea is to define a helper class that is a counter that keeps track of how many
times <iostream> has been included in a separately compiled source file:
    class ios_base::Init {
        static int count;
    public:
        Init();
        ~Init();
    };
    namespace { // in <iostream>, one copy in each file #including <iostream>
        ios_base::Init __ioinit;
    }
    int ios_base::Init::count = 0; // in some .c file
Each translation unit (§9.1) declares its own object called __ioinit. The constructor for the __ioinit
objects uses ios_base::Init::count as a first-time switch to ensure that actual initialization of the
global objects of the stream I/O library is done exactly once:
    ios_base::Init::Init()
    {
        if (count++ == 0) { /* initialize cout, cerr, cin, etc. */ }
    }
Conversely, the destructor for the __ioinit objects uses ios_base::Init::count as a last-time
switch to ensure that the streams are closed:
    ios_base::Init::~Init()
    {
        if (--count == 0) { /* clean up cout (flush, etc.), cerr, cin, etc. */ }
    }
This is a general technique for dealing with libraries that require initialization and cleanup of global
objects. In a system in which all code resides in main memory during execution, the technique is
almost free. When that is not the case, the overhead of bringing each object file into main memory
to execute its initialization function can be noticeable. When possible, it is better to avoid global
objects. For a class in which each operation performs significant work, it can be reasonable to test
a first-time switch (like ios_base::Init::count) in each operation to ensure initialization. However, that approach would have been prohibitively expensive for streams. The overhead of a first-time switch in the functions that read and write single characters would have been quite noticeable.
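The counting technique is independent of streams and can be tried in isolation. In this minimal sketch, Lib_init is a hypothetical stand-in for ios_base::Init, the two extra counters exist only so that the effect can be observed, and three local objects play the role of the per-translation-unit __ioinit objects.

```cpp
// Hypothetical helper: real setup and cleanup run once, however many
// Lib_init objects exist.
struct Lib_init {
    static int count;      // number of live Lib_init objects
    static int setups;     // how many times real initialization ran
    static int cleanups;   // how many times real cleanup ran
    Lib_init()  { if (count++ == 0) ++setups; }     // first-time switch
    ~Lib_init() { if (--count == 0) ++cleanups; }   // last-time switch
};

int Lib_init::count = 0;
int Lib_init::setups = 0;
int Lib_init::cleanups = 0;

// Creates and destroys three Lib_init objects and reports whether
// setup and cleanup each ran exactly once in the process.
bool counter_demo()
{
    const int s0 = Lib_init::setups;
    const int c0 = Lib_init::cleanups;
    {
        Lib_init a, b, c;
    }
    return Lib_init::setups - s0 == 1 && Lib_init::cleanups - c0 == 1;
}
```

In a real library the objects would be namespace-scope objects defined in the header, one per translation unit, rather than locals.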
21.5.3 String Streams [io.stringstream]
A stream can be attached to a string. That is, we can read from a string and write to a string using
the formatting facilities provided by streams. Such streams are called stringstreams. They are
defined in <sstream>:
    template <class Ch, class Tr=char_traits<Ch> >
    class basic_stringstream : public basic_iostream<Ch,Tr> {
    public:
        explicit basic_stringstream(ios_base::openmode m = out|in);
        explicit basic_stringstream(const basic_string<Ch>& s, openmode m = out|in);
        basic_string<Ch> str() const;        // get copy of string
        void str(const basic_string<Ch>& s); // set value to copy of s
        basic_stringbuf<Ch,Tr>* rdbuf() const;
    };
    typedef basic_istringstream<char> istringstream;
    typedef basic_ostringstream<char> ostringstream;
    typedef basic_stringstream<char> stringstream;
    typedef basic_istringstream<wchar_t> wistringstream;
    typedef basic_ostringstream<wchar_t> wostringstream;
    typedef basic_stringstream<wchar_t> wstringstream;
For example, an ostringstream can be used to format message strings:
    extern const char* std_message[];
    string compose(int n, const string& cs)
    {
        ostringstream ost;
        ost << "error(" << n << ") " << std_message[n] << " (user comment: " << cs << ')';
        return ost.str();
    }
There is no need to check for overflow because oosstt is expanded as needed. This technique can be
most useful for coping with cases in which the formatting required is more complicated than what
is common for a line-oriented output device.
An initial value can be provided for an ostringstream, so we could equivalently have written:
    string compose2(int n, const string& cs)
    {
        ostringstream ost("error(",ios_base::ate); // start writing at end of initial string
        ost << n << ") " << std_message[n] << " (user comment: " << cs << ')';
        return ost.str();
    }
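Where writing starts matters when an initial value is supplied: unless ios_base::ate is given, the put position is at the beginning and new output overwrites the initial characters. The following sketch shows both cases; the function names are made up for this illustration.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Initial value without ate: output starts at position 0 and
// overwrites the beginning of "error(".
std::string without_ate()
{
    std::ostringstream ost("error(");
    ost << 7 << ")";
    return ost.str();
}

// Initial value with ate: output starts after the initial value.
std::string with_ate()
{
    std::ostringstream ost("error(", std::ios_base::ate);
    ost << 7 << ")";
    return ost.str();
}
```

Here without_ate() returns 7)ror( whereas with_ate() returns error(7).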
An istringstream is an input stream reading from a string:
    #include <iostream>
    #include <sstream>
    #include <string>
    using namespace std;
    void word_per_line(const string& s) // prints one word per line
    {
        istringstream ist(s);
        string w;
        while (ist>>w) cout << w << '\n';
    }
    int main()
    {
        word_per_line("If you think C++ is difficult, try English");
    }
The initializer string is copied into the istringstream. The end of the string terminates input.
It is possible to define streams that directly read from and write to arrays of characters
(§21.10[26]). This is often useful when dealing with older code, especially since the ostrstream
and istrstream classes doing that were part of the original streams library.
21.6 Buffering [io.buf]
Conceptually, an output stream puts characters into a buffer. Some time later, the characters are
then written to wherever they are supposed to go. Such a buffer is called a streambuf (§21.6.4). Its
definition is found in <streambuf>. Different types of streambufs implement different buffering
strategies. Typically, the streambuf stores characters in an array until an overflow forces it to write
the characters to their real destination. Thus, an ostream can be represented graphically like this:
[Figure: an ostream and its streambuf; labels: tellp(), begin, current, end, character buffer, real destination.]
The set of template arguments for an ostream and its streambuf must be the same and determines
the type of character used in the character buffer.
An istream is similar, except that the characters flow the other way.
Unbuffered I/O is simply I/O where the streambuf immediately transfers each character, rather
than holding on to characters until enough have been gathered for efficient transfer.
21.6.1 Output Streams and Buffers [io.ostreambuf]
An ostream provides operations for converting values of various types into character sequences
according to conventions (§21.2.1) and explicit formatting directives (§21.4). In addition, an
ostream provides operations that deal directly with its streambuf:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_ostream : virtual public basic_ios<Ch,Tr> {
    public:
        // ...
        explicit basic_ostream(basic_streambuf<Ch,Tr>* b);
        pos_type tellp();                                  // get current position
        basic_ostream& seekp(pos_type);                    // set current position
        basic_ostream& seekp(off_type, ios_base::seekdir); // set current position
        basic_ostream& flush();                            // empty buffer (to real destination)
        basic_ostream& operator<<(basic_streambuf<Ch,Tr>* b); // write from b
    };
An ostream is constructed with a streambuf argument, which determines how the characters written are handled and where they eventually go. For example, an ostringstream (§21.5.3) or an
ofstream (§21.5.1) are created by initializing an ostream with a suitable streambuf (§21.6.4).
The sseeeekkpp() functions are used to position an oossttrreeaam
m for writing. The p suffix indicates that
it is the position used for putting characters into the stream. These functions have no effect unless
the stream is attached to something for which positioning is meaningful, such as a file. The
ppooss__ttyyppee represents a character position in a file, and the ooffff__ttyyppee represents an offset from a point
indicated by an iiooss__bbaassee::sseeeekkddiirr:
    class ios_base {
        // ...
        typedef implementation_defined4 seekdir;
        static const seekdir beg,   // seek from beginning of current file
                             cur,   // seek from current position
                             end;   // seek backwards from end of current file
        // ...
    };
Stream positions start at 0, so we can think of a file as an array of n characters. For example:
    void f(ofstream& fout)
    {
        fout.seekp(10);
        fout << '#';
        fout.seekp(-2,ios_base::cur);
        fout << '*';
    }
This places a # into file[10] and a * in file[9]: writing the # advances the put position to 11, so seeking back by 2 leaves it at 9. There is no similar way to do random access on elements of a plain istream or ostream (see §21.10[13]).
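The positioning arithmetic can be checked without a file. The following is a minimal sketch, assuming a std::ostringstream as a stand-in for the ofstream (string streams also support positioning); the function name mark_positions is hypothetical and not part of the library.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: repeat the seekp() sequence on a 16-character
// string stream instead of a file.
std::string mark_positions()
{
    std::ostringstream out(std::string(16, '.'));   // put position starts at 0
    out.seekp(10);                          // absolute position 10
    out << '#';                             // written at [10]; put position is now 11
    out.seekp(-2, std::ios_base::cur);      // 11-2: put position is now 9
    out << '*';                             // written at [9]
    return out.str();
}
```

Calling mark_positions() should yield a string with '*' at index 9 and '#' at index 10, and every other character left as '.'.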
The fflluusshh() operation allows the user to empty the buffer without waiting for an overflow.
It is possible to use << to write a streambuf directly into an ostream. This is primarily handy
for implementers of I/O mechanisms.
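One place where writing a streambuf with << is occasionally convenient is copying the remainder of one stream into another. This is only a sketch; the name copy_all is hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: copy everything that remains in 'in' into a string
// by writing the input stream's buffer directly to an output stream.
std::string copy_all(std::istream& in)
{
    std::ostringstream out;
    out << in.rdbuf();      // operator<<(basic_streambuf*): transfer until end of input
    return out.str();
}
```

Note that if no characters at all can be transferred, this << sets failbit on the output stream, so a caller that cares about empty input should check the state of the stream it writes to.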
21.6.2 Input Streams and Buffers [io.istreambuf]
An istream provides operations for reading characters and converting them into values of various
types (§21.3.1). In addition, an istream provides operations that deal directly with its streambuf:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_istream : virtual public basic_ios<Ch,Tr> {
    public:
        // ...
        explicit basic_istream(basic_streambuf<Ch,Tr>* b);

        pos_type tellg();                                   // get current position
        basic_istream& seekg(pos_type);                     // set current position
        basic_istream& seekg(off_type, ios_base::seekdir);  // set current position

        basic_istream& putback(Ch c);   // put c back into the buffer
        basic_istream& unget();         // putback most recent char read
        int_type peek();                // look at next character to be read

        int sync();                     // clear buffer (flush input)

        basic_istream& operator>>(basic_streambuf<Ch,Tr>* b);   // read into b
        basic_istream& get(basic_streambuf<Ch,Tr>& b, Ch t = Tr::newline());

        streamsize readsome(Ch* p, streamsize n);   // read at most n char
    };
The positioning functions work like their ostream counterparts (§21.6.1). The g suffix indicates that it is the position used for getting characters from the stream. The p and g suffixes are needed because we can create an iostream derived from both istream and ostream and such a stream needs to keep track of both a get position and a put position.
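As a small illustration of the get-position functions, the following sketch measures a seekable input stream and then restores the position it found. The name stream_length is hypothetical, and the sketch assumes the stream supports positioning (for example, a file or string stream).

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper: return the number of characters in a seekable input
// stream, leaving the get position where it was found.
std::streamoff stream_length(std::istream& in)
{
    const std::istream::pos_type here = in.tellg();     // remember current get position
    in.seekg(0, std::ios_base::end);                     // move to the end
    const std::istream::pos_type last = in.tellg();
    in.seekg(here);                                      // restore the get position
    return last - std::istream::pos_type(0);             // difference of positions is an offset
}
```

For a std::istringstream holding "hello, world", stream_length() returns 12 regardless of how much has already been read, and reading continues from where it left off.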
The putback() function allows a program to put an unwanted character back to be read some other time, as shown in §21.3.5. The unget() function puts the most recently read character back. Unfortunately, backing up an input stream is not always possible. For example, trying to back up past the first character read will set ios_base::failbit. What is guaranteed is that you can back up one character after a successful read. The peek() function reads the next character but leaves it in the streambuf so that it can be read again. Thus, c=peek() is equivalent to (c=get(),unget(),c) and to (putback(c=get()),c). Note that setting failbit might trigger an exception (§21.3.6).
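The two equivalences for peek() can be written out as functions. This is a sketch only; the names peek_via_unget and peek_via_putback are hypothetical, and the end-of-file checks are an addition so that the helpers do not try to back up after a failed read.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: look at the next character using get() and unget().
int peek_via_unget(std::istream& in)
{
    const int c = in.get();
    if (c != std::char_traits<char>::eof()) in.unget();     // back up one character
    return c;
}

// Hypothetical helper: look at the next character using get() and putback().
int peek_via_putback(std::istream& in)
{
    const int c = in.get();
    if (c != std::char_traits<char>::eof())
        in.putback(std::char_traits<char>::to_char_type(c));    // put the same character back
    return c;
}
```

After either call, the character returned is still the next one to be read, exactly as after peek().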
Flushing an istream is done using sync(). This cannot always be done right. For some kinds of streams, we would have to reread characters from the real source, and that is not always possible or desirable. Consequently, sync() returns 0 if it succeeded. If it failed, it sets ios_base::badbit (§21.3.3) and returns -1. Again, setting badbit might trigger an exception (§21.3.6).
The >> and get() operations that target a streambuf are primarily useful for implementers of I/O facilities. Only such implementers should manipulate streambufs directly.
The readsome() function is a low-level operation that allows a user to peek at a stream to see if there are any characters available to read. This can be most useful when it is undesirable to wait for input, say, from a keyboard. See also in_avail() (§21.6.4).
21.6.3 Streams and Buffers [io.rdbuf]
The connection between a stream and its buffer is maintained in the stream’s basic_ios:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_ios : public ios_base {
    public:
        // ...
        basic_streambuf<Ch,Tr>* rdbuf() const;                      // get buffer
        basic_streambuf<Ch,Tr>* rdbuf(basic_streambuf<Ch,Tr>* b);   // set buffer

        locale imbue(const locale& loc);            // set locale (and get old locale)

        char narrow(char_type c, char d) const;     // make char value from char_type c
        char_type widen(char c) const;              // make char_type value from char c
        // ...
    protected:
        basic_ios();
        void init(basic_streambuf<Ch,Tr>* b);       // set initial buffer
    };
In addition to reading and setting the stream’s streambuf (§21.6.4), basic_ios provides imbue() to read and re-set the stream’s locale (§21.7) by calling imbue() on its ios_base (§21.7.1) and pubimbue() on its buffer (§21.6.4).
The narrow() and widen() functions are used to convert chars to and from a buffer’s char_type. The second argument of narrow(c,d) is the char returned if there isn’t a char corresponding to the char_type value c.
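Because rdbuf(b) returns the buffer it replaces, a stream can be temporarily redirected and later restored. The class below is a minimal sketch of that idea; rdbuf_guard is a hypothetical name, not a library facility.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <streambuf>

// Hypothetical RAII helper: make stream 's' use buffer 'b' for the lifetime
// of the guard, then restore the original buffer.
class rdbuf_guard {
public:
    rdbuf_guard(std::ios& s, std::streambuf* b)
        : stream_(s), old_(s.rdbuf(b)) {}       // rdbuf(b) returns the previous buffer
    ~rdbuf_guard() { stream_.rdbuf(old_); }
private:
    rdbuf_guard(const rdbuf_guard&);            // not copyable
    rdbuf_guard& operator=(const rdbuf_guard&);
    std::ios& stream_;
    std::streambuf* old_;
};
```

While a guard constructed as rdbuf_guard g(a, b.rdbuf()) is alive, characters written to stream a end up in the buffer of stream b; once the guard is destroyed, a writes to its own buffer again. The guard does not own either buffer, so both must outlive it.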
21.6.4 Stream Buffers [io.streambuf]
The I/O operations are specified without any mention of file types, but not all devices can be treated identically with respect to buffering strategies. For example, an ostream bound to a string (§21.5.3) needs a different kind of buffer than does an ostream bound to a file (§21.5.1). These problems are handled by providing different buffer types for different streams at the time of initialization. There is only one set of operations on these buffer types, so the ostream functions do not contain code distinguishing them. The different types of buffers are derived from class streambuf. Class streambuf provides virtual functions for operations where buffering strategies differ, such as the functions that handle overflow and underflow.
The basic_streambuf class provides two interfaces. The public interface is aimed primarily at implementers of stream classes such as istream, ostream, fstream, stringstream, etc. In addition, a protected interface is provided for implementers of new buffering strategies and of streambufs for new input sources and output destinations.
To understand a streambuf, it is useful first to consider the underlying model of a buffer area provided by the protected interface. Assume that the streambuf has a put area into which << writes, and a get area from which >> reads. Each area is described by a beginning pointer, current pointer, and one-past-the-end pointer. These pointers are made available through functions:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_streambuf {
    protected:
        Ch* eback() const;      // start of get-buffer
        Ch* gptr() const;       // next filled character (next char read comes from here)
        Ch* egptr() const;      // one-past-end of get-buffer

        void gbump(int n);      // add n to gptr()
        void setg(Ch* begin, Ch* next, Ch* end);    // set eback(), gptr(), and egptr()

        Ch* pbase() const;      // start of put-buffer
        Ch* pptr() const;       // next free char (next char written goes here)
        Ch* epptr() const;      // one-past-end of put-buffer

        void pbump(int n);      // add n to pptr()
        void setp(Ch* begin, Ch* end);  // set pbase() and pptr() to begin, and epptr() to end
        // ...
    };
Given an array of characters, setg() and setp() can set up the pointers appropriately. An implementation might access its get area like this:
    template <class Ch, class Tr>
    typename basic_streambuf<Ch,Tr>::int_type basic_streambuf<Ch,Tr>::snextc()  // read next character
    {
        if (gptr()==0) return uflow();              // no input buffering
        gbump(1);                                   // move to next character
        if (gptr()>=egptr()) return underflow();    // re-fill buffer
        return Tr::to_int_type(*gptr());            // return the now current character
    }
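The get-area pointers are enough to build a simple read-only buffer. The class below is a sketch under that assumption; array_buf is a hypothetical name. It relies on the inherited underflow(), which simply reports end-of-file when the get area is exhausted.

```cpp
#include <istream>
#include <streambuf>

// Hypothetical read-only buffer over an existing character array. The whole
// array is the get area, so the inherited underflow() (which reports
// end-of-file) is reached only after every character has been read.
class array_buf : public std::streambuf {
public:
    array_buf(char* first, char* last)
    {
        setg(first, first, last);   // eback(), gptr(), egptr()
    }
};
```

An istream constructed from the address of an array_buf can then extract values with >> as from any other stream. The buffer does not copy the array, so the array must outlive the buffer.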
The public interface of a streambuf looks like this:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_streambuf {
    public:
        // usual typedefs (§21.2.1)

        basic_streambuf();
        virtual ~basic_streambuf();

        locale pubimbue(const locale &loc);     // set locale (and get old locale)
        locale getloc() const;                  // get locale

        basic_streambuf* pubsetbuf(Ch* p, streamsize n);    // set buffer space

        // position (§21.6.1):
        pos_type pubseekoff(off_type off, ios_base::seekdir way,
                            ios_base::openmode m = ios_base::in|ios_base::out);
        pos_type pubseekpos(pos_type p, ios_base::openmode m = ios_base::in|ios_base::out);

        int pubsync();                          // sync() input (§21.6.2)

        int_type snextc();                      // get next character
        int_type sbumpc();                      // advance gptr() by 1
        int_type sgetc();                       // get current char
        streamsize sgetn(Ch* p, streamsize n);  // get into p[0]..p[n-1]

        int_type sputbackc(Ch c);               // put c back into buffer (§21.6.2)
        int_type sungetc();                     // unget last char

        int_type sputc(Ch c);                           // put c
        streamsize sputn(const Ch* p, streamsize n);    // put p[0]..p[n-1]

        streamsize in_avail();                  // is input ready?
        // ...
    };
The public interface contains functions for inserting characters into the buffer and extracting characters from the buffer. These functions are simple and easily inlined. This is crucial for efficiency.
Functions that implement parts of a specific buffering strategy invoke corresponding functions in the protected interface. For example, pubsetbuf() calls setbuf(), which is overridden by a derived class to implement that class’s notion of getting memory for the buffered characters. Using two functions to implement an operation such as setbuf() allows an iostream implementer to do some “housekeeping” before and after the user’s code. For example, an implementer might wrap a try-block around the call of the virtual function and catch exceptions thrown by the user code. This use of a pair of public and protected functions is yet another general technique that just happens to be useful in the context of I/O.
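The technique of pairing a public function with a protected virtual one can be shown outside the stream library. In this sketch, all names (sink, write, do_write, string_sink) are hypothetical; the public function supplies the try-block and bookkeeping, and the derived class supplies only the policy.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical base class: the public function does the housekeeping,
// the protected virtual function supplies the policy.
class sink {
public:
    sink() : failures_(0) {}
    virtual ~sink() {}

    bool write(const std::string& s)    // public, non-virtual
    {
        try {
            do_write(s);                // call the user-supplied part
            return true;
        }
        catch (...) {
            ++failures_;                // housekeeping after a failure
            return false;
        }
    }
    int failures() const { return failures_; }
protected:
    virtual void do_write(const std::string& s) = 0;
private:
    int failures_;
};

// Hypothetical derived class: stores text, but rejects empty strings.
class string_sink : public sink {
public:
    std::string text;
protected:
    virtual void do_write(const std::string& s)
    {
        if (s.empty()) throw std::invalid_argument("empty write");
        text += s;
    }
};
```

Users call only write(); an exception thrown by do_write() is caught and counted by the base class rather than escaping to the caller.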
By default, setbuf(0,0) means “unbuffered” and setbuf(p,n) means use p[0]..p[n-1] to hold buffered characters.
A call to in_avail() is used to see how many characters are available in the buffer. This can be used to avoid waiting for input. When reading from a stream connected to a keyboard, cin.get(c) might wait until the user comes back from lunch. On some systems and for some applications, it can be worthwhile taking that into account when reading. For example:
    if (cin.rdbuf()->in_avail()) {  // get() will not block
        cin.get(c);
        // do something
    }
    else {                          // get() might block
        // do something else
    }
In addition to the public interface used by basic_istream and basic_ostream, basic_streambuf offers a protected interface to implementers of streambufs. This is where the virtual functions that determine policy are declared:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_streambuf {
    protected:
        // ...
        virtual void imbue(const locale &loc);      // set locale

        virtual basic_streambuf* setbuf(Ch* p, streamsize n);

        virtual pos_type seekoff(off_type off, ios_base::seekdir way,
                                 ios_base::openmode m = ios_base::in|ios_base::out);
        virtual pos_type seekpos(pos_type p,
                                 ios_base::openmode m = ios_base::in|ios_base::out);

        virtual int sync();                         // sync() input (§21.6.2)

        virtual int showmanyc();
        virtual streamsize xsgetn(Ch* p, streamsize n);     // get n chars
        virtual int_type underflow();                       // get area empty
        virtual int_type uflow();

        virtual int_type pbackfail(int_type c = Tr::eof());     // putback failed

        virtual streamsize xsputn(const Ch* p, streamsize n);   // put n chars
        virtual int_type overflow(int_type c = Tr::eof());      // put area full
    };
The underflow() and uflow() functions are called to get the next character from the real input source when the buffer is empty. If no more input is available from that source, the stream is set into eof state (§21.3.3). If doing that doesn’t cause an exception, traits_type::eof() is returned. Unbuffered input uses uflow(); buffered input uses underflow(). Remember that there typically are more buffers in your system than the ones introduced by the iostream library, so you can suffer buffering delays even when using unbuffered stream I/O.
The overflow() function is called to transfer characters to the real output destination when the buffer is full. A call overflow(c) outputs the contents of the buffer plus the character c. If no more output can be written to that target, the stream is put into eof state (§21.3.3). If doing that doesn’t cause an exception, traits_type::eof() is returned.
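A small output buffer shows where overflow() fits. The sketch below is an unbuffered filter: it sets up no put area, so every character written reaches overflow(), which converts it to uppercase and forwards it to another streambuf. The name upper_buf is hypothetical, and forwarding to a second buffer is an assumption made to keep the example short.

```cpp
#include <cctype>
#include <ostream>
#include <sstream>
#include <streambuf>

// Hypothetical unbuffered filter: uppercase each character and pass it on.
class upper_buf : public std::streambuf {
public:
    explicit upper_buf(std::streambuf* dest) : dest_(dest) {}
protected:
    // No put area is set up, so every character written arrives here.
    virtual int_type overflow(int_type c = traits_type::eof())
    {
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);     // nothing to write; report success
        const unsigned char uc =
            static_cast<unsigned char>(traits_type::to_char_type(c));
        return dest_->sputc(static_cast<char>(std::toupper(uc)));   // eof() on failure
    }
    virtual int sync() { return dest_->pubsync(); }     // pass flush requests on
private:
    std::streambuf* dest_;      // not owned
};
```

An ostream constructed from the address of an upper_buf behaves like any other ostream, except that its output arrives at the destination in uppercase. A buffered version would call setp() on an internal array and empty that array in overflow() and sync().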
The showmanyc() function (“show how many characters”) is an odd function intended to allow a user to learn something about the state of a machine’s input system. It returns an estimate of how many characters can be read “soon,” say, by emptying the operating system’s buffers rather than waiting for a disc read. A call to showmanyc() returns -1 if it cannot promise that any character can be read without encountering end-of-file. This is (necessarily) rather low-level and highly implementation-dependent. Don’t use showmanyc() without a careful reading of your system documentation and a few experiments.
By default, every stream gets the global locale (§21.7). A pubimbue(loc) or imbue(loc) call makes a stream use loc as its locale.
A streambuf for a particular kind of stream is derived from basic_streambuf. It provides the constructors and initialization functions that connect the streambuf to a real source of (target for) characters and overrides the virtual functions that determine the buffering strategy. For example:
    template <class Ch, class Tr = char_traits<Ch> >
    class basic_filebuf : public basic_streambuf<Ch,Tr> {
    public:
        basic_filebuf();
        virtual ~basic_filebuf();

        bool is_open() const;
        basic_filebuf* open(const char* p, ios_base::openmode mode);
        basic_filebuf* close();
    protected:
        virtual int showmanyc();
        virtual int_type underflow();
        virtual int_type uflow();
        virtual int_type pbackfail(int_type c = Tr::eof());
        virtual int_type overflow(int_type c = Tr::eof());

        virtual basic_streambuf<Ch,Tr>* setbuf(Ch* p, streamsize n);
        virtual pos_type seekoff(off_type off, ios_base::seekdir way,
                                 ios_base::openmode m = ios_base::in|ios_base::out);
        virtual pos_type seekpos(pos_type p,
                                 ios_base::openmode m = ios_base::in|ios_base::out);
        virtual int sync();
        virtual void imbue(const locale& loc);
    };
The functions for manipulating buffers, etc., are inherited unchanged from basic_streambuf. Only functions that affect initialization and buffering policy need to be separately provided.
As usual, the obvious typedefs and their wide stream counterparts are provided:
    typedef basic_streambuf<char> streambuf;
    typedef basic_stringbuf<char> stringbuf;
    typedef basic_filebuf<char> filebuf;

    typedef basic_streambuf<wchar_t> wstreambuf;
    typedef basic_stringbuf<wchar_t> wstringbuf;
    typedef basic_filebuf<wchar_t> wfilebuf;
21.7 Locale [io.locale]
A locale is an object that controls the classification of characters into letters, digits, etc.; the collation order of strings; and the appearance of numeric values on input and output. Most commonly a locale is used implicitly by the iostreams library to ensure that the usual conventions for some natural language or culture are adhered to. In such cases, a programmer never sees a locale object. However, by changing a stream’s locale, a programmer can change the way the stream behaves to suit a different set of conventions.
A locale is an object of class locale defined in namespace std presented in <locale>:
    class locale {
    public:
        // ...
        locale() throw();                       // copy of current global locale
        explicit locale(const char* name);      // construct locale using C locale name
        basic_string<char> name() const;        // give name of this locale

        locale(const locale&) throw();                      // copy locale
        const locale& operator=(const locale& ) throw();    // copy locale

        static locale global(const locale&);    // set the global locale (get the previous locale)
        static const locale& classic();         // get the locale that C defines
    };
Here, I omitted all of the interesting pieces and left only what is needed to switch from one existing
locale to another. For example:
    void f()
    {
        std::locale loc("POSIX");       // standard locale for POSIX
        cin.imbue(loc);                 // let cin use loc
        // ...
        cin.imbue(std::locale());       // reset cin to use the default (global) locale
    }
The imbue() function is a member of basic_ios (§21.7.1).
As shown, some fairly standard locales have character string names. These tend to be shared with C.
It is possible to set the locale that is used by all newly constructed streams:
    void g(const locale& loc = locale())            // use current global locale by default
    {
        locale old_global = locale::global(loc);    // make loc the default locale
        // ...
    }
Setting the global locale does not change the behavior of existing streams that are using the previous value of the global locale. In particular, cin, cout, etc., are not affected. If they should be changed, they must be explicitly imbue()d.
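The difference between existing and newly constructed streams can be observed directly. Named locales vary between systems, so this sketch assumes a locale built from the classic locale plus a hypothetical numpunct facet (comma_grouping) that groups digits in threes; it also assumes the global locale is the classic one when the function is called. All names here are illustrative.

```cpp
#include <locale>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical facet: group digits in threes, separated by commas.
class comma_grouping : public std::numpunct<char> {
protected:
    virtual char do_thousands_sep() const { return ','; }
    virtual std::string do_grouping() const { return "\3"; }
};

// Returns the text written by a stream created before the global locale
// was changed (first) and by one created afterwards (second).
std::pair<std::string, std::string> global_locale_demo()
{
    std::ostringstream before;      // picks up the current global locale
    const std::locale grouped(std::locale::classic(), new comma_grouping);
    const std::locale old_global = std::locale::global(grouped);
    std::ostringstream after;       // picks up the new global locale

    before << 1234567;
    after << 1234567;

    std::locale::global(old_global);    // restore the previous global locale
    return std::make_pair(before.str(), after.str());
}
```

Under those assumptions the first string is written without separators and the second with them, because only the stream constructed after the call of locale::global() sees the new locale.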
Imbuing a stream with a locale changes facets of its behavior. It is possible to use members of a locale directly, to define new locales, and to extend locales with new facets. For example, a locale can also be used explicitly to control the appearance of monetary units, dates, etc., on input and output (§21.10[25]) and conversion between codesets. However, discussion of that is beyond the scope of this book. Please consult your implementation’s documentation.
The C-style locale is presented in <clocale> and <locale.h>.
21.7.1 Stream Callbacks [io.callbacks]
Sometimes, people want to add to the state of a stream. For example, one might want a stream to “know” whether a complex should be output in polar or Cartesian coordinates. Class ios_base provides a function xalloc() to allocate space for such simple state information. The value returned by xalloc() identifies a pair of locations that can be accessed by iword() and pword():
    class ios_base {
    public:
        // ...
        ~ios_base();

        locale imbue(const locale& loc);    // get and set locale
        locale getloc() const;              // get locale

        static int xalloc();                // get an integer and a pointer (both initialized to 0)
        long& iword(int i);                 // access the integer iword(i)
        void*& pword(int i);                // access the pointer pword(i)

        // callbacks:

        enum event { erase_event, imbue_event, copyfmt_event };     // event type

        typedef void (*event_callback)(event, ios_base&, int i);
        void register_callback(event_callback f, int i);            // attach f to word(i)
    };
Sometimes, an implementer or a user needs to be notified about a change in a stream’s state. The register_callback() function “registers” a function to be called when its “event” occurs. Thus, a call of imbue(), copyfmt(), or ~ios_base() will call a function “registered” for an imbue_event, copyfmt_event, or erase_event, respectively. When the state changes, registered functions are called with the argument i supplied by their register_callback().
This storage and callback mechanism is fairly obscure. Use it only when you absolutely need to extend the low-level formatting facilities.
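The polar/Cartesian example might be sketched as follows. The manipulators polar and cartesian, the function put, and the callback count_imbue are all hypothetical names; only xalloc(), iword(), and register_callback() are library facilities. The output notation (magnitude@angle) is likewise just an assumption for the example.

```cpp
#include <complex>
#include <ios>
#include <locale>
#include <ostream>
#include <sstream>

// One index for the whole program, obtained from xalloc() on first use.
int polar_index()
{
    static const int i = std::ios_base::xalloc();
    return i;
}

// Hypothetical manipulators that record the chosen notation in the stream.
std::ostream& polar(std::ostream& os)     { os.iword(polar_index()) = 1; return os; }
std::ostream& cartesian(std::ostream& os) { os.iword(polar_index()) = 0; return os; }

// Hypothetical output function that consults the per-stream flag.
void put(std::ostream& os, const std::complex<double>& z)
{
    if (os.iword(polar_index()))
        os << std::abs(z) << '@' << std::arg(z);            // magnitude@angle
    else
        os << '(' << z.real() << ',' << z.imag() << ')';    // (re,im)
}

// Hypothetical callback: count imbue() calls in the iword slot it is given.
void count_imbue(std::ios_base::event e, std::ios_base& s, int i)
{
    if (e == std::ios_base::imbue_event) ++s.iword(i);
}
```

Because iword() slots start at 0, a stream that has never seen the polar manipulator uses Cartesian notation. The callback is attached with s.register_callback(count_imbue, n), where n is an index obtained from xalloc(); after that, each call of s.imbue() increments s.iword(n).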
21.8 C Input/Output [io.c]
Because C++ and C code are often intermixed, C++ stream I/O is sometimes mixed with the C printf() family of I/O functions. The C-style I/O functions are presented by <cstdio> and <stdio.h>. Also, because C functions can be called from C++, some programmers may prefer to use the more familiar C I/O functions. Even if you prefer stream I/O, you will undoubtedly encounter C-style I/O at some time.
C and C++ I/O can be mixed on a per-character basis. A call of sync_with_stdio() before the first stream I/O operation in the execution of a program guarantees that the C-style and C++-style I/O operations share buffers. A call of sync_with_stdio(false) before the first stream I/O operation prevents buffer sharing and can improve I/O performance on some implementations.
    class ios_base {
        // ...
        static bool sync_with_stdio(bool sync = true);  // get and set
    };
The general advantage of the stream output functions over the C standard library function printf() is that the stream functions are type safe and have a common style for specifying output of objects of built-in and user-defined types.
The general C output functions
    int printf(const char* format ...);             // write to stdout
    int fprintf(FILE*, const char* format ...);     // write to “file” (stdout, stderr)
    int sprintf(char* p, const char* format ...);   // write to p[0]..
produce formatted output of an arbitrary sequence of arguments under control of the format string format. The format string contains two types of objects: plain characters, which are simply copied to the output stream, and conversion specifications, each of which causes conversion and printing of the next argument. Each conversion specification is introduced by the character %. For example:
    printf("there were %d members present.",no_of_members);
Here %d specifies that no_of_members is to be treated as an int and printed as the appropriate sequence of decimal digits. With no_of_members==127, the output is
    there were 127 members present.
The set of conversion specifications is quite large and provides a great degree of flexibility. Following the %, there may be:
-   an optional minus sign that specifies left-adjustment of the converted value in the field;
+   an optional plus sign that specifies that a value of a signed type will always begin with a + or - sign;
#   an optional # that specifies that floating-point values will be printed with a decimal point even if no nonzero digits follow, that trailing zeroes will be printed, that octal values will be printed with an initial 0, and that hexadecimal values will be printed with an initial 0x or 0X;
d   an optional digit string specifying a field width; if the converted value has fewer characters than the field width, it will be blank-padded on the left (or right, if the left-adjustment indicator has been given) to make up the field width; if the field width begins with a zero, zero-padding will be done instead of blank-padding;
.   an optional period that serves to separate the field width from the next digit string;
d   an optional digit string specifying a precision that specifies the number of digits to appear after the decimal point, for e- and f-conversion, or the maximum number of characters to be printed from a string;
*   a field width or precision may be * instead of a digit string. In this case an integer argument supplies the field width or precision;
h   an optional character h, specifying that a following d, o, x, or u corresponds to a short integer argument;
l   an optional character l, specifying that a following d, o, x, or u corresponds to a long integer argument;
%   indicating that the character % is to be printed; no argument is used;
c   a character that indicates the type of conversion to be applied. The conversion characters and their meanings are:
    d   The integer argument is converted to decimal notation;
    o   The integer argument is converted to octal notation;
    x   The integer argument is converted to hexadecimal notation (with an initial 0x if # is specified);
    X   The integer argument is converted to hexadecimal notation (with an initial 0X if # is specified);
    f   The float or double argument is converted to decimal notation in the style [-]ddd.ddd. The number of d’s after the decimal point is equal to the precision for the argument. If necessary, the number is rounded. If the precision is missing, six digits are given; if the precision is explicitly 0 and # isn’t specified, no decimal point is printed;
    e   The float or double argument is converted to decimal notation in the scientific style [-]d.ddde+dd or [-]d.ddde-dd, where there is one digit before the decimal point and the number of digits after the decimal point is equal to the precision specification for the argument. If necessary, the number is rounded. If the precision is missing, six digits are given; if the precision is explicitly 0 and # isn’t specified, no digits and no decimal point are printed;
    E   As e, but with an uppercase E used to identify the exponent;
    g   The float or double argument is printed in style d, in style f, or in style e, whichever gives the greatest precision in minimum space;
    G   As g, but with an uppercase E used to identify the exponent;
    c   The character argument is printed. Null characters are ignored;
    s   The argument is taken to be a string (character pointer), and characters from the string are printed until a null character or until the number of characters indicated by the precision specification is reached; however, if the precision is 0 or missing, all characters up to a null are printed;
    p   The argument is taken to be a pointer. The representation printed is implementation-dependent;
    u   The unsigned integer argument is converted to decimal notation.
In no case does a nonexistent or small field width cause truncation of a field; padding takes place only if the specified field width exceeds the actual width.
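A few of these specifications can be tried together. The sketch below uses std::snprintf(), a bounded relative of sprintf() from <cstdio> that is assumed to be available (it was added to the C++ library after this text was written); the function name format_samples is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format several values into one string so the effect
// of each conversion specification can be compared side by side.
std::string format_samples()
{
    char buf[128];
    std::snprintf(buf, sizeof buf,
                  "[%5d][%-5d][%05d][%+d][%x][%#x][%.2f][%.3s]",
                  42,        // field width 5, blank-padded on the left
                  42,        // field width 5, left-adjusted
                  42,        // field width 5, zero-padded
                  42,        // always print a sign
                  255u,      // hexadecimal
                  255u,      // hexadecimal with the # flag
                  3.14159,   // two digits after the decimal point
                  "abcdef"); // at most three characters of the string
    return buf;
}
```

The result is "[   42][42   ][00042][+42][ff][0xff][3.14][abc]". Each argument must match its conversion specification exactly; nothing checks this at run time.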
Here is a more elaborate example:
E
g
    char* line_format = "\n#line %d \"%s\"\n";

    int main()
    {
        int line = 13;
        char* file_name = "C++/main.c";

        printf("int a;\n");
        printf(line_format,line,file_name);
        printf("int b;\n");
    }
which produces:
    int a;

    #line 13 "C++/main.c"
    int b;
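The effect of precision and field width on the f and e conversions described above can be checked with a small helper. This is only a sketch: format_double() is a hypothetical name introduced here, and it relies on snprintf() from <cstdio>, which assumes a C99 or C++11 library rather than the 1997 one.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format one double using a printf-style format
// that must contain exactly one floating-point conversion.
std::string format_double(const char* fmt, double value)
{
    char buf[64];
    std::snprintf(buf, sizeof buf, fmt, value);   // output is truncated to fit buf
    return buf;
}
```

With this helper, "%f" applied to 3.14159 gives six digits after the decimal point, "%.2f" gives two, "%.0f" drops the decimal point, and "%8.3f" pads on the left because the field width exceeds the actual width.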
Using printf() is unsafe in the sense that type checking is not done. For example, here is a well-known way of getting unpredictable output, a core dump, or worse:
    char x;
    // ...
    printf("bad input char: %s",x);    // %s should have been %c
The printf() function does, however, provide great flexibility in a form that is familiar to C programmers.
Similarly, getchar() provides a familiar way of reading characters from input:
    int i;
    while ((i=getchar())!=EOF) {    // C character input
        // use i
    }
Note that to be able to test for end-of-file against the int value EOF, the value of getchar() must
be put into an int rather than into a char.
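The reason for the int can be illustrated without doing any input. The two functions below are hypothetical helpers written for this sketch; they assume only that EOF is negative, as the C library requires. On an implementation where char is signed and EOF is -1, the byte 0xFF becomes indistinguishable from EOF once it has been stored in a char.

```cpp
#include <cstdio>

// getchar() returns the character as an unsigned char converted to int,
// so a real character is never negative and never equals EOF.
bool distinct_from_eof_as_int(unsigned char byte)
{
    int i = byte;
    return i != EOF;
}

// The same byte after being stored in a char. Where char is signed,
// the value 0xFF typically becomes -1 (implementation-defined before C++20).
bool distinct_from_eof_as_char(unsigned char byte)
{
    char c = static_cast<char>(byte);
    return c != EOF;
}
```

The first function returns true for every byte value; the second may return false for 0xFF, which is exactly the case a loop using a char variable would mistake for end-of-file.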
For further details of C I/O, see your C reference manual or Kernighan and Ritchie: The C Programming Language [Kernighan,1988].
21.9 Advice [io.advice]
[1] Define << and >> for user-defined types with values that have meaningful textual representations; §21.2.3, §21.3.5.
[2] Use parentheses when printing expressions containing operators of low precedence; §21.2.
[3] You don’t need to modify istream or ostream to add new << and >> operators; §21.2.3.
[4] You can define a function so that it behaves as a virtual function based on its second (or subsequent) argument; §21.2.3.1.
[5] Remember that by default >> skips whitespace; §21.3.2.
[6] Use lower-level input functions such as get() and read() primarily in the implementation of
higher-level input functions; §21.3.4.
[7] Be careful with the termination criteria when using get(), getline(), and read(); §21.3.4.
[8] Prefer manipulators to state flags for controlling I/O; §21.3.3, §21.4, §21.4.6.
[9] Use exceptions to catch rare I/O errors (only); §21.3.6.
[10] Tie streams used for interactive I/O; §21.3.7.
[11] Use sentries to concentrate entry and exit code for many functions in one place; §21.3.8.
[12] Don’t use parentheses after a no-argument manipulator; §21.4.6.2.
[13] Remember to #include <iomanip> when using standard manipulators; §21.4.6.2.
[14] You can achieve the effect (and efficiency) of a ternary operator by defining a simple function
object; §21.4.6.3.
[15] Remember that width specifications apply to the following I/O operation only; §21.4.4.
[16] Remember that precision specifications apply to all following floating-point output operations; §21.4.3.
[17] Use string streams for in-memory formatting; §21.5.3.
[18] You can specify a mode for a file stream; §21.5.1.
[19] Distinguish sharply between formatting (iostreams) and buffering (streambufs) when extending the I/O system; §21.1, §21.6.
[20] Implement nonstandard ways of transmitting values as stream buffers; §21.6.4.
[21] Implement nonstandard ways of formatting values as stream operations; §21.2.3, §21.3.5.
[22] You can isolate and encapsulate calls of user-defined code by using a pair of functions;
§21.6.4.
[23] You can use in_avail() to determine whether an input operation will block before reading;
§21.6.4.
[24] Distinguish between simple operations that need to be efficient and operations that implement
policy (make the former inline and the latter virtual); §21.6.4.
[25] Use locale to localize ‘‘cultural differences;’’ §21.7.
[26] Use sync_with_stdio(x) to mix C-style and C++-style I/O and to disassociate C-style and
C++-style I/O; §21.8.
[27] Beware of type errors in C-style I/O; §21.8.
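Items [15] and [16] are easy to confuse, so a small sketch may help. The function name format_demo() is made up for this illustration; it uses a string stream (item [17]) so that the result can be inspected.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Shows that precision persists across output operations,
// whereas width applies to the next operation only.
std::string format_demo()
{
    std::ostringstream os;
    os << std::setprecision(3);             // stays in effect for later floating-point output
    os << std::setw(6) << 1.23456 << '|';   // width 6 pads this one value only
    os << 2.34567 << '|';                   // precision 3 still applies; no padding
    return os.str();
}
```

The first value is padded to six characters; the second is not padded but is still printed with three significant digits.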
21.10 Exercises [io.exercises]
1. (∗1.5) Read a file of floating-point numbers, make complex numbers out of pairs of numbers
read, and write out the complex numbers.
2. (∗1.5) Define a type Name_and_address. Define << and >> for it. Copy a stream of
Name_and_address objects.
3. (∗2.5) Copy a stream of Name_and_address objects in which you have inserted as many errors
as you can think of (e.g., format errors and premature end of string). Handle these errors in a
way that ensures that the copy function reads most of the correctly formatted
Name_and_addresses, even when the input is completely messed up.
4. (∗2.5) Redefine the I/O format Name_and_address to make it more robust in the presence of
format errors.
5. (∗2.5) Design some functions for requesting and reading information of various types. Ideas:
integer, floating-point number, file name, mail address, date, personal information, etc. Try to
make them foolproof.
6. (∗1.5) Write a program that prints (a) all lowercase letters, (b) all letters, (c) all letters and digits, (d) all characters that may appear in a C++ identifier on your system, (e) all punctuation
characters, (f) the integer value of all control characters, (g) all whitespace characters, (h) the
integer value of all whitespace characters, and finally (i) all printing characters.
7. (∗2) Read a sequence of lines of text into a fixed-sized character buffer. Remove all whitespace
characters and replace each alphanumeric character with the next character in the alphabet
(replace z by a and 9 by 0). Write out the resulting line.
8. (∗3) Write a ‘‘miniature’’ stream I/O system that provides classes istream, ostream, ifstream,
ofstream providing functions such as operator<<() and operator>>() for integers and operations such as open() and close() for files.
9. (∗4) Implement the C standard I/O library (<stdio.h>) using the C++ standard I/O library
(<iostream>).
10. (∗4) Implement the C++ standard I/O library (<iostream>) using the C standard I/O library
(<stdio.h>).
11. (∗4) Implement the C and C++ libraries so that they can be used simultaneously.
12. (∗2) Implement a class for which [] is overloaded to implement random reading of characters
from a file.
13. (∗3) Repeat §21.10[12] but make [] useful for both reading and writing. Hint: Make [] return
an object of a ‘‘descriptor type’’ for which assignment means ‘‘assign through descriptor to
file’’ and implicit conversion to char means ‘‘read from file through descriptor.’’
14. (∗2) Repeat §21.10[13] but let [] index objects of arbitrary types, not just characters.
15. (∗3.5) Implement versions of istream and ostream that read and write numbers in their binary
form rather than converting them into a character representation. Discuss the advantages and
disadvantages of this approach compared to the character-based approach.
16. (∗3.5) Design and implement a pattern-matching input operation. Use printf-style format
strings to specify a pattern. It should be possible to try out several patterns against some input
to find the actual format. One might derive a pattern-matching input class from istream.
17. (∗4) Invent (and implement) a much better kind of pattern for pattern matching. Be specific
about what is better about it.
18. (∗2) Define an output manipulator based that takes two arguments – a base and an int value –
and outputs the integer in the representation specified by the base. For example, based(2,9)
should print 1001.
19. (∗2) Write manipulators that turn character echoing on and off.
20. (∗2) Implement Bound_form from §21.4.6.3 for the usual set of built-in types.
21. (∗2) Re-implement Bound_form from §21.4.6.3 so that an output operation never overflows its
width(). It should be possible for a programmer to ensure that output is never quietly truncated beyond its specified precision.
22. (∗3) Implement an encrypt(k) manipulator that ensures that output on its ostream is encrypted
using the key k. Provide a similar decrypt(k) manipulator for an istream. Provide the means
for turning the encryption off for a stream so that further I/O is cleartext.
23. (∗2) Trace a character’s route through your system from the keyboard to the screen for a simple:
    char c;
    cin >> c;
    cout << c << endl;
24. (∗2) Modify readints() (§21.3.6) to handle all exceptions. Hint: Resource acquisition is
initialization.
25. (∗2.5) There is a standard way of reading, writing, and representing dates under control of a
locale. Find it in the documentation of your implementation and write a small program that
reads and writes dates using this mechanism. Hint: struct tm.
26. (∗2.5) Define an ostream called ostrstream that can be attached to an array of characters (a C-style string) in a way similar to the way ostringstream is attached to a string. However, do not
copy the array into or out of the ostrstream. The ostrstream should simply provide a way of
writing to its array argument. It might be used for in-memory formatting like this:
    char buf[message_size];
    ostrstream ost(buf,message_size);
    do_something(arguments,ost);    // output to buf through ost
    cout << buf;                    // ost adds terminating 0
An operation such as do_something() can write to the stream ost, pass ost on to its suboperations, etc., using the standard output operations. There is no need to check for overflow because
ost knows its size and will go into fail() state when it is full. Finally, a display() operation
can write the message to a ‘‘real’’ output stream. This technique can be most useful for coping
with cases in which the final display operation involves writing to something more complicated
than a traditional line-oriented output device. For example, the text from ost could be placed in
a fixed-sized area somewhere on a screen. Similarly, define class istrstream as an input string
stream reading from a zero-terminated string of characters. Interpret the terminating zero character as end-of-file. These strstreams were part of the original streams library and can often be
found in <strstream.h>.
27. (∗2.5) Implement a manipulator general() that resets a stream to its original (general) format
in the same way a scientific() (§21.4.6.2) sets a stream to use scientific format.
22
Numerics
The purpose of computing is insight, not numbers.
– R.W. Hamming
... but for the student,
numbers are often the best road to insight.
– A. Ralston
Introduction – numeric limits – mathematical functions – valarray – vector operations – slices – slice_array – elimination of temporaries – gslice_array – mask_array – indirect_array – complex – generalized algorithms – random numbers – advice – exercises.
22.1 Introduction [num.intro]
It is rare to write any real code without doing some calculation. However, most code requires little
mathematics beyond simple arithmetic. This chapter presents the facilities the standard library
offers to people who go beyond that.
Neither C nor C++ was designed primarily with numeric computation in mind. However,
numeric computation typically occurs in the context of other work – such as database access, networking, instrument control, graphics, simulation, financial analysis, etc. – so C++ becomes an
attractive vehicle for computations that are part of a larger system. Furthermore, numeric methods
have come a long way from being simple loops over vectors of floating-point numbers. Where
more complex data structures are needed as part of a computation, C++’s strengths become relevant. The net effect is that C++ is increasingly used for scientific and engineering computation
involving sophisticated numerics. Consequently, facilities and techniques supporting such computation have emerged. This chapter describes the parts of the standard library that support numerics
and presents a few techniques for dealing with issues that arise when people express numeric
computations in C++. I make no attempt to teach numeric methods. Numeric computation is a fascinating topic in its own right. To understand it, you need a good course in numerical methods or
at least a good textbook – not just a language manual and tutorial.
22.2 Numeric Limits [num.limits]
To do anything interesting with numbers, we typically need to know something about general properties of built-in numeric types that are implementation-defined rather than fixed by the rules of the
language itself (§4.6). For example, what is the largest int? What is the smallest float? Is a double rounded or truncated when assigned to a float? How many bits are there in a char?
Answers to such questions are provided by the specializations of the numeric_limits template
presented in <limits>. For example:
    void f(double d, int i)
    {
        if (numeric_limits<unsigned char>::digits != 8) {
            // unusual bytes (number of bits not 8)
        }

        if (i<numeric_limits<short>::min() || numeric_limits<short>::max()<i) {
            // i cannot be stored in a short without loss of precision
        }

        if (0<d && d<numeric_limits<double>::epsilon()) d = 0;

        if (numeric_limits<Quad>::is_specialized) {
            // limits information available for type Quad
        }
    }
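Two of the tests in f() can be packaged as small functions. The names fits_in_short() and flush_tiny() are invented for this sketch and are not part of the standard library.

```cpp
#include <limits>

// True if i can be stored in a short without changing its value.
bool fits_in_short(int i)
{
    return std::numeric_limits<short>::min() <= i
        && i <= std::numeric_limits<short>::max();
}

// Replace a positive value smaller than epsilon() by zero, as f() does;
// every other value is returned unchanged.
double flush_tiny(double d)
{
    if (0 < d && d < std::numeric_limits<double>::epsilon()) return 0;
    return d;
}
```

Because the limits come from numeric_limits rather than from literals such as 32767, the functions keep working on an implementation with a different size of short.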
Each specialization provides the relevant information for its argument type. Thus, the general
numeric_limits template is simply a notational handle for a set of constants and inline functions:
    template<class T> class numeric_limits {
    public:
        static const bool is_specialized = false;   // is information available for numeric_limits<T>?

        // uninteresting defaults
    };
The real information is in the specializations. Each implementation of the standard library provides
a specialization of numeric_limits for each fundamental type (the character types, the integer and
floating-point types, and bool) but not for any other plausible candidates such as void, enumerations, or library types (such as complex<double>).
For an integral type such as char, only a few pieces of information are of interest. Here is
numeric_limits<char> for an implementation in which a char has 8 bits and is signed:
    template<> class numeric_limits<char> {
    public:
        static const bool is_specialized = true;    // yes, we have information

        static const int digits = 7;        // number of bits (‘‘binary digits’’) excluding sign

        static const bool is_signed = true;     // this implementation has char signed
        static const bool is_integer = true;    // char is an integral type

        inline static char min() throw() { return -128; }   // smallest value
        inline static char max() throw() { return 127; }    // largest value

        // lots of declarations not relevant to a char
    };
Most members of numeric_limits are intended to describe floating-point numbers. For example,
this describes one possible implementation of float:
    template<> class numeric_limits<float> {
    public:
        static const bool is_specialized = true;

        static const int radix = 2;         // base of exponent (in this case, binary)
        static const int digits = 24;       // number of radix digits in mantissa
        static const int digits10 = 6;      // number of base 10 digits in mantissa

        static const bool is_signed = true;
        static const bool is_integer = false;
        static const bool is_exact = false;

        inline static float min() throw() { return 1.17549435E-38F; }
        inline static float max() throw() { return 3.40282347E+38F; }

        inline static float epsilon() throw() { return 1.19209290E-07F; }
        inline static float round_error() throw() { return 0.5F; }

        inline static float infinity() throw() { return /* some value */; }
        inline static float quiet_NaN() throw() { return /* some value */; }
        inline static float signaling_NaN() throw() { return /* some value */; }
        inline static float denorm_min() throw() { return min(); }

        static const int min_exponent = -125;
        static const int min_exponent10 = -37;
        static const int max_exponent = +128;
        static const int max_exponent10 = +38;

        static const bool has_infinity = true;
        static const bool has_quiet_NaN = true;
        static const bool has_signaling_NaN = true;
        static const float_denorm_style has_denorm = denorm_absent;    // enum from <limits>
        static const bool has_denorm_loss = false;

        static const bool is_iec559 = true;     // conforms to IEC-559
        static const bool is_bounded = true;
        static const bool is_modulo = false;
        static const bool traps = true;
        static const bool tinyness_before = true;

        static const float_round_style round_style = round_to_nearest;     // enum from <limits>
    };
Note that min() is the smallest positive normalized number and that epsilon() is the difference between 1 and the smallest representable value greater than 1.
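The meaning of epsilon() can be observed directly. The sketch below assumes IEC 559 arithmetic in the default round-to-nearest mode; changes_one() is a name made up for the illustration.

```cpp
#include <limits>

// True if adding x to 1.0 produces a double different from 1.0.
// The volatile store forces the sum to be rounded to double precision,
// which guards against a compiler keeping extra bits in a register.
bool changes_one(double x)
{
    volatile double sum = 1.0 + x;
    return sum != 1.0;
}
```

Adding epsilon() to 1 yields the next representable value, so the result differs from 1. Adding half of epsilon() gives a value exactly halfway between two representable numbers, which round-to-nearest resolves back to 1.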
When defining a scalar type along the lines of the built-in ones, it is a good idea also to provide
a suitable specialization of numeric_limits. For example, if I wrote a quadruple-precision type
Quad or if a vendor provided an extended-precision integer long long, a user could reasonably
expect numeric_limits<Quad> and numeric_limits<long long> to be supplied.
One can imagine specializations of numeric_limits describing properties of user-defined types
that have little to do with floating-point numbers. In such cases, it is usually better to use the general technique for describing properties of a type than to specialize numeric_limits with properties
not considered in the standard.
Floating-point values are represented as inline functions. Integral values in numeric_limits,
however, must be represented in a form that allows them to be used in constant expressions. That
implies that they must have in-class initializers (§10.4.6.2). If you use static const members rather
than enumerators for that, remember to define the statics.
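As an illustration of these points, here is a sketch of a specialization for a user-defined scalar type. The type Fixed is hypothetical, and only a handful of members are shown; a complete specialization would supply every member of numeric_limits.

```cpp
#include <limits>

// Hypothetical user-defined scalar: a fixed-point number stored in a long.
struct Fixed {
    long raw;
    explicit Fixed(long r = 0) : raw(r) {}
};

namespace std {
    template<> class numeric_limits<Fixed> {
    public:
        // integral values: in-class initializers make them usable in constant expressions
        static const bool is_specialized = true;
        static const bool is_signed = true;
        static const bool is_integer = false;
        static const bool is_exact = true;
        static const int radix = 2;

        // values of the type itself: provided by functions
        static Fixed min() { return Fixed(numeric_limits<long>::min()); }
        static Fixed max() { return Fixed(numeric_limits<long>::max()); }
    };

    // the static members must also be defined (once, in one translation unit)
    const bool numeric_limits<Fixed>::is_specialized;
    const bool numeric_limits<Fixed>::is_signed;
    const bool numeric_limits<Fixed>::is_integer;
    const bool numeric_limits<Fixed>::is_exact;
    const int numeric_limits<Fixed>::radix;
}

// radix is a constant expression, so it can be used as an array bound
typedef char radix_sized[std::numeric_limits<Fixed>::radix];
```

Generic code can now test numeric_limits<Fixed>::is_specialized just as it would for a built-in type.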
22.2.1 Limit Macros [num.limit.c]
From C, C++ inherited macros that describe properties of integers. These are found in <climits>
and <limits.h> and have names such as CHAR_BIT and INT_MAX. Similarly, <cfloat> and
<float.h> define macros describing properties of floating-point numbers. They have names such
as DBL_MIN_EXP, FLT_RADIX, and LDBL_MAX.
As ever, macros are best avoided.
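Each of the macros mentioned has a numeric_limits counterpart. The sketch below pairs them up; the function names are invented for the illustration. The template largest() shows something a macro cannot do: the type can be a template parameter.

```cpp
#include <cfloat>
#include <climits>
#include <limits>

// Each function compares a C limit macro with the numeric_limits member
// that reports the same property.
bool int_max_agrees()     { return INT_MAX == std::numeric_limits<int>::max(); }
bool char_bit_agrees()    { return CHAR_BIT == std::numeric_limits<unsigned char>::digits; }
bool dbl_min_exp_agrees() { return DBL_MIN_EXP == std::numeric_limits<double>::min_exponent; }
bool flt_radix_agrees()   { return FLT_RADIX == std::numeric_limits<float>::radix; }
bool ldbl_max_agrees()    { return LDBL_MAX == std::numeric_limits<long double>::max(); }

// Generic code can name the type as a parameter; no macro name needs to be chosen.
template<class T> T largest() { return std::numeric_limits<T>::max(); }
```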
22.3 Standard Mathematical Functions [num.math]
The headers <cmath> and <math.h> provide what is commonly referred to as ‘‘the usual mathematical functions:’’
    double abs(double);         // absolute value; not in C, same as fabs()
    double fabs(double);        // absolute value

    double ceil(double d);      // smallest integer not less than d
    double floor(double d);     // largest integer not greater than d

    double sqrt(double d);      // square root of d, d must be non-negative

    double pow(double d, double e);     // d to the power of e,
                                        // error if d==0 and e<=0 or if d<0 and e isn’t an integer.
    double pow(double d, int i);        // d to the power of i; not in C

    double cos(double);         // cosine
    double sin(double);         // sine
    double tan(double);         // tangent

    double acos(double);        // arc cosine
    double asin(double);        // arc sine
    double atan(double);        // arc tangent
    double atan2(double x, double y);   // atan(x/y)

    double sinh(double);        // hyperbolic sine
    double cosh(double);        // hyperbolic cosine
    double tanh(double);        // hyperbolic tangent

    double exp(double);         // exponential, base e
    double log(double d);       // natural (base e) logarithm, d must be > 0
    double log10(double d);     // base 10 logarithm, d must be > 0

    double modf(double d, double* p);   // return fractional part of d, place integral part in *p
    double frexp(double d, int* p);     // find x in [.5,1) and y so that d = x*pow(2,y),
                                        // return x and store y in *p
    double fmod(double d, double m);    // floating-point remainder, same sign as d
    double ldexp(double d, int i);      // d*pow(2,i)
In addition, <cmath> and <math.h> supply these functions for float and long double arguments.
Where several values are possible results – as with asin() – the one nearest to 0 is returned.
The result of acos() is non-negative.
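The functions that return a result through a pointer argument are the least obvious of the set, so here is a small sketch of modf() and frexp(). The structs Parts and Scaled and the functions split() and decompose() are hypothetical wrappers introduced only to make the two results visible together.

```cpp
#include <cmath>

struct Parts { double integral; double fractional; };

// Split d into its integral and fractional parts using modf().
Parts split(double d)
{
    Parts p;
    p.fractional = std::modf(d, &p.integral);   // both parts have the sign of d
    return p;
}

struct Scaled { double mantissa; int exponent; };

// Decompose d so that d == mantissa * pow(2,exponent), with |mantissa| in [.5,1).
Scaled decompose(double d)
{
    Scaled s;
    s.mantissa = std::frexp(d, &s.exponent);
    return s;
}
```

For example, split(3.25) yields 3 and .25, and decompose(8) yields .5 and 4; ldexp(.5,4) reverses the decomposition and gives back 8.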
Errors are reported by setting errno from <cerrno> to EDOM for a domain error and to
ERANGE for a range error. For example:
    void f()
    {
        errno = 0;  // clear old error state
        sqrt(-1);
        if (errno==EDOM) cerr << "sqrt() not defined for negative argument";

        pow(numeric_limits<double>::max(),2);
        if (errno == ERANGE) cerr << "result of pow() too large to represent as a double";
    }
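A variant of this test that can be called repeatedly is sketched below. The function names are made up for the example. Note one assumption that goes beyond the text: later C and C++ standards let an implementation report math errors through floating-point exceptions instead of errno, and provide the macros math_errhandling and MATH_ERRNO (C++11 <cmath>) to ask which method is in use.

```cpp
#include <cerrno>
#include <cmath>

// Returns the value of errno observed after calling sqrt(x); 0 means no error was reported.
int errno_after_sqrt(double x)
{
    volatile double arg = x;    // volatile keeps the call from being evaluated at compile time
    errno = 0;                  // clear old error state
    volatile double result = std::sqrt(arg);
    (void)result;               // the value itself is not needed here
    return errno;
}

// True if this implementation reports math errors by setting errno.
bool reports_via_errno()
{
    return (math_errhandling & MATH_ERRNO) != 0;
}
```

Where reports_via_errno() is true, errno_after_sqrt(-1) returns EDOM; elsewhere the error has to be detected by other means, such as examining the result.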
For historical reasons, a few mathematical functions are found in the <cstdlib> header rather than
in <cmath>:
    int abs(int);           // absolute value
    long abs(long);         // absolute value (not in C)
    long labs(long);        // absolute value

    struct div_t { implementation_defined quot, rem; };
    struct ldiv_t { implementation_defined quot, rem; };

    div_t div(int n, int d);                // divide n by d, return (quotient,remainder)
    ldiv_t div(long int n, long int d);     // divide n by d, return (quotient,remainder) (not in C)
    ldiv_t ldiv(long int n, long int d);    // divide n by d, return (quotient,remainder)
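A typical use of div() is to obtain quotient and remainder together. The functions minutes() and leftover() below are hypothetical names for a sketch that splits a number of seconds into whole minutes and remaining seconds.

```cpp
#include <cstdlib>

// Whole minutes in a number of seconds (the quotient truncates toward zero).
int minutes(int seconds) { return std::div(seconds, 60).quot; }

// Seconds left over after removing whole minutes (the remainder has the sign of the dividend).
int leftover(int seconds) { return std::div(seconds, 60).rem; }
```

For 125 seconds the result is 2 minutes and 5 seconds; for -125 it is -2 and -5.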
22.4 Vector Arithmetic [num.valarray]
Much numeric work relies on relatively simple single-dimensional vectors of floating-point values.
In particular, such vectors are well supported by high-performance machine architectures, libraries
relying on such vectors are in wide use, and very aggressive optimization of code using such
vectors is considered essential in many fields. Consequently, the standard library provides a vector
– called valarray – designed specifically for speed of the usual numeric vector operations.
When looking at the valarray facilities, it is wise to remember that they are intended as a relatively low-level building block for high-performance computation. In particular, the primary
design criterion wasn’t ease of use, but rather effective use of high-performance computers when
relying on aggressive optimization techniques. If your aim is flexibility and generality rather than
efficiency, you are probably better off building on the standard containers from Chapter 16 and
Chapter 17 than trying to fit into the simple, efficient, and deliberately traditional framework of
valarray.
One could argue that valarray should have been called vector because it is a traditional mathematical vector and that vector (§16.3) should have been called array. However, this is not the way
the terminology evolved. A valarray is a vector optimized for numeric computation, a vector is a
flexible container designed for holding and manipulating objects of a wide variety of types, and an
array is a low-level, built-in type.
The valarray type is supported by four auxiliary types for specifying subsets of a valarray:
– slice_array and gslice_array represent the notion of slices (§22.4.6, §22.4.8),
– mask_array specifies a subset by marking each element in or out (§22.4.9), and
– indirect_array lists the indices of the elements to be considered (§22.4.10).
22.4.1 Valarray Construction [num.valarray.ctor]
The valarray type and its associated facilities are defined in namespace std and presented in
<valarray>:
    template<class T> class std::valarray {
        // representation
    public:
        typedef T value_type;

        valarray();                             // valarray with size()==0
        explicit valarray(size_t n);            // n elements with value T()
        valarray(const T& val, size_t n);       // n elements with value val
        valarray(const T* p, size_t n);         // n elements with values p[0], p[1], ...
        valarray(const valarray& v);            // copy of v

        valarray(const slice_array<T>&);        // see §22.4.6
        valarray(const gslice_array<T>&);       // see §22.4.8
        valarray(const mask_array<T>&);         // see §22.4.9
        valarray(const indirect_array<T>&);     // see §22.4.10

        ~valarray();
        // ...
    };
This set of constructors allows us to initialize valarrays from the auxiliary numeric array types and
from single values. For example:
    valarray<double> v0;                // placeholder, we can assign to v0 later
    valarray<float> v1(1000);           // 1000 elements with value float()==0.0F

    valarray<int> v2(-1,2000);          // 2000 elements with value -1
    valarray<double> v3(100,9.8064);    // bad mistake: floating-point valarray size

    valarray<double> v4 = v3;           // v4 has v3.size() elements
In the two-argument constructors, the value comes before the number of elements. This differs
from the convention for other standard containers (§16.3.4).
The number of elements of an argument valarray to a copy constructor determines the size of
the resulting valarray.
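Because the argument order is so easy to get wrong, the two conventions are placed side by side in the sketch below. The function names are invented for the illustration.

```cpp
#include <cstddef>
#include <valarray>
#include <vector>

// valarray: value first, then the number of elements
std::valarray<int> make_valarray() { return std::valarray<int>(-1, 2000); }

// vector: number of elements first, then the value
std::vector<int> make_vector() { return std::vector<int>(2000, -1); }

// A copy has as many elements as its source.
std::size_t copied_size()
{
    std::valarray<int> a(-1, 2000);
    std::valarray<int> b = a;
    return b.size();
}
```

Both functions produce 2000 elements with the value -1; swapping the arguments in either would silently produce a container of a very different size.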
Most programs need data from tables or input; this is supported by a constructor that copies elements from a built-in array. For example:
    const double vd[] = { 0, 1, 2, 3, 4 };
    const int vi[] = { 0, 1, 2, 3, 4 };

    valarray<double> v3(vd,4);      // 4 elements: 0,1,2,3
    valarray<double> v4(vi,4);      // type error: vi is not pointer to double
    valarray<double> v5(vd,8);      // undefined: too few elements in initializer
This form of initialization is important because numeric software that produces data in the form of
large arrays is common.
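Since asking for more elements than the array holds is undefined, it can be worth deriving the count from the array itself. The following sketch does that; table, table_size, and from_table() are hypothetical names.

```cpp
#include <cstddef>
#include <valarray>

const double table[] = { 0, 1, 2, 3, 4 };
const std::size_t table_size = sizeof(table) / sizeof(table[0]);

// Copy the first n elements of table into a valarray,
// clamping n so that the constructor never reads past the end of the array.
std::valarray<double> from_table(std::size_t n)
{
    if (n > table_size) n = table_size;
    return std::valarray<double>(table, n);
}
```

Here from_table(4) holds 0,1,2,3, and from_table(8) holds all five elements rather than invoking undefined behavior.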
The valarray and its auxiliary facilities were designed for high-speed computing. This is
reflected in a few constraints on users and by a few liberties granted to implementers. Basically, an
implementer of valarray is allowed to use just about every optimization technique you can think
of. For example, operations may be inlined and the valarray operations are assumed to be free of
side effects (except on their explicit arguments of course). Also, valarrays are assumed to be alias
free, and the introduction of auxiliary types and the elimination of temporaries is allowed as long as
the basic semantics are maintained. Thus, the declarations in <valarray> may look somewhat different from what you find here (and in the standard), but they should provide the same operations
with the same meaning for code that doesn’t go out of the way to break the rules. In particular, the
elements of a valarray should have the usual copy semantics (§17.1.4).
22.4.2 Valarray Subscripting and Assignment [num.valarray.sub]
For valarrays, subscripting is used both to access individual elements and to obtain subarrays:
    template<class T> class valarray {
    public:
        // ...
        valarray& operator=(const valarray& v);     // copy v
        valarray& operator=(const T& val);          // assign val to every element

        T operator[](size_t) const;
        T& operator[](size_t);

        valarray operator[](slice) const;           // see §22.4.6
        slice_array<T> operator[](slice);

        valarray operator[](const gslice&) const;   // see §22.4.8
        gslice_array<T> operator[](const gslice&);

        valarray operator[](const valarray<bool>&) const;       // see §22.4.9
        mask_array<T> operator[](const valarray<bool>&);

        valarray operator[](const valarray<size_t>&) const;     // see §22.4.10
        indirect_array<T> operator[](const valarray<size_t>&);

        valarray& operator=(const slice_array<T>&);     // see §22.4.6
        valarray& operator=(const gslice_array<T>&);    // see §22.4.8
        valarray& operator=(const mask_array<T>&);      // see §22.4.9
        valarray& operator=(const indirect_array<T>&);  // see §22.4.10
        // ...
    };
A valarray can be assigned to another of the same size. As one would expect, v1=v2 copies every
element of v2 into its corresponding position in v1. If valarrays have different sizes, the result of
assignment is undefined. Because valarray is designed to be optimized for speed, it would be
unwise to assume that assigning with a valarray of the wrong size would cause an easily
comprehensible error (such as an exception) or other ‘‘reasonable’’ behavior.
In addition to this conventional assignment, it is possible to assign a scalar to a valarray. For
example, v=7 assigns 7 to every element of the valarray v. This may be surprising, and is best
understood as an occasionally useful degenerate case of the operator assignment operations
(§22.4.3).
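The two forms of assignment can be sketched as follows. The helpers checked_assign and set_all are hypothetical names, not library facilities; checked_assign simply refuses to assign when the sizes differ, on the assumption that the rule described above (mismatched sizes give undefined results) applies to the implementation in use.

```cpp
#include <cstddef>
#include <valarray>

// Hypothetical helper: copy src into dst only when the sizes agree.
// Returns false (and leaves dst untouched) instead of relying on
// whatever a size-mismatched assignment happens to do.
bool checked_assign(std::valarray<double>& dst, const std::valarray<double>& src)
{
    if (dst.size() != src.size()) return false;
    dst = src;              // element-by-element copy
    return true;
}

// Scalar assignment: every element of v gets the value x; the size is unchanged.
void set_all(std::valarray<double>& v, double x)
{
    v = x;
}
```

The explicit size test costs one comparison per assignment, which is usually negligible next to the copy itself.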
Subscripting with an integer behaves conventionally and does not perform range checking.
In addition to the selection of individual elements, valarray subscripting provides four ways of
extracting subarrays (§22.4.6). Conversely, assignment (and constructors §22.4.1) accepts such
subarrays as operands. The set of assignments on valarray ensures that it is not necessary to
convert an auxiliary array type, such as slice_array, to valarray before assigning it. An
implementation may similarly replicate other vector operations, such as + and *, to assure efficiency.
In addition, many powerful optimization techniques exist for vector operations involving slices and the
other auxiliary vector types.
22.4.3 Member Operations [num.valarray.member]
The obvious, as well as a few less obvious, member functions are provided:
template<class T> class valarray {
public:
    // ...
    valarray& operator*=(const T& arg);     // v[i]*=arg for every element
    // similarly: /=, %=, +=, -=, ^=, &=, |=, <<=, and >>=

    T sum() const;                          // sum of elements

    valarray shift(int i) const;            // logical shift (left for 0<i, right for i<0)
    valarray cshift(int i) const;           // cyclic shift (left for 0<i, right for i<0)

    valarray apply(T f(T)) const;           // result[i] = f(v[i]) for every element
    valarray apply(T f(const T&)) const;

    valarray operator-() const;             // result[i] = -v[i] for every element
    valarray operator+() const;             // result[i] = +v[i] for every element
    valarray operator~() const;             // result[i] = ~v[i] for every element
    valarray operator!() const;             // result[i] = !v[i] for every element

    T min() const;  // smallest value using < for comparison; if size()==0 the value is undefined
    T max() const;  // largest value using < for comparison; if size()==0 the value is undefined

    size_t size() const;                            // number of elements
    void resize(size_t n, const T& val = T());      // n elements with value val
};
For example, if v is a valarray, it can be scaled like this: v*=.2, and this: v/=1.3. That is,
applying a scalar to a vector means applying the scalar to each element of the vector. As usual, it is
easier to optimize uses of *= than uses of a combination of * and = (§11.3.1).
Note that the non-assignment operations construct a new valarray. For example:
double incr(double d) { return d+1; }

void f(valarray<double>& v)
{
    valarray<double> v2 = v.apply(incr);    // produce incremented valarray
}
This does not change the value of v. Unfortunately, apply() does not accept a function object
(§18.4) as an argument (§22.9[1]).
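One possible way around that restriction is a nonmember helper that takes any callable. The following is only a sketch: apply_fn and the function object Add are hypothetical names, and the helper makes no attempt at the optimizations an implementation's own apply() may perform.

```cpp
#include <cstddef>
#include <valarray>

// Hypothetical helper (not part of the standard library): like valarray::apply(),
// but accepts any callable, including a function object.
template<class T, class F>
std::valarray<T> apply_fn(const std::valarray<T>& v, F f)
{
    std::valarray<T> res(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) res[i] = f(v[i]);
    return res;             // v itself is left unchanged
}

// A function object that adds a fixed amount to its argument.
struct Add {
    double d;
    explicit Add(double dd) : d(dd) { }
    double operator()(double x) const { return x + d; }
};
```

For example, apply_fn(v,Add(1.5)) produces a new valarray with 1.5 added to every element of v.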
The logical and cyclic shift functions, shift() and cshift(), return a new valarray with the
elements suitably shifted and leave the original one unchanged. For example, the cyclic shift
v2=v.cshift(n) produces a valarray so that v2[i]==v[(i+n)%v.size()]. The logical shift
v3=v.shift(n) produces a valarray so that v3[i] is v[i+n] if i+n is a valid index for v.
Otherwise, the result is the default element value. This implies that both shift() and cshift() shift left
when given a positive argument and right when given a negative argument. For example:
void f()
{
    int alpha[] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    valarray<int> v(alpha,8);           // 1, 2, 3, 4, 5, 6, 7, 8
    valarray<int> v2 = v.shift(2);      // 3, 4, 5, 6, 7, 8, 0, 0
    valarray<int> v3 = v<<2;            // 4, 8, 12, 16, 20, 24, 28, 32
    valarray<int> v4 = v.shift(-2);     // 0, 0, 1, 2, 3, 4, 5, 6
    valarray<int> v5 = v>>2;            // 0, 0, 0, 1, 1, 1, 1, 2
    valarray<int> v6 = v.cshift(2);     // 3, 4, 5, 6, 7, 8, 1, 2
    valarray<int> v7 = v.cshift(-2);    // 7, 8, 1, 2, 3, 4, 5, 6
}
For valarrays, >> and << are bit shift operators, rather than element shift operators or I/O
operators (§22.4.4). Consequently, <<= and >>= can be used to shift bits within elements of an
integral type. For example:
void f(valarray<int> vi, valarray<double> vd)
{
    vi <<= 2;   // vi[i]<<=2 for all elements of vi
    vd <<= 2;   // error: shift is not defined for floating-point values
}
It is possible to change the size of a valarray. However, resize() is not an operation intended to
make valarray into a data structure that can grow dynamically the way a vector and a string can.
Instead, resize() is a re-initialize operation that replaces the existing contents of a valarray by a
set of default values. The old values are lost.
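A small sketch makes the re-initializing behavior concrete. The function names reinitialize and all_equal are hypothetical and exist only for this illustration; the sketch assumes nothing beyond the resize(n,val) member shown above.

```cpp
#include <cstddef>
#include <valarray>

// Returns true if every element of v equals val.
bool all_equal(const std::valarray<int>& v, int val)
{
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] != val) return false;
    return true;
}

// resize() re-initializes rather than grows: after the call, v has n elements,
// all equal to val, and none of the old values survive.
void reinitialize(std::valarray<int>& v, std::size_t n, int val)
{
    v.resize(n, val);
}
```

Starting from the elements 1, 2, 3, a call reinitialize(v,5,0) leaves five zeros; the original three values cannot be recovered from v.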
Often, a resized valarray is one that we created as an empty vector. Consider how we might
initialize a valarray from input:
void f()
{
    int n = 0;
    cin >> n;                               // read array size
    if (n<=0) error("bad array bound");

    valarray<double> v(n);                  // make an array of the right size
    int i = 0;
    while (i<n && cin>>v[i]) i++;           // fill array; count only successful reads
    if (i!=n) error("too few elements on input");
    // ...
}
If we want to handle the input in a separate function, we might do it like this:
void initialize_from_input(valarray<double>& v)
{
    int n = 0;
    cin >> n;                               // read array size
    if (n<=0) error("bad array bound");

    v.resize(n);                            // make v the right size
    int i = 0;
    while (i<n && cin>>v[i]) i++;           // fill array; count only successful reads
    if (i!=n) error("too few elements on input");
}

void g()
{
    valarray<double> v;             // make a default array
    initialize_from_input(v);       // give v the right size and elements
    // ...
}
This avoids copying large amounts of data.
If we want a valarray holding valuable data to grow dynamically, we must use a temporary:
void grow(valarray<int>& v, size_t n)
{
    if (n<=v.size()) return;

    valarray<int> tmp(n);                       // n default elements
    copy(&v[0],&v[v.size()],&tmp[0]);           // copy algorithm from §18.6.1
    v.resize(n);
    copy(&tmp[0],&tmp[v.size()],&v[0]);
}
This is not the intended way to use valarray. A valarray is intended to have a fixed size after
being given its initial value.
The elements of a valarray form a sequence; that is, v[0]..v[n-1] are contiguous in memory.
This implies that T* is a random-access iterator (§19.2.1) for valarray<T> so that standard
algorithms, such as copy(), can be used. However, it would be more in the spirit of valarray to
express the copy in terms of assignment and subarrays:
void grow2(valarray<int>& v, size_t n)
{
    if (n<=v.size()) return;

    valarray<int> tmp(n);           // n default elements
    slice s(0,v.size(),1);          // subarray of v.size() elements (see §22.4.5)

    tmp[s] = v;
    v.resize(n);
    v[s] = tmp;
}
If for some reason input data is organized so that you have to count the elements before knowing
the size of vector needed to hold them, it is usually best to read the input into a vector (§16.3.5) and
then copy the elements into a valarray.
22.4.4 Nonmember Operations [valarray.ops]
The usual binary operators and mathematical functions are provided:
template<class T> valarray<T> operator*(const valarray<T>&, const valarray<T>&);
template<class T> valarray<T> operator*(const valarray<T>&, const T&);
template<class T> valarray<T> operator*(const T&, const valarray<T>&);
// similarly: /, %, +, -, ^, &, |, <<, >>, &&, ||, ==, !=, <, >, <=, >=, atan2, and pow

template<class T> valarray<T> abs(const valarray<T>&);
// similarly: acos, asin, atan, cos, cosh, exp, log, log10, sin, sinh, sqrt, tan, and tanh
The binary operations are defined for valarrays and for combinations of a valarray and its scalar
type. For example:
void f(valarray<double>& v, valarray<double>& v2, double d)
{
    valarray<double> v3 = v*v2;     // v3[i] = v[i]*v2[i] for all i
    valarray<double> v4 = v*d;      // v4[i] = v[i]*d for all i
    valarray<double> v5 = d*v2;     // v5[i] = d*v2[i] for all i
    valarray<double> v6 = cos(v);   // v6[i] = cos(v[i]) for all i
}
These vector operations all apply their operations to each element of their operand(s) in the way
indicated by the * and cos() examples. Naturally, an operation can be used only if the
corresponding operation is defined for the template argument type. Otherwise, the compiler will issue
an error when trying to specialize the template (§13.5).
Where the result is a valarray, its length is the same as its valarray operand. If the lengths of
the two arrays are not the same, the result of a binary operator on two valarrays is undefined.
Curiously enough, no I/O operations are provided for valarray (§22.4.3); << and >> are shift
operations. However, I/O versions of >> and << for valarray are easily defined (§22.9[5]).
Note that these valarray operations return new valarrays rather than modifying their operands.
This can be expensive, but it doesn’t have to be when aggressive optimization techniques are
applied (e.g., see §22.4.7).
All of the operators and mathematical functions on valarrays can also be applied to
slice_arrays (§22.4.6), gslice_arrays (§22.4.8), mask_arrays (§22.4.9), indirect_arrays
(§22.4.10), and combinations of these types. However, an implementation is allowed to convert an
operand that is not a valarray to a valarray before performing a required operation.
22.4.5 Slices [num.slice]
A slice is an abstraction that allows us to manipulate a vector efficiently as a matrix of arbitrary
dimension. It is the key notion of Fortran vectors and of the BLAS (Basic Linear Algebra
Subprograms) library, which is the basis for much numeric computation. Basically, a slice is every nth
element of some part of a valarray:
class std::slice {
    // starting index, a length, and a stride
public:
    slice();
    slice(size_t start, size_t size, size_t stride);

    size_t start() const;       // index of first element
    size_t size() const;        // number of elements
    size_t stride() const;      // element n is at start()+n*stride()
};
A stride is the distance (in number of elements) between two elements of the slice. Thus, a slice
describes a sequence of integers. For example:
size_t slice_index(const slice& s, size_t i)    // map i to its corresponding index
{
    return s.start()+i*s.stride();
}

void print_seq(const slice& s)                  // print the elements of s
{
    for (int i = 0; i<s.size(); i++) cout << slice_index(s,i) << " ";
}

void f()
{
    print_seq(slice(0,3,4));    // row 0
    cout << ", ";
    print_seq(slice(1,3,4));    // row 1
    cout << ", ";
    print_seq(slice(0,4,1));    // column 0
    cout << ", ";
    print_seq(slice(4,4,1));    // column 1
}

prints 0 4 8 , 1 5 9 , 0 1 2 3 , 4 5 6 7.
In other words, a slice describes a mapping of non-negative integers into indices. The number
of elements (the size()) doesn’t affect the mapping (addressing) but simply allows us to find the
end of a sequence. This mapping can be used to simulate two-dimensional arrays within a
one-dimensional array (such as valarray) in an efficient, general, and reasonably convenient way.
Consider a 3-by-4 matrix the way we often think of it (§C.7):
    00  01  02
    10  11  12
    20  21  22
    30  31  32
Following Fortran conventions, we can lay it out in memory like this:
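    index:     0   1   2   3   4   5   6   7   8   9  10  11
    element:  00  10  20  30  01  11  21  31  02  12  22  32

    (row 0 occupies indices 0, 4, and 8; column 0 occupies indices 0, 1, 2, and 3)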
This is not the way arrays are laid out in C++ (see §C.7). However, we should be able to present a
concept with a clean and logical interface and then choose a representation to suit the constraints of
the problem. Here, I have chosen to use Fortran layout to ease the interaction with numeric
software that follows that convention. I have not, however, gone so far as to start indexing from 1
rather than 0; that is left as an exercise (§22.9[9]). Much numeric computation is done and will
remain done in a mixture of languages and using a variety of libraries. Often the ability to
manipulate data in a variety of formats determined by those libraries and language standards is essential.
Row x can be described by a slice(x,3,4). That is, the first element of row x is the xth
element of the vector, the next element of the row is the (x+4)th, etc., and there are 3 elements in
each row. In the figures, slice(0,3,4) describes the row 00, 01, and 02.
Column y can be described by slice(4*y,4,1). That is, the first element of column y is the
4*yth element of the vector, the next element of the column is the (4*y+1)th, etc., and there are 4
elements in each column. In the figures, slice(0,4,1) describes the column 00, 10, 20, and 30.
In addition to its use for simulating two-dimensional arrays, a slice can describe many other
sequences. It is a fairly general way of specifying very simple sequences. This notion is explored
further in §22.4.8.
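The row and column slices just described can be tried directly on a valarray holding the twelve elements in the Fortran order of the figure. In this sketch the function names make_example, row, and column are hypothetical, and element ‘‘xy’’ is encoded as the integer 10*x+y so that results are easy to recognize.

```cpp
#include <cstddef>
#include <valarray>

// The 12 elements of the figure, stored in Fortran order.
// Element "xy" is encoded as the number 10*x+y, so 21 stands for element 21.
std::valarray<int> make_example()
{
    const int data[] = { 0, 10, 20, 30,  1, 11, 21, 31,  2, 12, 22, 32 };
    return std::valarray<int>(data, 12);
}

// Row x: 3 elements, starting at index x, 4 apart.
std::valarray<int> row(const std::valarray<int>& m, std::size_t x)
{
    return m[std::slice(x, 3, 4)];
}

// Column y: 4 adjacent elements, starting at index 4*y.
std::valarray<int> column(const std::valarray<int>& m, std::size_t y)
{
    return m[std::slice(4*y, 4, 1)];
}
```

Subscripting a const valarray with a slice yields a copy of the selected elements, so row() and column() here return independent valarrays rather than views.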
One way of thinking of a slice is as an odd kind of iterator: a slice allows us to describe a
sequence of indices for a valarray. We could build a real iterator based on that:
template<class T> class Slice_iter {
    valarray<T>* v;
    slice s;
    size_t curr;    // index of current element

    T& ref(size_t i) const { return (*v)[s.start()+i*s.stride()]; }
public:
    Slice_iter(valarray<T>* vv, slice ss) :v(vv), s(ss), curr(0) { }

    Slice_iter end()
    {
        Slice_iter t = *this;
        t.curr = s.size();      // index of last-plus-one element
        return t;
    }

    Slice_iter& operator++() { curr++; return *this; }
    Slice_iter operator++(int) { Slice_iter t = *this; curr++; return t; }

    T& operator[](size_t i) { return ref(curr=i); }     // C style subscript
    T& operator()(size_t i) { return ref(curr=i); }     // Fortran-style subscript
    T& operator*() { return ref(curr); }                // current element

    // ...
};
Since a slice has a size, we could even provide range checking. Here, I have taken advantage of
slice::size() to provide an end() operation to provide an iterator for the one-past-the-end
element of the valarray.
Since a slice can describe either a row or a column, the Slice_iter allows us to traverse a
valarray by row or by column.
For Slice_iter to be useful, ==, !=, and < must be defined:
template<class T> bool operator==(const Slice_iter<T>& p, const Slice_iter<T>& q)
{
    return p.curr==q.curr && p.s.stride()==q.s.stride() && p.s.start()==q.s.start();
}

template<class T> bool operator!=(const Slice_iter<T>& p, const Slice_iter<T>& q)
{
    return !(p==q);
}

template<class T> bool operator<(const Slice_iter<T>& p, const Slice_iter<T>& q)
{
    return p.curr<q.curr && p.s.stride()==q.s.stride() && p.s.start()==q.s.start();
}
22.4.6 Slice_array [num.slicearray]
From a valarray and a slice, we can build something that looks and feels like a valarray, but
which is really simply a way of referring to the subset of the array described by the slice. Such a
slice_array is defined like this:
template <class T> class std::slice_array {
public:
    typedef T value_type;

    void operator=(const valarray<T>&);
    void operator=(const T& val);               // assign val to each element

    void operator*=(const valarray<T>& val);    // v[i]*=val[i] for each element
    // similarly: /=, %=, +=, -=, ^=, &=, |=, <<=, >>=

    ~slice_array();
private:
    slice_array();                              // prevent construction
    slice_array(const slice_array&);            // prevent copying
    slice_array& operator=(const slice_array&); // prevent copying

    valarray<T>* p;                             // implementation-defined representation
    slice s;
};
A user cannot directly create a slice_array. Instead, the user subscripts a valarray to create a
slice_array for a given slice. Once the slice_array is initialized, all references to it indirectly go to
the valarray for which it is created. For example, we can create something that represents every
second element of an array like this:
void f(valarray<double>& d)
{
    slice_array<double>& v_even = d[slice(0,d.size()/2,2)];
    slice_array<double>& v_odd = d[slice(1,d.size()/2,2)];

    v_odd *= 2;     // double every odd element of d
    v_even = 0;     // assign 0 to every even element of d
}
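A current compiler is likely to reject the example as written: subscripting returns a temporary slice_array, which cannot be bound to a non-const reference, and *= on a slice_array expects a valarray operand rather than a scalar. The following sketch, which is an adaptation and not the code above, gets the same effect by operating on the temporaries directly. The function name scale_odd_clear_even is hypothetical, and the element counts are computed so that the last even-indexed element is included when the size is odd.

```cpp
#include <cstddef>
#include <valarray>

// Double every odd-indexed element of d and set every even-indexed element to 0,
// operating on the temporaries returned by subscripting rather than naming them.
void scale_odd_clear_even(std::valarray<double>& d)
{
    const std::size_t n_even = (d.size() + 1) / 2;  // indices 0, 2, 4, ...
    const std::size_t n_odd  = d.size() / 2;        // indices 1, 3, 5, ...

    // slice_array's *= takes a valarray operand, so supply n_odd copies of 2.
    d[std::slice(1, n_odd, 2)] *= std::valarray<double>(2.0, n_odd);
    d[std::slice(0, n_even, 2)] = 0.0;              // scalar assignment to each element
}
```

Nothing is copied except the small valarray of multipliers; both statements write straight through to the elements of d.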
The ban on copying slice_arrays is necessary so as to allow optimizations that rely on absence of
aliases. It can be quite constraining. For example:
slice_array<double> row(valarray<double>& d, int i)
{
    slice_array<double> v = d[slice(0,2,d.size()/2)];   // error: attempt to copy
    return d[slice(i%2,i,d.size()/2)];                  // error: attempt to copy
}
Often copying a slice is a reasonable alternative to copying a slice_array.
Slices can be used to express a variety of subsets of an array. For example, we might use slices
to manipulate contiguous subarrays like this:
inline slice sub_array(size_t first, size_t count)  // [first:first+count[
{
    return slice(first,count,1);
}

void f(valarray<double>& v)
{
    size_t sz = v.size();
    if (sz<2) return;
    size_t n = sz/2;
    size_t n2 = sz-n;

    valarray<double> half1(n);
    valarray<double> half2(n2);

    half1 = v[sub_array(0,n)];      // copy of first half of v
    half2 = v[sub_array(n,n2)];     // copy of second half of v
    // ...
}
The standard library does not provide a matrix class. Instead, the intent is for valarray and slice to
provide the tools for building matrices optimized for a variety of needs. Consider how we might
implement a simple two-dimensional matrix using a valarray and slice_arrays:
class Matrix {
    valarray<double>* v;
    size_t d1, d2;
public:
    Matrix(size_t x, size_t y);     // note: no default constructor
    Matrix(const Matrix&);
    Matrix& operator=(const Matrix&);
    ~Matrix();

    size_t size() const { return d1*d2; }
    size_t dim1() const { return d1; }
    size_t dim2() const { return d2; }

    Slice_iter<double> row(size_t i);
    Cslice_iter<double> row(size_t i) const;

    Slice_iter<double> column(size_t i);
    Cslice_iter<double> column(size_t i) const;

    double& operator()(size_t x, size_t y);             // Fortran-style subscripts
    double operator()(size_t x, size_t y) const;

    Slice_iter<double> operator()(size_t i) { return row(i); }
    Cslice_iter<double> operator()(size_t i) const { return row(i); }

    Slice_iter<double> operator[](size_t i) { return row(i); }      // C-style subscript
    Cslice_iter<double> operator[](size_t i) const { return row(i); }

    Matrix& operator*=(double);

    valarray<double>& array() { return *v; }
};
The representation of a Matrix is a valarray. We impose dimensionality on that array through
slicing. When necessary, we can view that representation as having one, two, three, etc., dimensions in
the same way that we provide the default two-dimensional view through row() and column().
The Slice_iters are used to circumvent the ban on copying slice_arrays. I couldn’t return a
slice_array:
slice_array<double> row(size_t i) { return (*v)[slice(i,d1,d2)]; }
so I returned an iterator containing a pointer to the valarray and the slice itself instead of a
slice_array.
We need an additional class ‘‘iterator for slice of constants,’’ Cslice_iter to express the
distinction between a slice of a const Matrix and a slice of a non-const Matrix:
inline Slice_iter<double> Matrix::row(size_t i)
{
    return Slice_iter<double>(v,slice(i,d1,d2));
}

inline Cslice_iter<double> Matrix::row(size_t i) const
{
    return Cslice_iter<double>(v,slice(i,d1,d2));
}

inline Slice_iter<double> Matrix::column(size_t i)
{
    return Slice_iter<double>(v,slice(i*d2,d2,1));
}

inline Cslice_iter<double> Matrix::column(size_t i) const
{
    return Cslice_iter<double>(v,slice(i*d2,d2,1));
}
The definition of Cslice_iter is identical to that of Slice_iter, except that it returns const references
to elements of its slice.
The rest of the member operations are fairly trivial:
Matrix::Matrix(size_t x, size_t y)
{
    // check that x and y are sensible
    d1 = x;
    d2 = y;
    v = new valarray<double>(x*y);
}

double& Matrix::operator()(size_t x, size_t y)
{
    return row(x)[y];
}

double mul(const valarray<double>& v1, const valarray<double>& v2)
{
    double res = 0;
    for (int i = 0; i<v1.size(); i++) res+= v1[i]*v2[i];
    return res;
}

valarray<double> operator*(const Matrix& m, const valarray<double>& v)
{
    valarray<double> res(m.dim1());
    for (int i = 0; i<m.dim1(); i++) res[i] = mul(m.row(i),v);
    return res;
}

Matrix& Matrix::operator*=(double d)
{
    (*v) *= d;
    return *this;
}
I provided (i,j) to express Matrix subscripting because () is a single operator and because that
notation is the most familiar to many in the numeric community. The concept of a row provides
the more familiar (in the C and C++ communities) [i][j] notation:
void f(Matrix& m)
{
    m(1,2) = 5;         // Fortran-style subscripts
    m.row(1)(2) = 6;
    m.row(1)[2] = 7;
    m[1](2) = 8;        // undesirable mixed style (but it works)
    m[1][2] = 9;        // C++-style subscripts
}
The use of slice_arrays to express subscripting assumes a good optimizer.
Generalizing this to an n-dimensional matrix of arbitrary elements and with a reasonable set of
operations is left as an exercise (§22.9[7]).
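For readers who want something that compiles as it stands, here is a deliberately reduced sketch of the same idea. It is not the Matrix above: the class name Simple_matrix is hypothetical, the valarray is held by value, rows and columns are returned as copies instead of iterators, and the sketch adopts its own convention that d1 is the number of rows and d2 the number of columns, with element (x,y) stored at index x+y*d1.

```cpp
#include <cstddef>
#include <valarray>

// A small sketch: a d1-by-d2 matrix of doubles stored in one valarray in
// Fortran order, so element (x,y) lives at index x + y*d1.
class Simple_matrix {
    std::valarray<double> v;
    std::size_t d1, d2;
public:
    Simple_matrix(std::size_t x, std::size_t y) : v(0.0, x*y), d1(x), d2(y) { }

    std::size_t dim1() const { return d1; }
    std::size_t dim2() const { return d2; }

    double& operator()(std::size_t x, std::size_t y) { return v[x + y*d1]; }
    double  operator()(std::size_t x, std::size_t y) const { return v[x + y*d1]; }

    // Copies of row x and column y, extracted with slices.
    std::valarray<double> row(std::size_t x) const { return v[std::slice(x, d2, d1)]; }
    std::valarray<double> column(std::size_t y) const { return v[std::slice(y*d1, d1, 1)]; }

    Simple_matrix& operator*=(double d) { v *= d; return *this; }
};

// Matrix-vector product: res[x] is the dot product of row x with vec.
// Precondition (not checked): vec.size() == m.dim2().
std::valarray<double> operator*(const Simple_matrix& m, const std::valarray<double>& vec)
{
    std::valarray<double> res(m.dim1());
    for (std::size_t x = 0; x < m.dim1(); ++x) {
        std::valarray<double> prod = m.row(x) * vec;    // element-wise product
        res[x] = prod.sum();
    }
    return res;
}
```

Returning copies keeps the sketch short at the cost of the temporaries that the iterator-based design in the text is meant to avoid.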
Maybe your first idea for a two-dimensional vector was something like this:
class Matrix {
    valarray< valarray<double> > v;
public:
    // ...
};
This would also work (§22.9[10]). However, it is not easy to match the efficiency and
compatibility required by high-performance computations without dropping to the lower and more
conventional level represented by valarray plus slices.
22.4.7 Temporaries, Copying, and Loops [num.matrix]
If you build a vector or a matrix class, you will soon find that three related problems have to be
faced to satisfy performance-conscious users:
[1] The number of temporaries must be minimized.
[2] Copying of matrices must be minimized.
[3] Multiple loops over the same data in composite operations must be minimized.
These issues are not directly addressed by the standard library. However, I can outline a technique
that can be used to produce highly optimized implementations.
Consider U=M*V+W, where U, V, and W are vectors and M is a matrix. A naive
implementation introduces temporary vectors for M*V and M*V+W and copies the results of M*V and
M*V+W. A smart implementation calls a function mul_add_and_assign(&U,&M,&V,&W) that
introduces no temporaries, copies no vectors, and touches each element of the matrices the
minimum number of times.
This degree of optimization is rarely necessary for more than a few kinds of expressions, so a
simple solution to efficiency problems is to provide functions such as mul_add_and_assign() and
let the user call those where it matters. However, it is possible to design a Matrix so that such
optimizations are applied automatically for expressions of the right form. That is, we can treat
U=M*V+W as a use of a single operator with four operands. The basic technique was
demonstrated for ostream manipulators (§21.4.6.3). In general, it can be used to make a combination of n
binary operators act like an (n+1)-ary operator. Handling U=M*V+W requires the introduction of
two auxiliary classes. However, the technique can result in impressive speedups (say, 30 times) on
some systems by enabling more-powerful optimization techniques.
First, we define the result of multiplying a Matrix by a Vector:
struct MVmul {
    const Matrix& m;
    const Vector& v;

    MVmul(const Matrix& mm, const Vector &vv) :m(mm), v(vv) { }

    operator Vector();  // evaluate and return result
};

inline MVmul operator*(const Matrix& mm, const Vector& vv)
{
    return MVmul(mm,vv);
}
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
This ‘‘multiplication’’ does nothing except store references to its operands; the evaluation of M*V
is deferred. The object produced by * is closely related to what is called a closure in many
technical communities. Similarly, we can deal with what happens if we add a Vector:

    struct MVmulVadd {
        const Matrix& m;
        const Vector& v;
        const Vector& v2;

        MVmulVadd(const MVmul& mv, const Vector& vv) :m(mv.m), v(mv.v), v2(vv) { }

        operator Vector();    // evaluate and return result
    };

    inline MVmulVadd operator+(const MVmul& mv, const Vector& vv)
    {
        return MVmulVadd(mv,vv);
    }
This defers the evaluation of M*V+W. We now have to ensure that it all gets evaluated using a
good algorithm when it is assigned to a Vector:

    class Vector {
        // ...
    public:
        Vector(const MVmulVadd& m)    // initialize by result of m
        {
            // allocate elements, etc.
            mul_add_and_assign(this,&m.m,&m.v,&m.v2);
        }

        Vector& operator=(const MVmulVadd& m)    // assign the result of m to *this
        {
            mul_add_and_assign(this,&m.m,&m.v,&m.v2);
            return *this;
        }
        // ...
    };
Now U=M*V+W is automatically expanded to

    U.operator=(MVmulVadd(MVmul(M,V),W))

which because of inlining resolves to the desired simple call

    mul_add_and_assign(&U,&M,&V,&W)

Clearly, this eliminates the copying and the temporaries. In addition, we might write
mul_add_and_assign() in an optimized fashion. However, if we just wrote it in a fairly simple
and unoptimized fashion, it would still be in a form that offered great opportunities to an optimizer.
I introduced a new Vector (rather than using a valarray) because I needed to define assignment
(and assignment must be a member function; §11.2.2). However, valarray is a strong candidate for
the representation of that Vector.
The importance of this technique is that most really time-critical vector and matrix computations are done using a few relatively simple syntactic forms. Typically, there is no real gain in optimizing expressions of half-a-dozen operators this way; more conventional techniques (§11.6) suffice.
This technique is based on the idea of using compile-time analysis and closure objects to transfer evaluation of subexpressions into an object representing a composite operation. It can be applied
to a variety of problems with the common attribute that several pieces of information need to be
gathered into one function before evaluation can take place. I refer to the objects generated to defer
evaluation as composition closure objects, or simply compositors.
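The pieces of this section can be assembled into a small sketch. The Matrix and Vector below are hypothetical minimal stand-ins (row-major storage in a std::vector<double>), the conversion operators of the closure classes are left out, and fused_calls is a counter added only so that the effect can be checked. The mul_add_and_assign() shown is the simple, unoptimized kind; it assumes that the target does not alias V or W.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Matrix {                          // hypothetical minimal row-major matrix
    std::size_t rows, cols;
    std::vector<double> elem;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), elem(r*c) { }
    double  operator()(std::size_t i, std::size_t j) const { return elem[i*cols+j]; }
    double& operator()(std::size_t i, std::size_t j)       { return elem[i*cols+j]; }
};

struct MVmulVadd;                        // compositor for M*V+W, defined below

class Vector {                           // hypothetical minimal vector
    std::vector<double> elem;
public:
    explicit Vector(std::size_t n) : elem(n) { }
    Vector(const MVmulVadd& c);              // initialize by result of c
    Vector& operator=(const MVmulVadd& c);   // assign result of c to *this
    std::size_t size() const { return elem.size(); }
    void resize(std::size_t n) { elem.resize(n); }
    double  operator[](std::size_t i) const { return elem[i]; }
    double& operator[](std::size_t i)       { return elem[i]; }
};

int fused_calls = 0;                     // illustration only: counts fused evaluations

// One loop nest, no temporary vectors. Assumes u is distinct from v and w.
void mul_add_and_assign(Vector* u, const Matrix* m, const Vector* v, const Vector* w)
{
    assert(v->size() == m->cols && w->size() == m->rows);
    ++fused_calls;
    u->resize(m->rows);
    for (std::size_t i = 0; i < m->rows; ++i) {
        double s = (*w)[i];
        for (std::size_t j = 0; j < m->cols; ++j) s += (*m)(i,j) * (*v)[j];
        (*u)[i] = s;
    }
}

struct MVmul {                           // closure for M*V: stores references only
    const Matrix& m;
    const Vector& v;
    MVmul(const Matrix& mm, const Vector& vv) : m(mm), v(vv) { }
};

struct MVmulVadd {                       // closure for M*V+W: stores references only
    const Matrix& m;
    const Vector& v;
    const Vector& v2;
    MVmulVadd(const MVmul& mv, const Vector& vv) : m(mv.m), v(mv.v), v2(vv) { }
};

inline MVmul     operator*(const Matrix& mm, const Vector& vv) { return MVmul(mm,vv); }
inline MVmulVadd operator+(const MVmul& mv, const Vector& vv)  { return MVmulVadd(mv,vv); }

Vector::Vector(const MVmulVadd& c) { mul_add_and_assign(this,&c.m,&c.v,&c.v2); }

Vector& Vector::operator=(const MVmulVadd& c)
{
    mul_add_and_assign(this,&c.m,&c.v,&c.v2);
    return *this;
}
```

With these definitions, U = M*V + W performs exactly one call of mul_add_and_assign(); the two closure objects hold only references and are typically optimized away by inlining.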
22.4.8 Generalized Slices [num.gslice]
The Matrix example in §22.4.6 showed how two slices could be used to describe rows and
columns of a two-dimensional array. In general, a slice can describe any row or column of an
n-dimensional array (§22.9[7]). However, sometimes we need to extract a subarray that is not a row
or a column. For example, we might want to extract the 2-by-3 matrix from the top-left corner of a
3-by-4 matrix:
    00 01 02
    10 11 12
    20 21 22
    30 31 32
Unfortunately, these elements are not allocated in a way that can be described by a single slice:
    index:      0  1  2  3  4  5  6  7  8  9 10 11
    element:   00 10 20 30 01 11 21 31 02 12 22 32
    selected:   *  *  *     *  *  *
A gslice is a ‘‘generalized slice’’ that contains (almost) the information from n slices:

    class std::gslice {
        // instead of 1 stride and one size like slice, gslice holds n strides and n sizes
    public:
        gslice();
        gslice(size_t s, const valarray<size_t>& l, const valarray<size_t>& d);

        size_t start() const;               // index of first element
        valarray<size_t> size() const;      // number of elements in dimension
        valarray<size_t> stride() const;    // stride for index[0], index[1], ...
    };
The extra values allow a gslice to specify a mapping between n integers and an index to be used to
address elements of an array. For example, we can describe the layout of the 2-by-3 matrix by a
pair of (length,stride) pairs. As shown in §22.4.5, a length of 2 and a stride of 4 describes two
elements of a row of the 3-by-4 matrix, when Fortran layout is used. Similarly, a length of 3 and a
stride of 1 describes 3 elements of a column. Together, they describe every element of the 2-by-3
submatrix. To list the elements, we can write:
    size_t gslice_index(const gslice& s, size_t i, size_t j)
    {
        return s.start()+i*s.stride()[0]+j*s.stride()[1];
    }

    size_t len[] = { 2, 3 };    // (len[0],str[0]) describes a row
    size_t str[] = { 4, 1 };    // (len[1],str[1]) describes a column

    valarray<size_t> lengths(len,2);
    valarray<size_t> strides(str,2);

    void f()
    {
        gslice s(0,lengths,strides);

        for (int i = 0 ; i<s.size()[0]; i++) cout << gslice_index(s,i,0) << " ";    // row
        cout << ", ";
        for (int j = 0 ; j<s.size()[1]; j++) cout << gslice_index(s,0,j) << " ";    // column
    }

This prints 0 4 , 0 1 2.
In this way, a gslice with two (length,stride) pairs describes a subarray of a 2-dimensional
array, a gslice with three (length,stride) pairs describes a subarray of a 3-dimensional array, etc.
Using a gslice as the index of a valarray yields a gslice_array consisting of the elements
described by the gslice. For example:

    void f(valarray<float>& v)
    {
        gslice m(0,lengths,strides);
        v[m] = 0;    // assign 0 to v[0],v[1],v[2],v[4],v[5],v[6]
    }

The gslice_array offers the same set of members as slice_array. In particular, a gslice_array
cannot be constructed directly by the user and cannot be copied (§22.4.6). Instead, a gslice_array
is the result of using a gslice as the subscript of a valarray (§22.4.2).
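The mapping from a pair of (length,stride) pairs to indices can be checked against the standard library's own gslice. The following is a minimal sketch; gslice_indices() and zero_top_left() are hypothetical helper names, and the 12-element valarray stands in for the 3-by-4 matrix in Fortran layout.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>
#include <vector>

// Indices selected by a two-dimensional gslice, in the order in which it visits them.
std::vector<std::size_t> gslice_indices(const std::gslice& s)
{
    const std::valarray<std::size_t> len = s.size();
    const std::valarray<std::size_t> str = s.stride();
    assert(len.size() == 2 && str.size() == 2);

    std::vector<std::size_t> res;
    for (std::size_t i = 0; i < len[0]; ++i)
        for (std::size_t j = 0; j < len[1]; ++j)
            res.push_back(s.start() + i*str[0] + j*str[1]);
    return res;
}

// Assign 0 to the 2-by-3 top-left corner of a 12-element array filled with 1.
std::valarray<double> zero_top_left()
{
    std::valarray<double> v(1.0, 12);
    const std::size_t len[] = { 2, 3 };
    const std::size_t str[] = { 4, 1 };
    std::gslice g(0, std::valarray<std::size_t>(len, 2), std::valarray<std::size_t>(str, 2));
    v[g] = 0.0;    // assigns through the gslice_array
    return v;
}
```

For lengths {2,3} and strides {4,1}, gslice_indices() yields 0, 1, 2, 4, 5, 6, which are exactly the elements that zero_top_left() sets to 0.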
22.4.9 Masks [num.mask]
A mask_array provides yet another way of specifying a subset of a valarray and making the result
look like a valarray. In the context of valarrays, a mask is simply a valarray<bool>. When a
mask is used as a subscript for a valarray, a true bit indicates that the corresponding element of the
valarray is considered part of the result. This allows us to operate on a subset of a valarray even if
there is no simple pattern (such as a slice) that describes that subset. For example:
    void f(valarray<double>& v)
    {
        bool b[] = { true , false, false, true, false, true };
        valarray<bool> mask(b,6);              // elements 0, 3, and 5

        valarray<double> vv = cos(v[mask]);    // vv[0]==cos(v[0]), vv[1]==cos(v[3]),
                                               // vv[2]==cos(v[5])
    }
The mask_array offers the same set of members as slice_array. In particular, a mask_array
cannot be constructed directly by the user and cannot be copied (§22.4.6). Instead, a mask_array is
the result of using a valarray<bool> as the subscript of a valarray (§22.4.2). The number of
elements of a valarray used as a mask must not be greater than the number of elements of the
valarray for which it is used as a subscript.
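A mask can be tried out with a few lines of code. This is a sketch only: masked_cos() is a hypothetical helper, and it copies the selected elements into a valarray<double> before calling cos(), on the assumption that an implementation may not accept a mask_array directly as the argument of a mathematical function.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <valarray>

// Cosines of the elements of v for which the corresponding mask element is true.
std::valarray<double> masked_cos(const std::valarray<double>& v,
                                 const std::valarray<bool>& mask)
{
    assert(mask.size() <= v.size());
    // subscripting a const valarray with a mask copies the selected elements
    const std::valarray<double> selected = v[mask];
    return std::cos(selected);
}
```

Given six elements and the mask { true, false, false, true, false, true }, the result has three elements: the cosines of elements 0, 3, and 5.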
22.4.10 Indirect Arrays [num.indirect]
An indirect_array provides a way of arbitrarily subsetting and reordering a valarray. For example:

    void f(valarray<double>& v)
    {
        size_t i[] = { 3, 2, 1, 0 };            // first four elements in reverse order
        valarray<size_t> index(i,4);            // elements 3, 2, 1, 0 (in that order)

        valarray<double> vv = log(v[index]);    // vv[0]==log(v[3]), vv[1]==log(v[2]),
                                                // vv[2]==log(v[1]), vv[3]==log(v[0])
    }

If an index is specified twice, we have referred to an element of a valarray twice in the same
operation. That’s exactly the kind of aliasing that valarrays do not allow, so the behavior of an
indirect_array is undefined if an index is repeated.
The indirect_array offers the same set of members as slice_array. In particular, an
indirect_array cannot be constructed directly by the user and cannot be copied (§22.4.6). Instead,
an indirect_array is the result of using a valarray<size_t> as the subscript of a valarray
(§22.4.2). The number of elements of a valarray used as a subscript must not be greater than the
number of elements of the valarray for which it is used as a subscript.
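As a small illustration of reordering through an index valarray, consider the following sketch. The name reversed_prefix() is hypothetical; the function builds the index so that no element is referred to twice.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// The first n elements of v in reverse order.
std::valarray<double> reversed_prefix(const std::valarray<double>& v, std::size_t n)
{
    assert(n <= v.size());
    std::valarray<std::size_t> index(n);
    for (std::size_t k = 0; k < n; ++k) index[k] = n-1-k;    // n-1, ..., 1, 0: no repeats
    return v[index];    // subscripting a const valarray copies the selected elements
}
```

For the elements 10, 20, 30, 40, 50 and n==4, the result is 40, 30, 20, 10.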
22.5 Complex Arithmetic [num.complex]
The standard library provides a complex template along the lines of the complex class described in
§11.3. The library complex needs to be a template to serve the need for complex numbers based on
different scalar types. In particular, specializations are provided for complex using float, double,
and long double as its scalar type.
The complex template is defined in namespace std and presented in <complex>:
    template<class T> class std::complex {
        T re, im;
    public:
        typedef T value_type;

        complex(const T& r = T(), const T& i = T()) : re(r), im(i) { }
        // complex<X> is a different class, so use its public interface:
        template<class X> complex(const complex<X>& a) : re(a.real()), im(a.imag()) { }

        T real() const { return re; }
        T imag() const { return im; }

        complex<T>& operator=(const T& z);    // assign complex(z,0)
        template<class X> complex<T>& operator=(const complex<X>&);
        // similarly: +=, -=, *=, /=
    };
The representation and the inline functions are here for illustration. One could, just barely, imagine
a standard library complex that used a different representation. Note the use of member templates
to ensure initialization and assignment of any complex type with any other (§13.6.2).
Throughout this book, I have used complex as a class rather than as a template. This is feasible
because I assumed a bit of namespace magic to get the complex of double that I usually prefer:

    typedef std::complex<double> complex;
The usual unary and binary operators are defined:
    template<class T> complex<T> operator+(const complex<T>&, const complex<T>&);
    template<class T> complex<T> operator+(const complex<T>&, const T&);
    template<class T> complex<T> operator+(const T&, const complex<T>&);

    // similarly: -, *, /, ==, and !=

    template<class T> complex<T> operator+(const complex<T>&);
    template<class T> complex<T> operator-(const complex<T>&);
The coordinate functions are provided:
    template<class T> T real(const complex<T>&);
    template<class T> T imag(const complex<T>&);

    template<class T> complex<T> conj(const complex<T>&);

    // construct from polar coordinates (abs(),arg()):
    template<class T> complex<T> polar(const T& rho, const T& theta);

    template<class T> T abs(const complex<T>&);     // sometimes called rho
    template<class T> T arg(const complex<T>&);     // sometimes called theta

    template<class T> T norm(const complex<T>&);    // square of abs()
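The relation between the two coordinate systems can be illustrated by a short sketch. The function rotate() is a hypothetical example, not part of the library: it takes a complex number apart with abs() and arg() and puts it back together with polar().

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Rotate z by theta radians about the origin, going through polar coordinates.
std::complex<double> rotate(const std::complex<double>& z, double theta)
{
    assert(std::isfinite(theta));
    return std::polar(std::abs(z), std::arg(z) + theta);
}
```

Rotating (1,0) by a quarter turn gives (0,1) up to rounding, and for z==(3,4) we get abs(z)==5, norm(z)==25, and conj(z)==(3,-4), again up to rounding in the floating-point results.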
The usual set of mathematical functions is provided:
    template<class T> complex<T> sin(const complex<T>&);
    // similarly: sinh, sqrt, tan, tanh, cos, cosh, exp, log, and log10

    template<class T> complex<T> pow(const complex<T>&,int);
    template<class T> complex<T> pow(const complex<T>&, const T&);
    template<class T> complex<T> pow(const complex<T>&, const complex<T>&);
    template<class T> complex<T> pow(const T&, const complex<T>&);
Finally, stream I/O is provided:
    template<class T, class Ch, class Tr>
    basic_istream<Ch,Tr>& operator>>(basic_istream<Ch,Tr>&, complex<T>&);

    template<class T, class Ch, class Tr>
    basic_ostream<Ch,Tr>& operator<<(basic_ostream<Ch,Tr>&, const complex<T>&);
A complex is written out in the format (x,y) and can be read in the formats x, (x), and (x,y)
(§21.2.3, §21.3.5). The specializations complex<float>, complex<double>, and complex<long
double> are provided to restrict conversions (§13.6.2) and to provide opportunities for optimized
implementations. For example:

    template<> class complex<double> {
        double re, im;
    public:
        typedef double value_type;

        complex(double r = 0.0, double i = 0.0) : re(r), im(i) { }
        complex(const complex<float>& a) : re(a.real()), im(a.imag()) { }
        explicit complex(const complex<long double>& a) : re(a.real()), im(a.imag()) { }
        // ...
    };
Now a complex<float> can be quietly converted to a complex<double>, while a complex<long
double> can’t. Similar specializations ensure that a complex<float> and a complex<double> can
be quietly converted to a complex<long double> but that a complex<long double> cannot be
implicitly converted to a complex<double> or to a complex<float> and a complex<double> cannot
be implicitly converted to a complex<float>. For example:

    void f(complex<float> cf, complex<double> cd, complex<long double> cld)
    {
        complex<double> c = cf;       // fine
        c = cd;                       // fine
        c = cld;                      // error: possible truncation
        c = complex<double>(cld);     // ok: you asked for truncation

        cf = cld;                     // error: possible truncation
        cf = cd;                      // error: possible truncation
        cf = complex<float>(cld);     // ok: you asked for truncation
        cf = complex<float>(cd);      // ok: you asked for truncation
    }
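The conversion rules can be stated as compile-time checks. The sketch below assumes a compiler that provides <type_traits> and static_assert, which are later additions to the language; the typedef names and narrow() are hypothetical. It checks implicit convertibility, which is what the explicit constructors govern. Whether a plain assignment such as c = cld is rejected depends on the assignment operators that an implementation declares.

```cpp
#include <complex>
#include <type_traits>

typedef std::complex<float>       Cf;     // hypothetical shorthand names
typedef std::complex<double>      Cd;
typedef std::complex<long double> Cld;

// widening conversions are implicit
static_assert(std::is_convertible<Cf, Cd>::value,  "float to double");
static_assert(std::is_convertible<Cf, Cld>::value, "float to long double");
static_assert(std::is_convertible<Cd, Cld>::value, "double to long double");

// narrowing conversions are not implicit ...
static_assert(!std::is_convertible<Cld, Cd>::value, "long double to double");
static_assert(!std::is_convertible<Cld, Cf>::value, "long double to float");
static_assert(!std::is_convertible<Cd, Cf>::value,  "double to float");

// ... but they can be asked for explicitly
static_assert(std::is_constructible<Cd, Cld>::value, "explicit long double to double");
static_assert(std::is_constructible<Cf, Cd>::value,  "explicit double to float");

// Explicit request for truncation, as in complex<float>(cld) above.
Cf narrow(const Cld& z)
{
    return Cf(z);
}
```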
22.6 Generalized Numeric Algorithms [num.general]
In <numeric>, the standard library provides a few generalized numeric algorithms in the style of
the non-numeric algorithms from <algorithm> (Chapter 18):

____________________________________________________________________
Generalized Numeric Algorithms <numeric>
____________________________________________________________________
accumulate()            Accumulate results of operation on a sequence
inner_product()         Accumulate results of operation on two sequences
partial_sum()           Generate sequence by operation on a sequence
adjacent_difference()   Generate sequence by operation on a sequence
____________________________________________________________________

These algorithms generalize common operations such as computing a sum by letting them apply to
all kinds of sequences and by making the operation applied to elements on those sequences a
parameter. For each algorithm, the general version is supplemented by a version applying the most
common operator for that algorithm.
22.6.1 Accumulate [num.accumulate]
The accumulate() algorithm can be understood as the generalization of a sum of the elements of a
vector. The accumulate() algorithm is defined in namespace std and presented in <numeric>:

    template <class In, class T> T accumulate(In first, In last, T init)
    {
        while (first != last) init = init + *first++;    // plus
        return init;
    }

    template <class In, class T, class BinOp> T accumulate(In first, In last, T init, BinOp op)
    {
        while (first != last) init = op(init,*first++);    // general operation
        return init;
    }

The simple version of accumulate() adds elements of a sequence using their + operator. For
example:

    void f(vector<int>& price, list<float>& incr)
    {
        int i = accumulate(price.begin(),price.end(),0);    // accumulate in int
        double d = 0;
        d = accumulate(incr.begin(),incr.end(),d);          // accumulate in double
        // ...
    }

Note how the type of the initial value passed determines the return type.
Not all items that we want to add are available as elements of a sequence. Where they are not,
we can often supply an operation for accumulate() to call in order to produce the items to be
added. The most obvious kind of operation to pass is one that extracts a value from a data
structure. For example:
    struct Record {
        // ...
        int unit_price;
        int number_of_units;
    };

    long price(long val, const Record& r)
    {
        return val + r.unit_price * r.number_of_units;
    }

    void f(const vector<Record>& v)
    {
        // 0L: the initial value determines that the total is accumulated in a long
        cout << "Total value: " << accumulate(v.begin(),v.end(),0L,price) << '\n';
    }

Operations similar to accumulate are called reduce and reduction in some communities.
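The effect of the type of the initial value is easy to get wrong, so a small sketch may help. The function names are hypothetical. Passing the int 0 for a sequence of float truncates every partial sum, whereas 0.0 accumulates in double; similarly, 0L makes the Record total accumulate in long.

```cpp
#include <cassert>
#include <list>
#include <numeric>
#include <vector>

struct Record {
    int unit_price;
    int number_of_units;
};

long add_value(long val, const Record& r)
{
    assert(r.unit_price >= 0 && r.number_of_units >= 0);
    return val + long(r.unit_price) * r.number_of_units;
}

long total_value(const std::vector<Record>& v)
{
    return std::accumulate(v.begin(), v.end(), 0L, add_value);    // accumulate in long
}

int sum_in_int(const std::list<float>& incr)
{
    return std::accumulate(incr.begin(), incr.end(), 0);      // int: partial sums truncated
}

double sum_in_double(const std::list<float>& incr)
{
    return std::accumulate(incr.begin(), incr.end(), 0.0);    // accumulate in double
}
```

For four increments of 0.5, sum_in_int() yields 0 because each partial sum 0.5 is truncated to 0, and sum_in_double() yields 2.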
22.6.2 Inner_product [num.inner]
Accumulating from a sequence is very common, while accumulating from a pair of sequences is
not uncommon. The inner_product() algorithm is defined in namespace std and presented in
<numeric>:

    template <class In, class In2, class T>
    T inner_product(In first, In last, In2 first2, T init)
    {
        while (first != last) init = init + *first++ * *first2++;
        return init;
    }

    template <class In, class In2, class T, class BinOp, class BinOp2>
    T inner_product(In first, In last, In2 first2, T init, BinOp op, BinOp2 op2)
    {
        while (first != last) init = op(init,op2(*first++,*first2++));
        return init;
    }

As usual, only the beginning of the second input sequence is passed as an argument. The second
input sequence is assumed to be at least as long as the first.
The key operation in multiplying a Matrix by a valarray is an inner_product:

    valarray<double> operator*(const Matrix& m, const valarray<double>& v)
    {
        valarray<double> res(m.dim1());

        for (int i=0; i<m.dim1(); i++) {
            Slice_iter<double>& ri = m.row(i);
            // double(0): accumulate in double rather than int
            res[i] = inner_product(ri.begin(),ri.end(),&v[0],double(0));
        }
        return res;
    }

    valarray<double> operator*(const valarray<double>& v, const Matrix& m)
    {
        valarray<double> res(m.dim2());

        for (int j=0; j<m.dim2(); j++) {
            Slice_iter<double>& cj = m.column(j);
            // double(0): accumulate in double rather than int
            res[j] = inner_product(&v[0],&v[v.size()],cj.begin(),double(0));
        }
        return res;
    }

Some forms of inner_product are often referred to as ‘‘dot product.’’
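Both forms of inner_product() can be tried on plain vectors. This is a sketch with hypothetical function names: dot() is the conventional dot product, and matches() uses the general form with plus as the accumulating operation and equal_to as the combining operation to count the positions at which two sequences agree.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Dot product; the initial value 0.0 makes the accumulation happen in double.
double dot(const std::vector<double>& a, const std::vector<double>& b)
{
    assert(b.size() >= a.size());    // second sequence must be at least as long
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}

// Number of positions i with a[i]==b[i].
int matches(const std::vector<int>& a, const std::vector<int>& b)
{
    assert(b.size() >= a.size());
    return std::inner_product(a.begin(), a.end(), b.begin(), 0,
                              std::plus<int>(), std::equal_to<int>());
}
```

For example, the dot product of (1.5, 2, 3) and (2, 4, 6) is 3+8+18, that is, 29.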
22.6.3 Incremental Change [num.incremental]
The partial_sum() and adjacent_difference() algorithms are inverses of each other and deal
with the notion of incremental change. They are defined in namespace std and presented in
<numeric>:

    template <class In, class Out> Out adjacent_difference(In first, In last, Out res);

    template <class In, class Out, class BinOp>
    Out adjacent_difference(In first, In last, Out res, BinOp op);

Given a sequence a, b, c, d, etc., adjacent_difference() produces a, b-a, c-b, d-c, etc.
Consider a vector of temperature readings. We could transform it into a vector of temperature
changes like this:

    vector<double> temps;

    void f()
    {
        adjacent_difference(temps.begin(),temps.end(),temps.begin());
    }

For example, 17, 19, 20, 20, 17 turns into 17, 2, 1, 0, -3.
Conversely, partial_sum() allows us to compute the end result of a set of incremental
changes:
    template <class In, class Out, class BinOp>
    Out partial_sum(In first, In last, Out res, BinOp op)
    {
        if (first==last) return res;
        *res = *first;
        typename iterator_traits<In>::value_type val = *first;    // running total
        while (++first != last) {
            val = op(val,*first);
            *++res = val;
        }
        return ++res;
    }
    template <class In, class Out> Out partial_sum(In first, In last, Out res)
    {
        // plus for the element type of the input sequence; §18.4.3
        return partial_sum(first,last,res,plus<typename iterator_traits<In>::value_type>());
    }
Given a sequence a, b, c, d, etc., partial_sum() produces a, a+b, a+b+c, a+b+c+d, etc. For
example:

    void f()
    {
        partial_sum(temps.begin(),temps.end(),temps.begin());
    }

Note the way partial_sum() increments res before assigning a new value through it. This allows
res to be the same sequence as its input; adjacent_difference() behaves similarly. Thus,

    partial_sum(v.begin(),v.end(),v.begin());

turns the sequence a, b, c, d into a, a+b, a+b+c, a+b+c+d, and

    adjacent_difference(v.begin(),v.end(),v.begin());

turns it back into the original. In particular, partial_sum() turns 17, 2, 1, 0, -3 back into 17, 19,
20, 20, 17.
For people who think of temperature differences as a boring detail of meteorology or science
lab experiments, I point out that analyzing changes in stock prices involves exactly the same two
operations.
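The round trip between readings and changes can be written out as a sketch. The function names are hypothetical, and the functions write to a fresh vector rather than overwriting their input so that the original readings remain available for comparison.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Readings a, b, c, ... to changes a, b-a, c-b, ...
std::vector<double> changes(const std::vector<double>& temps)
{
    std::vector<double> res(temps.size());
    std::adjacent_difference(temps.begin(), temps.end(), res.begin());
    return res;
}

// Changes a, b, c, ... back to readings a, a+b, a+b+c, ...
std::vector<double> readings(const std::vector<double>& deltas)
{
    std::vector<double> res(deltas.size());
    std::partial_sum(deltas.begin(), deltas.end(), res.begin());
    return res;
}

// The two algorithms are inverses; exact for values such as small integers.
void check_round_trip(const std::vector<double>& temps)
{
    assert(readings(changes(temps)) == temps);
}
```

For the readings 17, 19, 20, 20, 17, changes() yields 17, 2, 1, 0, -3, and readings() applied to that result restores the original sequence.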
22.7 Random Numbers [num.random]
Random numbers are essential to many simulations and games. In <cstdlib> and <stdlib.h>, the
standard library provides a simple basis for the generation of random numbers:
    #define RAND_MAX implementation_defined    /* large positive integer */

    int rand();                    // pseudo-random number between 0 and RAND_MAX
    void srand(unsigned int i);    // seed random number generator by i

Producing a good random-number generator isn’t easy, and unfortunately not all systems deliver a
good rand(). In particular, the low-order bits of a random number are often suspect, so rand()%n
is not a good portable way of generating a random number between 0 and n-1. Often,
(double(rand())/RAND_MAX)*n gives acceptable results.
A call of srand() starts a new sequence of random numbers from the seed given as argument.
For debugging, it is often important that a sequence of random numbers from a given seed be
repeatable. However, we often want to start each real run with a new seed. In fact, to make games
unpredictable, it is often useful to pick a seed from the environment of a program. For such programs, some bits from a real-time clock often make a good seed.
If you must write your own random-number generator, be sure to test it carefully (§22.9[14]).
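The scaling idea mentioned above can be sketched as a function. The name rand_below() is hypothetical, and the sketch divides by RAND_MAX+1.0 rather than by RAND_MAX, a small variation that keeps the result strictly below n even when rand() returns RAND_MAX.

```cpp
#include <cassert>
#include <cstdlib>

// Pseudo-random integer in the interval [0:n[, based on the high-order part of rand().
int rand_below(int n)
{
    assert(n > 0);
    return int(std::rand() / (double(RAND_MAX) + 1.0) * n);
}
```

Seeding with a fixed value, for example std::srand(1), makes a run repeatable for debugging; the quality of the numbers is still only as good as the underlying rand().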
A random-number generator is often more useful if represented as a class. In that way,
random-number generators for different distributions are easily built:
    class Randint {    // uniform distribution in the interval [0,max]
        unsigned long randx;
    public:
        Randint(long s = 0) { randx=s; }
        void seed(long s) { randx=s; }

        // magic numbers chosen to use 31 bits of a 32-bit long:

        int abs(int x) { return x&0x7fffffff; }
        static double max() { return 2147483648.0; }    // note: a double
        int draw() { return randx = randx*1103515245 + 12345; }

        double fdraw(){ return abs(draw())/max(); }

        int operator()() { return abs(draw()); }
    };

    class Urand : public Randint {    // uniform distribution in the interval [0:n[
        int n;
    public:
        Urand(int nn) { n = nn; }

        int operator()() { int r = n*fdraw(); return (r==n) ? n-1 : r; }
    };

    class Erand : public Randint {    // exponential distribution random number generator
        int mean;
    public:
        Erand(int m) { mean=m; }

        int operator()() { return -mean * log( (max()-draw())/max() + .5); }
    };

Here is a simple test:
    int main()
    {
        Urand draw(10);
        map<int,int> bucket;
        for (int i = 0; i< 1000000; i++) bucket[draw()]++;
        for(int j = 0; j<10; j++) cout << bucket[j] << '\n';
    }

Unless each bucket has approximately the value 100,000 (1,000,000 draws spread over 10 buckets),
there is a bug somewhere.
These random-number generators are slightly edited versions of what I shipped with the very
first C++ library (actually, the first ‘‘C with Classes’’ library; §1.4).
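Randint relies on long being 32 bits wide. A variant that does not depend on the size of long can be sketched with a fixed-width type. This is an illustration under stated assumptions, not a replacement for the classes above: Lcg31 and histogram() are hypothetical names, std::uint32_t comes from <cstdint> (a later addition to the library), and the multiplier and increment are the ones used by Randint.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

class Lcg31 {    // 31-bit results from a 32-bit linear congruential generator
    std::uint32_t x;
public:
    explicit Lcg31(std::uint32_t seed = 0) : x(seed) { }

    // arithmetic wraps around modulo 2^32; the top bit is masked off
    std::uint32_t draw() { x = x*1103515245u + 12345u; return x & 0x7fffffffu; }

    double fdraw() { return draw()/2147483648.0; }    // in the interval [0:1[
};

// Count how many of 'draws' numbers fall into each of n equally sized buckets.
std::vector<int> histogram(int n, int draws, std::uint32_t seed)
{
    assert(n > 0 && draws >= 0);
    Lcg31 g(seed);
    std::vector<int> bucket(n);
    for (int i = 0; i < draws; ++i) ++bucket[int(n*g.fdraw())];
    return bucket;
}
```

As in the test above, histogram(10,1000000,0) should put roughly 100,000 draws in each bucket.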
22.8 Advice [num.advice]
[1] Numerical problems are often subtle. If you are not 100% certain about the mathematical
aspects of a numerical problem, either take expert advice or experiment; §22.1.
[2] Use numeric_limits to determine properties of built-in types; §22.2.
[3] Specialize numeric_limits for user-defined scalar types; §22.2.
[4] Use valarray for numeric computation when run-time efficiency is more important than
flexibility with respect to operations and element types; §22.4.
[5] Express operations on part of an array in terms of slices rather than loops; §22.4.6.
[6] Use compositors to gain efficiency through elimination of temporaries and better algorithms;
§22.4.7.
[7] Use std::complex for complex arithmetic; §22.5.
[8] You can convert old code that uses a complex class to use the std::complex template by
using a typedef; §22.5.
[9] Consider accumulate(), inner_product(), partial_sum(), and adjacent_difference()
before you write a loop to compute a value from a list; §22.6.
[10] Prefer a random-number class for a particular distribution over direct use of rand(); §22.7.
[11] Be careful that your random numbers are sufficiently random; §22.7.
22.9 Exercises [num.exercises]
1. (∗1.5) Write a function that behaves like apply() from §22.4.3, except that it is a nonmember
function and accepts function objects.
2. (∗1.5) Write a function that behaves like apply() from §22.4.3, except that it is a nonmember
function, accepts function objects, and modifies its valarray argument.
3. (∗2) Complete Slice_iter (§22.4.5). Take special care when defining the destructor.
4. (∗1.5) Rewrite the program from §17.4.1.3 using accumulate().
5. (∗2) Implement I/O operators << and >> for valarray. Implement a get_array() function that
creates a valarray of a size specified as part of the input itself.
6. (∗2.5) Define and implement a three-dimensional matrix with suitable operations.
7. (∗2.5) Define and implement an n-dimensional matrix with suitable operations.
8. (∗2.5) Implement a valarray-like class and implement + and * for it. Compare its performance
to the performance of your C++ implementation’s valarray. Hint: Include x=0.5(x+y)-z
among your test cases and try it with a variety of sizes for the vectors x, y, and z.
9. (∗3) Implement a Fortran-style array Fort_array where indices start from 1 rather than 0.
10. (∗3) Implement Matrix using a valarray member as the representation of the elements (rather
than a pointer or a reference to a valarray).
11. (∗2.5) Use compositors (§22.4.7) to implement efficient multidimensional subscripting using
the [] notation. For example, v1[x], v2[x][y], v2[x], v3[x][y][z], v3[x][y], and
v3[x] should all yield the appropriate elements and subarrays using a simple calculation of an
index.
12. (∗2) Generalize the idea from the program in §22.7 into a function that, given a generator as an
argument, prints a simple graphical representation of its distribution that can be used as a crude
visual check of the generator’s correctness.
13. (∗1) If n is an int, what is the distribution of (double(rand())/RAND_MAX)*n?
14. (∗2.5) Plot points in a square output area. The coordinate pairs for the points should be
generated by Urand(N), where N is the number of pixels on a side of the output area. What does
the output tell you about the distribution of numbers generated by Urand?
15. (∗2) Implement a Normal distribution generator, Nrand.
Part IV
Design Using C++
This part presents C++ and the techniques it supports in the larger picture of software
development. The focus is on design and the effective realization of design in terms of
language constructs.
Chapters
23 Development and Design
24 Design and Programming
25 Roles of Classes
‘‘... I am just now beginning to discover the difficulty of expressing one’s ideas on
paper. As long as it consists solely of description it is pretty easy; but where reasoning
comes into play, to make a proper connection, a clearness & a moderate fluency, is to
me, as I have said, a difficulty of which I had no idea ...’’
– Charles Darwin
23 Development and Design
There is no silver bullet.
– F. Brooks
Building software — aims and means — development process — development cycle —
design aims — design steps — finding classes — specifying operations — specifying
dependencies — specifying interfaces — reorganizing class hierarchies — models —
experimentation and analysis — testing — software maintenance — efficiency — management — reuse — scale — the importance of individuals — hybrid design — bibliography — advice.
23.1 Overview [design.overview]
This chapter is the first of three that present the production of software in increasing detail, starting
from a relatively high-level view of design and ending with C++ specific programming techniques
and concepts directly supporting such design. After the introduction and a brief discussion of the
aims and means of software development in §23.3, this chapter has two major parts:
§23.4 A view of the software development process
§23.5 Practical observations about the organization of software development
Chapter 24 discusses the relationship between design and programming language. Chapter 25 presents some roles that classes play in the organization of software from a design perspective. Taken
as a whole, the three chapters of Part 4 aim to bridge the gap between would-be language-independent design and programming that is myopically focussed on details. Both ends of this
spectrum have their place in a large project, but to avoid disaster and excessive cost, they must be
part of a continuum of concerns and techniques.
23.2 Introduction [design.intro]
Constructing any nontrivial piece of software is a complex and often daunting task. Even for an
individual programmer, the actual writing of program statements is only one part of the process.
Typically, issues of problem analysis, overall program design, documentation, testing, and maintenance, as well as the management of all of this, dwarf the task of writing and debugging individual
pieces of code. Naturally, one might simply label the totality of these activities ‘‘programming’’
and thereafter make a logically coherent claim that ‘‘I don’t design, I just program;’’ but whatever
one calls the activity, it is important sometimes to focus on its individual parts – just as it is important occasionally to consider the complete process. Neither the details nor the big picture must be
permanently lost in the rush to get a system shipped – although often enough that is exactly what
happens.
This chapter focusses on the parts of program development that do not involve writing and
debugging individual pieces of code. The discussion is less precise and less detailed than the discussions of individual language features and specific programming techniques presented elsewhere
in this book. This is necessary because there can be no cookbook method for creating good software. Detailed ‘‘how to’’ descriptions can exist for specific well-understood kinds of applications,
but not for more general application areas. There is no substitute for intelligence, experience, and
taste in programming. In consequence, this chapter offers only general advice, alternative
approaches, and cautionary observations.
The discussion is hampered by the abstract nature of software and the fact that techniques that
work for smaller projects (say, for one or two people writing 10,000 lines of code) do not necessarily scale to medium and large projects. For this reason, some discussions are formulated in terms
of analogies from less abstract engineering disciplines rather than in terms of code examples.
Please remember that ‘‘proof by analogy’’ is fraud, so analogy is used here for exposition only.
Discussions of design issues phrased in C++ specific terms and with examples can be found in
Chapter 24 and Chapter 25. The ideas expressed in this chapter are reflected in both the C++ language itself and in the presentation of the individual examples throughout this book.
Please also remember that because of the extraordinary diversity of application areas, people,
and program-development environments, you cannot expect every observation made here to apply
directly to your current problem. The observations are drawn from real-life projects and apply to a
wide variety of situations, but they cannot be considered universal. Look at these observations with
a healthy degree of skepticism.
C++ can be used simply as a better C. However, doing so leaves the most powerful techniques
and language features unused so that only a small fraction of the potential benefits of using C++
will be gained. This chapter focusses on approaches to design that enable effective use of C++’s
data abstraction and object-oriented programming facilities; such techniques are often called
object-oriented design.
A few major themes run through this chapter:
– The most important single aspect of software development is to be clear about what you are
trying to build.
– Successful software development is a long-term activity.
– The systems we construct tend to be at the limit of the complexity that we and our tools can
handle.
– There are no ‘‘cookbook’’ methods that can replace intelligence, experience, and good taste
in design and programming.
– Experimentation is essential for all nontrivial software development.
– Design and programming are iterative activities.
– The different phases of a software project, such as design, programming, and testing, cannot
be strictly separated.
– Programming and design cannot be considered without also considering the management of
these activities.
It is easy – and typically expensive – to underestimate any of these points. It is hard to transform
the abstract ideas they embody into practice. The need for experience should be noted. Like boat
building, bicycling, and programming, design is not a skill that can be mastered through theoretical
study alone.
Too often, we forget the human aspects of system building and consider the software development process as simply ‘‘a series of well-defined steps, each performing specific actions on inputs
according to predefined rules to produce the desired outputs.’’ The very language used conceals
the human involvement! Design and programming are human activities; forget that and all is lost.
This chapter is concerned with the design of systems that are ambitious relative to the experience and resources of the people building the system. It seems to be the nature of individuals and
organizations to attempt projects that are at the limits of their ability. Projects that don’t offer such
challenges don’t need a discussion of design. Such projects already have established frameworks
that need not be upset. Only when something ambitious is attempted is there a need to adopt new
and better tools and procedures. There is also a tendency to assign projects that ‘‘we know how to
do’’ to relative novices who don’t.
There is no ‘‘one right way’’ to design and build all systems. I would consider belief in ‘‘the
one right way’’ a childhood disease, if experienced programmers and designers didn’t succumb to it
so often. Please remember that just because a technique worked for you last year and for one project, it does not follow that it will work unmodified for someone else or for a different project. It is
most important to keep an open mind.
Clearly, much of the discussion here relates to larger-scale software development. Readers who
are not involved in such development can sit back and enjoy a look at the horrors they have
escaped. Alternatively, they can look for the subset of the discussion that relates to individual
work. There is no lower limit to the size of programs for which it is sensible to design before starting to code. There is, however, a lower limit for which any particular approach to design and documentation is appropriate. See §23.5.2 for a discussion of issues of scale.
The most fundamental problem in software development is complexity. There is only one basic
way of dealing with complexity: divide and conquer. A problem that can be separated into two
sub-problems that can be handled separately is more than half solved by that separation. This simple principle can be applied in an amazing variety of ways. In particular, the use of a module or a
class in the design of systems separates the program into two parts – the implementation and its
users – connected only by an (ideally) well-defined interface. This is the fundamental approach to
handling the inherent complexity of a program. Similarly, the process of designing a program can
be broken into distinct activities with (ideally) well-defined interactions between the people
involved. This is the basic approach to handling the inherent complexity of the development process and the people involved in it.
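The separation of a class into an implementation and its users can be sketched in a few lines. This is only an illustration under stated assumptions: Int_stack, Vector_stack, and sum_and_clear are hypothetical names chosen for this example, and an abstract class is just one of several ways to express such an interface.

```cpp
#include <cassert>
#include <vector>

// The interface (hypothetical): all that users are allowed to depend on.
class Int_stack {
public:
    virtual ~Int_stack() { }
    virtual void push(int x) = 0;
    virtual int pop() = 0;
    virtual bool empty() const = 0;
};

// One implementation (hypothetical). It can be replaced by another
// representation without touching any code written against Int_stack.
class Vector_stack : public Int_stack {
public:
    void push(int x) { elems.push_back(x); }
    int pop()
    {
        assert(!elems.empty());   // popping an empty stack is treated as a caller error here
        int x = elems.back();
        elems.pop_back();
        return x;
    }
    bool empty() const { return elems.empty(); }
private:
    std::vector<int> elems;       // representation: invisible to users
};

// A user: written against the interface only.
int sum_and_clear(Int_stack& s)
{
    int total = 0;
    while (!s.empty()) total += s.pop();
    return total;
}
```

Here sum_and_clear() compiles and works unchanged if Vector_stack is later replaced by, say, a list-based stack; the two halves of the program meet only at the three functions declared in Int_stack.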
In both cases, the selection of the parts and the specification of the interfaces between the parts
is where the most experience and taste is required. Such selection is not a simple mechanical process but typically requires insights that can be achieved only through a thorough understanding of a
system at suitable levels of abstraction (see §23.4.2, §24.3.1, and §25.3). A myopic view of a program or of a software development process often leads to seriously flawed systems. Note also that
for both people and programs, separation is easy. The hard part is to ensure effective
communication between parties on different sides of a barrier without destroying the barrier or stifling the communication necessary to achieve cooperation.
This chapter presents an approach to design, not a complete design method. A complete formal
design method is beyond the scope of this book. The approach presented here can be used with different degrees of formalization and as the basis for different formalizations. Similarly, this chapter
is not a literature survey and does not attempt to touch every topic relevant to software development or to present every viewpoint. Again, that is beyond the scope of this book. A literature survey can be found in [Booch,1994]. Note that terms are used here in fairly general and conventional
ways. Most ‘‘interesting’’ terms, such as design, prototype, and programmer, have several different and often conflicting definitions in the literature. Please be careful not to read something unintended into what is said here based on specialized or locally precise definitions of the terms.
23.3 Aims and Means [design.aims]
The purpose of professional programming is to deliver a product that satisfies its users. The primary means of doing so is to produce software with a clean internal structure and to grow a group
of designers and programmers skilled enough and motivated enough to respond quickly and effectively to change and opportunities.
Why? The internal structure of the program and the process by which it was created are ideally
of no concern to the end user. Stronger: if the end user has to worry about how the program was
written, then there is something wrong with that program. Given that, what is the importance of the
structure of a program and of the people who create the program?
A program needs a clean internal structure to ease:
– testing,
– porting,
– maintenance,
– extension,
– reorganization, and
– understanding.
The main point is that every successful major piece of software has an extended life in which it is
worked on by a succession of programmers and designers, ported to new hardware, adapted to
unanticipated uses, and repeatedly reorganized. Throughout the software’s life, new versions of it
must be produced with acceptable error rates and on time. Not planning for this is planning to fail.
Note that even though end users ideally don’t have to know the internal structure of a system,
they might actually want to. For example, a user might want to know the design of a system in
detail to be able to assess its likely reliability and potential for revision and extension. If the software in question is not a complete system – rather, a set of libraries for building other software –
then the users will want to know more ‘‘details’’ to be able to better use the libraries and also to
better benefit from them as sources of ideas.
A balance has to be struck between the lack of an overall design for a piece of software and
overemphasis on structure. The former leads to endless cutting of corners (‘‘we’ll just ship this one
and fix the problem in the next release’’). The latter leads to overelaborate designs in which essentials are lost in formalism and to situations where implementation gets delayed by program reorganizations (‘‘but this new structure is much better than the old one; people will want to wait for it’’).
It also often results in systems so demanding of resources that they are unaffordable to most potential users. Such balancing acts are the most difficult aspects of design and the area in which talent
and experience show themselves. The choices are hard for the individual designer or programmer
and harder for the larger projects in which more people with differing skills are involved.
A program needs to be produced and maintained by an organization that can do this despite
changes of personnel, direction, and management structure. A popular approach to coping with this
problem has been to try to reduce system development into a few relatively low-level tasks slotted
into a rigid framework. That is, the idea is to create a class of easy-to-train (cheap) and interchangeable low-level programmers (‘‘coders’’) and a class of somewhat less cheap but equally
interchangeable (and therefore equally dispensable) designers. The coders are not supposed to
make design decisions, while the designers are not supposed to concern themselves with the grubby
details of coding. This approach often fails. Where it does work, it produces overly large systems
with poor performance.
The problems with this approach are:
– insufficient communication between implementers and designers, which leads to missed
opportunities, delays, inefficiencies, and repeated problems due to failure to learn from
experience; and
– insufficient scope for initiative among implementers, which leads to lack of professional
growth, lack of initiative, sloppiness, and high turnover.
Basically, such a system lacks feedback mechanisms to allow people to benefit from other people’s
experience. It is wasteful of scarce human talent. Creating a framework within which people can
utilize diverse talents, develop new skills, contribute ideas, and enjoy themselves is not just the
only decent thing to do but also makes practical and economic sense.
On the other hand, a system cannot be built, documented, and maintained indefinitely without
some form of formal structure. Simply finding the best people and letting them attack the problem
as they think best is often a good start for a project requiring innovation. However, as the project
progresses, more scheduling, specialization, and formalized communication between the people
involved in the project become necessary. By ‘‘formal’’ I don’t mean a mathematical or mechanically verifiable notation (although that is nice, where available and applicable) but rather a set of
guidelines for notation, naming, documentation, testing, etc. Again, a balance and a sense of appropriateness is necessary. A too-rigid system can prevent growth and stifle innovation. In this case,
it is the manager’s talent and experience that is tested. For the individual, the equivalent dilemma
is to choose where to try to be clever and where to simply ‘‘do it by the book.’’
The recommendation is to plan not just for the next release of the current project but also for the
longer term. Looking only to the next release is planning to fail. We must develop organizations
and software development strategies aimed at producing and maintaining many releases of many
projects; that is, we must plan for a series of successes.
The purpose of ‘‘design’’ is to create a clean and relatively simple internal structure, sometimes
also called an architecture, for a program. In other words, we want to create a framework into
which the individual pieces of code can fit and thereby guide the writing of those individual pieces
of code.
A design is the end product of the design process (as far as there is an end product of an iterative process). It is the focus of the communication between the designer and the programmer and
between programmers. It is important to have a sense of proportion here. If I – as an individual
programmer – design a small program that I’m going to implement tomorrow, the appropriate level
of precision and detail may be some scribbles on the back of an envelope. At the other extreme, the
development of a system involving hundreds of designers and programmers may require books of
specifications carefully written using formal or semi-formal notations. Determining a suitable level
of detail, precision, and formality for a design is in itself a challenging technical and managerial
task.
In this and the following chapters, I assume that the design of a system is expressed as a set of
class declarations (typically with their private declarations omitted as spurious details) and their
relationships. This is a simplification. Many more issues enter into a specific design; for example,
concurrency, management of namespaces, uses of nonmember functions and data, parameterization
of classes and functions, organization of code to minimize recompilation, persistence, and use of
multiple computers. However, simplification is necessary for a discussion at this level of detail,
and classes are the proper focus of design in the context of C++. Some of these other issues are
mentioned in passing in this chapter, and some that directly affect the design of C++ programs are
discussed in Chapter 24 and Chapter 25. For a more detailed discussion and examples of a specific
object-oriented design method, see [Booch,1994].
I leave the distinction between analysis and design vague because a discussion of this issue is
beyond the scope of this book and is sensitive to variations in specific design methods. It is essential to pick an analysis method to match the design method and to pick a design method to match
the programming style and language used.
23.4 The Development Process [design.process]
Software development is an iterative and incremental process. Each stage of the process is revisited repeatedly during the development, and each visit refines the end products of that stage. In
general, the process has no beginning and no end. When designing and implementing a system,
you start from a base of other people’s designs, libraries, and application software. When you finish, you leave a body of design and code for others to refine, revise, extend, and port. Naturally, a
specific project can have a definite beginning and end, and it is important (though often surprisingly hard) to delimit the project cleanly and precisely in time and scope. However, pretending that
you are starting from a clean slate can cause serious problems. Pretending that the world ends at
the ‘‘final delivery’’ can cause equally serious problems for your successors (often yourself in a
different role).
One implication of this is that the following sections could be read in any order because the
aspects of design and implementation can be almost arbitrarily interleaved in a real project. That is,
‘‘design’’ is almost always redesign based on a previous design and some implementation
experience. Furthermore, the design is constrained by schedules, the skills of the people involved,
compatibility issues, etc. A major challenge to a designer/manager/programmer is to create order
in this process without stifling innovation and destroying the feedback loops that are necessary for
successful development.
The development process has three stages:
– Analysis: defining the scope of the problem to be solved
– Design: creating an overall structure for a system
– Implementation: writing and testing the code
Please remember the iterative nature of this process – it is significant that these stages are not numbered. Note that some major aspects of program development don’t appear as separate stages
because they ought to permeate the process:
– Experimentation
– Testing
– Analysis of the design and the implementation
– Documentation
– Management
Software ‘‘maintenance’’ is simply more iterations through this development process (§23.4.6).
It is most important that analysis, design, and implementation don’t become too detached from
each other and that the people involved share a culture so that they can communicate effectively. In
larger projects, this is all too often not the case. Ideally, individuals move from one stage to
another during a project; the best way to transfer subtle information is in a person’s head. Unfortunately, organizations often establish barriers against such transfers, for example, by giving designers higher status and/or higher pay than ‘‘mere programmers.’’ If it is not practical for people to
move around to learn and teach, they should at least be encouraged to talk regularly with individuals involved in ‘‘the other’’ stages of the development.
For small-to-medium projects, there often is no distinction made between analysis and design;
these two phases have been merged into one. Similarly, in small projects there often is no distinction made between design and programming. Naturally, this solves the communication problems.
It is important to apply an appropriate degree of formality for a given project and to maintain an
appropriate degree of separation between these phases (§23.5.2). There is no one right way to do
this.
The model of software development described here differs radically from the traditional
‘‘waterfall model.’’ In a waterfall model, the development progresses in an orderly and linear fashion through the development stages from analysis to testing. The waterfall model suffers from the
fundamental problem that information tends to flow only one way. When problems are found
‘‘downstream,’’ there is often strong methodological and organizational pressure to provide a local
fix; that is, there is pressure to solve the problem without affecting the previous stages of the process. This lack of feedback leads to deficient designs, and the local fixes lead to contorted implementations. In the inevitable cases in which information does flow back toward the source and
cause changes to the design, the result is a slow and cumbersome ripple effect through a system that
is geared to prevent the need for such change and therefore unwilling and slow to respond. The
argument for ‘‘no change’’ or for a ‘‘local fix’’ thus becomes an argument that one suborganization
cannot impose large amounts of work on other suborganizations ‘‘for its own convenience.’’ In
particular, by the time a major flaw is found there has often been so much paperwork generated
relating to the flawed decision that the effort involved in modifying the documentation dwarfs the
effort needed to fix the code. In this way, paperwork can become the major problem of software
development. Naturally, such problems can – and do – occur however one organizes the development of large systems. After all, some paperwork is essential. However, the pretense of a linear
model of development (a waterfall) greatly increases the likelihood that this problem will get out of
hand.
The problem with the waterfall model is insufficient feedback and the inability to respond to
change. The danger of the iterative approach outlined here is a temptation to substitute a series of
nonconverging changes for real thought and progress. Both problems are easier to diagnose than to
solve, and however one organizes a task, it is easy and tempting to mistake activity for progress.
Naturally, the emphasis on the different stages of the development process changes as a project progresses. Initially, the emphasis is on analysis and design, and programming issues receive less
attention. As time passes, resources shift towards design and programming and then become more
focussed on programming and testing. However, the key is never to focus on one part of the
analysis/design/implementation spectrum to the exclusion of all other concerns.
Remember that no amount of attention to detail, no application of proper management technique, no amount of advanced technology can help you if you don’t have a clear idea of what you
are trying to achieve. More projects fail for lack of well-defined and realistic goals than for any
other reason. Whatever you do and however you go about it, be clear about your aims, define tangible goals and milestones, and don’t look for technological solutions to sociological problems. On
the other hand, do use whatever appropriate technology is available – even if it involves an investment; people do work better with appropriate tools and in reasonable surroundings. Don’t get
fooled into believing that following this advice is easy.
23.4.1 The Development Cycle [design.cycle]
Developing a system should be an iterative activity. The main loop consists of repeated trips
through this sequence:
[0] Examine the problem.
[1] Create an overall design.
[2] Find standard components.
– Customize the components for this design.
[3] Create new standard components.
– Customize the components for this design.
[4] Assemble the design.
As an analogy, consider a car factory. For a project to start, there needs to be an overall design for
a new type of car. This first cut will be based on some kind of analysis and specifies the car in general terms related mostly to its intended use rather than to details of how to achieve desired properties. Deciding which properties are desirable – or even better, providing a relatively simple guide
to deciding which properties are desirable – is often the hardest part of a project. When done well,
this is typically the work of a single insightful individual and is often called a vision. It is quite
common for projects to lack such clear goals – and for projects to falter or fail for that reason.
Say we want to build a medium-sized car with four doors and a fairly powerful engine. The
first stage in the design is most definitely not to start designing the car (and all of its sub-components) from scratch. A software designer or programmer in a similar circumstance might
unwisely try exactly that.
The first stage is to consider which components are available from the factory’s own inventory
and from reliable suppliers. The components thus found need not be exactly right for the new car.
There will be ways of customizing the components. It might even be possible to affect the specification of the ‘‘next release’’ of such components to make them more suitable for our project. For
example, there may be an engine available with the right properties except for a slight deficiency in
delivered power. Either we or the engine supplier might be able to add a turbocharger to compensate without affecting the basic design. Note that making such a change ‘‘without affecting the
basic design’’ is unlikely unless the original design anticipated at least some form of customization.
Such customization will typically require cooperation between you and your engine supplier. A
software designer or programmer has similar options. In particular, polymorphic classes and templates can often be used effectively for customization. However, don’t expect to be able to effect
arbitrary extensions without foresight by or cooperation with the provider of such a class.
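As a minimal sketch of these two customization routes, assume a supplier's component that anticipated customization by making one operation virtual. The names Engine, Turbo_engine, and Car, and the numbers used, are hypothetical and chosen only to mirror the analogy:

```cpp
#include <cassert>

// Hypothetical supplier component. Because power() is virtual,
// the original design anticipated some form of customization.
class Engine {
public:
    virtual ~Engine() {}
    virtual int power() const { return 150; }   // arbitrary units
};

// Customization by derivation: add a "turbocharger" without
// modifying the supplier's class.
class Turbo_engine : public Engine {
public:
    int power() const { return Engine::power() + 40; }
};

// Customization by parameterization: the car is a template
// that takes its engine type as an argument.
template<class E>
class Car {
public:
    int top_speed() const { return engine.power() / 2 + 100; }
private:
    E engine;
};
```

Here Car<Engine> and Car<Turbo_engine> differ only in the template argument. Had power() not been virtual, code written against Engine would not see the change made by Turbo_engine, which is the point about foresight by the provider.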
Having run out of suitable standard components, the car designer doesn’t rush to design optimal
new components for the new car. That would simply be too expensive. Assume that there was no
suitable air conditioning unit available and that there was a suitable L-shaped space available in the
engine compartment. One solution would be to design an L-shaped air conditioning unit. However, the probability that this oddity could be used in other car types – even after extensive customization – is low. This implies that our car designer will not be able to share the cost of producing such units with the designers of other car types and that the useful life of the unit will be short.
It will thus be worthwhile to design a unit that has a wider appeal; that is, design a unit that has a
cleaner design and is more suited for customization than our hypothetical L-shaped oddity. This
will probably involve more work than the L-shaped unit and might even involve a modification of
the overall design of our car to accommodate the more general-purpose unit. Because the new unit
was designed to be more widely useful than our L-shaped wonder, it will presumably need a bit of
customization to fit our revised needs perfectly. Again, the software designer or programmer has a
similar option. That is, rather than writing project-specific code the designer can design a new
component of a generality that makes it a good candidate to become a standard in some universe.
Finally, when we have run out of potential standard components we assemble the ‘‘final’’
design. We use as few specially designed widgets as possible because next year we will have to go
through a variant of this exercise again for the next new model and the specially designed widgets
will be the ones we most likely will have to redo or throw away. Sadly, the experience with traditionally designed software is that few parts of a system can even be recognized as discrete components, and few of those are of use outside their original project.
I’m not saying that all car designers are as rational as I have outlined in this analogy or that all
software designers make the mistakes mentioned. On the contrary, this model can be made to work
with software. In particular, this chapter and the next present techniques for making it work with
C++. I do claim, however, that the intangible nature of software makes those mistakes harder to
avoid (§24.3.1, §24.3.4), and in §23.5.3 I argue that corporate culture often discourages people
from using the model outlined here.
Note that this model of development really works well only when you consider the longer term.
If your horizon extends only to the next release, the creation and maintenance of standard components makes no sense. It will simply be seen as spurious overhead. This model is suggested for an
organization with a life that spans several projects and of a size that makes worthwhile the necessary extra investment in tools (for design, programming, and project management) and education
(of designers, programmers, and managers). It is a sketch of a kind of software factory. Curiously
enough, it differs only in scale from the practices of the best individual programmers, who over the
years build up a stock of techniques, designs, tools, and libraries to enhance their personal effectiveness. It seems, in fact, that most organizations have failed to take advantage of the best personal practices due to both a lack of vision and an inability to manage such practices on more than a
very small scale.
Note that it is unreasonable to expect ‘‘standard components’’ to be universally standard. There
will exist a few international standard libraries. However, most components will be standard (only)
within a country, an industry, a company, a product line, a department, an application area, etc.
The world is simply too large for universal standards to be a realistic or indeed a desirable aim
for all components and tools.
Aiming for universality in an initial design is a prescription for a project that will never be completed. One reason that the development cycle is a cycle is that it is essential to have a working
system from which to gain experience (§23.4.3.6).
23.4.2 Design Aims [design.design]
What are the overall aims of a design? Simplicity is one, of course, but simplicity according to
what criteria? We assume that a design will have to evolve. That is, the system will have to be
extended, ported, tuned, and generally changed in a number of ways that cannot all be foreseen.
Consequently, we must aim for a design and an implemented system that is simple under the constraint that it will be changed in many ways. In fact, it is realistic to assume that the requirements
for the system will change several times between the time of the initial design and the first release
of the system.
The implication is that the system must be designed to remain as simple as possible under a
sequence of changes. We must design for change; that is, we must aim for
– flexibility,
– extensibility, and
– portability.
This is best done by trying to encapsulate the areas of a system that are likely to change and by providing non-intrusive ways for a later designer/programmer to modify the behavior of the code.
This is done by identifying the key concepts of an application and giving each class the exclusive
responsibility for the maintenance of all information relating to a single concept. In that case, a
change can be effected by a modification of that class only. Ideally, a change to a single concept
can be done by deriving a new class (§23.4.3.5) or by passing a different argument to a template.
Naturally, this ideal is much easier to state than to follow.
Consider an example. In a simulation involving meteorological phenomena, we want to display
a rain cloud. How do we do that? We cannot have a general routine to display the cloud because
what a cloud looks like depends on the internal state of the cloud, and that state should be the sole
responsibility of the cloud.
A first solution to this problem is to let the cloud display itself. This style of solution is acceptable in many limited contexts. However, it is not general because there are many ways to view a
cloud: for example, as a detailed picture, as a rough outline, or as an icon on a map. In other words,
what a cloud looks like depends on both the cloud and its environment.
A second solution to the problem is to make the cloud aware of its environment and then let the
cloud display itself. This solution is acceptable in even more contexts. However, it is still not a
general solution. Having the cloud know about such details of its environment violates the dictum
that a class is responsible for one thing only and that every ‘‘thing’’ is the responsibility of some
class. It may not be possible to come up with a coherent notion of ‘‘the cloud’s environment’’
because in general what a cloud looks like depends on both the cloud and the viewer. Even in real
life, what the cloud looks like to me depends rather strongly on how I look at it; for example, with
my naked eyes, through a polarizing filter, or with a weather radar. In addition to the viewer and
the cloud, some ‘‘general background’’ such as the relative position of the sun might have to be
taken into account. Adding other objects, such as other clouds and airplanes, further complicates
the matter. To make life really hard for the designer, add the possibility of having several simultaneous viewers.
A third solution is to have the cloud – and other objects such as airplanes and the sun –
describe themselves to a viewer. This solution has sufficient generality to serve most purposes†. It
may, however, impose a significant cost in both complexity and run-time overhead. For example,
how do we arrange for a viewer to understand the descriptions produced by clouds and other
objects?
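One possible answer, sketched here under the assumption that objects and viewers agree on a small, fixed description format, is shown below. The names Description, Scene_object, Viewer, Icon_viewer, and Outline_viewer are hypothetical, as is the rule relating water content to size:

```cpp
#include <cassert>
#include <string>

// Hypothetical description format shared by objects and viewers.
struct Description {
    std::string kind;   // for example, "cloud"
    double x, y;        // position
    double size;        // rough extent
};

// An object describes itself; it does not draw itself.
class Scene_object {
public:
    virtual ~Scene_object() {}
    virtual Description describe() const = 0;
};

class Cloud : public Scene_object {
public:
    Cloud(double x, double y, double water)
        : px(x), py(y), water_content(water) {}
    Description describe() const
    {
        Description d;
        d.kind = "cloud";
        d.x = px;
        d.y = py;
        d.size = water_content * 2;   // only the cloud interprets its own state
        return d;
    }
private:
    double px, py, water_content;
};

// Each viewer decides how a description is presented.
class Viewer {
public:
    virtual ~Viewer() {}
    virtual std::string render(const Description& d) const = 0;
};

class Icon_viewer : public Viewer {
public:
    std::string render(const Description& d) const { return "[" + d.kind + "]"; }
};

class Outline_viewer : public Viewer {
public:
    std::string render(const Description& d) const
    {
        return d.kind + (d.size > 5 ? " (large)" : " (small)");
    }
};
```

The cost mentioned in the text shows up even in this small sketch: every object must be expressed in terms of Description, and a viewer can show nothing that the format does not carry.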
Rain clouds are not particularly common in programs (but for an example, see §15.2), but
objects that need to be involved in a variety of I/O operations are. This makes the cloud example
relevant to programs in general and to the design of libraries in particular. C++ code for a logically
similar example can be found in the manipulators used for formatted output in the stream I/O system (§21.4.6, §21.4.6.3). Note that the third solution is not ‘‘the right solution;’’ it is simply the
most general solution. A designer must balance the various needs of a system to choose the level
of generality and abstraction that is appropriate for a given problem in a given system. As a rule of
thumb, the right level of abstraction for a long-lived program is the most general you can comprehend and afford, not the absolutely most general. Generalization beyond the scope of a given project and beyond the experience of the people involved can be harmful; that is, it can cause delays,
unacceptable inefficiencies, unmanageable designs, and plain failure.
To make such techniques manageable and economical, we must also design and manage for
reuse (§23.5.1) and not completely forget about efficiency (§23.4.7).
23.4.3 Design Steps [design.steps]
Consider designing a single class. Typically, this is not a good idea. Concepts do not exist in isolation; rather, a concept is defined in the context of other concepts. Similarly, a class does not exist
in isolation but is defined together with logically related classes. Typically, one works on a set of
related classes. Such a set is often called a class library or a component. Sometimes all classes in a
component constitute a single class hierarchy, sometimes they are members of a single namespace,
and sometimes they are a more ad-hoc collection of declarations (§24.4).
__________________
† Even this model is unlikely to be sufficient for extreme cases like high-quality graphics based on ray tracing. I suspect that
achieving such detail requires the designer to move to a different level of abstraction.
The set of classes in a component is united by some logical criteria, often by a common style
and often by a reliance on common services. A component is thus the unit of design, documentation, ownership, and often reuse. This does not mean that if you use one class from a component,
you must understand and use all the classes from the component or maybe get the code for every
class in the component loaded into your program. On the contrary, we typically strive to ensure
that a class can be used with only minimal overhead in machine resources and human effort. However, to use any part of a component we need to understand the logical criteria that define the component (hopefully made abundantly clear in the documentation), the conventions and style embodied in the design of the component and its documentation, and the common services (if any).
So consider how one might approach the design of a component. Because this is often a challenging task, it is worthwhile breaking it into steps to help focus on the various subtasks in a logical
and complete way. As usual, there is no one right way of doing this. However, here is a series of
steps that have worked for some people:
[1] Find the concepts/classes and their most fundamental relationships.
[2] Refine the classes by specifying the sets of operations on them.
– Classify these operations. In particular, consider the needs for construction, copying,
and destruction.
– Consider minimalism, completeness, and convenience.
[3] Refine the classes by specifying their dependencies.
– Consider parameterization, inheritance, and use dependencies.
[4] Specify the interfaces.
– Separate functions into public and protected operations.
– Specify the exact type of the operations on the classes.
Note that these are steps in an iterative process. Typically, several loops through this sequence are
needed to produce a design one can comfortably use for an initial implementation or a reimplementation. One advantage of well-done analysis and data abstraction as described here is that
it becomes relatively easy to reshuffle class relationships even after code has been written. This is
never a trivial task, though.
After that, we implement the classes and go back and review the design based on what was
learned from implementing them. In the following subsections, I discuss these steps one by one.
23.4.3.1 Step 1: Find Classes [design.find]
Find the concepts/classes and their most fundamental relationships. The key to a good design is to
model some aspect of ‘‘reality’’ directly – that is, capture the concepts of an application as classes,
represent the relationships between classes in well-defined ways such as inheritance, and do this
repeatedly at different levels of abstraction. But how do we go about finding those concepts?
What is a practical approach to deciding which classes we need?
The best place to start looking is in the application itself, as opposed to looking in the computer
scientist’s bag of abstractions and concepts. Listen to someone who will become an expert user of
the system once it has been built and to someone who is a somewhat dissatisfied user of the system
being replaced. Note the vocabulary they use.
It is often said that the nouns will correspond to the classes and objects needed in the program;
often that is indeed the case. However, that is by no means the end of the story. Verbs may denote
operations on objects, traditional (global) functions that produce new values based on the value of
their arguments, or even classes. As examples of the latter, note the function objects (§18.4) and
manipulators (§21.4.6). Verbs such as ‘‘iterate’’ or ‘‘commit’’ can be represented by an iterator
object and an object representing a database commit operation, respectively. Even adjectives can
often usefully be represented by classes. Consider the adjectives ‘‘storable,’’ ‘‘concurrent,’’ ‘‘registered,’’ and ‘‘bounded.’’ These may be classes intended to allow a designer or programmer to
pick and choose among desirable attributes for later-designed classes by specifying virtual base
classes (§15.2.4).
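To illustrate, here is a small sketch of a verb and an adjective represented as classes. Sum, Storable, and Reading are hypothetical names; std::for_each is the standard algorithm, which returns the function object it was given after applying it to each element:

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// A verb as a class: "sum" represented by a function object.
class Sum {
public:
    Sum() : total(0) {}
    void operator()(int x) { total += x; }
    int result() const { return total; }
private:
    int total;
};

// An adjective as a class: "storable" is an attribute that a
// later-designed class picks up by naming it as a virtual base.
class Storable {
public:
    virtual ~Storable() {}
    virtual std::string stored_form() const = 0;
};

class Reading : public virtual Storable {
public:
    explicit Reading(int v) : value(v) {}
    std::string stored_form() const
    {
        std::ostringstream os;
        os << value;
        return os.str();
    }
private:
    int value;
};
```

A call such as std::for_each(v.begin(), v.end(), Sum()) yields a Sum whose result() is the total of the elements of v. Making Storable a virtual base means that a class combining several such attributes still has a single Storable subobject.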
Not all classes correspond to application-level concepts. For example, some represent system
resources and implementation-level abstractions (§24.3.1). It is also important to avoid modeling
an old system too closely. For example, we don’t want a system that is centered around a database
to faithfully replicate aspects of a manual system that exist only to allow individuals to manage the
physical shuffling of pieces of paper.
Inheritance is used to represent commonality among concepts. Most important, it is used to
represent hierarchical organization based on the behavior of classes representing individual concepts
(§1.7, §12.2.6, §24.3.2). This is sometimes referred to as classification or even taxonomy. Commonality must be actively sought. Generalization and classification are high-level activities that
require insight to give useful and lasting results. A common base should represent a more general
concept rather than simply a similar concept that happens to require less data to represent.
Note that the classification should be of aspects of the concepts that we model in our system,
rather than aspects that may be valid in other areas. For example, in mathematics a circle is a kind
of an ellipse, but in most programs a circle should not be derived from an ellipse or an ellipse
derived from a circle. The often-heard arguments ‘‘because that’s the way it is in mathematics’’
and ‘‘because the representation of a circle is a subset of that of an ellipse’’ are not conclusive and
most often wrong. This is because for most programs, the key property of a circle is that it has a
center and a fixed distance to its perimeter. All behavior of a circle (all operations) must maintain
this property (invariant; §24.3.7.1). On the other hand, an ellipse is characterized by two focal
points that in many programs can be changed independently of each other. If those focal points
coincide, the ellipse looks like a circle, but it is not a circle because its operations do not preserve
the circle invariant. In most systems, this difference will be reflected by having a circle and an
ellipse provide sets of operations that are not subsets of each other.
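A sketch of what such unrelated classes might look like follows. The member names and the representation of the ellipse (two foci plus the sum of the distances from the foci to the perimeter) are assumptions made for illustration:

```cpp
#include <cassert>

struct Point {
    double x, y;
    Point(double xx, double yy) : x(xx), y(yy) {}
};

// Invariant: one center and one positive radius.
// Every operation preserves it.
class Circle {
public:
    Circle(Point c, double r) : center_(c), radius_(r) { assert(r > 0); }
    void move_to(Point c) { center_ = c; }
    void set_radius(double r) { assert(r > 0); radius_ = r; }
    Point center() const { return center_; }
    double radius() const { return radius_; }
private:
    Point center_;
    double radius_;
};

// Two focal points that can be changed independently.
// There is no set_radius(), and Circle has no set_focus1().
class Ellipse {
public:
    Ellipse(Point f1, Point f2, double sum) : f1_(f1), f2_(f2), sum_(sum) {}
    void set_focus1(Point p) { f1_ = p; }
    void set_focus2(Point p) { f2_ = p; }
    double distance_sum() const { return sum_; }
    bool looks_like_circle() const
    {
        return f1_.x == f2_.x && f1_.y == f2_.y;
    }
private:
    Point f1_, f2_;
    double sum_;   // sum of the distances from the foci to a perimeter point
};
```

An Ellipse whose foci coincide answers true to looks_like_circle(), yet a single call of set_focus2() ends that. Neither class could be substituted for the other without breaking an operation or an invariant, so neither is derived from the other.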
We don’t just think up a set of classes and relationships between classes and use them for the
final system. Instead, we create an initial set of classes and relationships. These are then refined
repeatedly (§23.4.3.5) to reach a set of class relationships that are sufficiently general, flexible, and
stable to be of real help in the further evolution of a system.
The best tool for finding initial key concepts/classes is a blackboard. The best method for their
initial refinement is discussions with experts in the application domain and a couple of friends.
Discussion is necessary to develop a viable initial vocabulary and conceptual framework. Few people can do that alone. One way to evolve a set of useful classes from an initial set of candidates is
to simulate a system, with designers taking the roles of classes. This brings the inevitable absurdities of the initial ideas out into the open, stimulates discussion of alternatives, and creates a shared
understanding of the evolving design. This activity can be supported by and documented by notes
on index cards. Such cards are usually called CRC cards (‘‘Class, Responsibility, and Collaborators’’; [Wirfs-Brock,1990]) because of the information they record.
A use case is a description of a particular use of a system. Here is a simple example of a use
case for a telephony system: take the phone off hook, dial a number, the phone at the other end
rings, the phone at the other end is taken off hook. Developing a set of such use cases can be of
immense value at all stages of development. Initially, finding use cases can help us understand
what we are trying to build. During design, they can be used to trace a path through the system (for
example, using CRC cards) to check that the relatively static description of the system in terms of
classes and objects actually makes sense from a user’s point of view. During programming and
testing, the use cases become a source of test cases. In this way, use cases provide an orthogonal
way of viewing the system and act as a reality check.
Use cases view the system as a (dynamic) working entity. They can therefore trap a designer
into a functional view of a system and distract from the essential task of finding useful concepts
that can be mapped into classes. Especially in the hands of someone with a background in structured analysis and weak experience with object-oriented programming/design, an emphasis on use
cases can lead to a functional decomposition. A set of use cases is not a design. A focus on the use
of the system must be matched by a complementary focus on the system’s structure.
A team can become trapped into an inherently futile attempt to find and describe all of the use
cases. This is a costly mistake. Much as when we look for candidate classes for a system, there
comes a time when we must say, ‘‘Enough is enough. The time has come to try out what we have
and see what happens.’’ Only by using a plausible set of classes and a plausible set of use cases in
further development can we obtain the feedback that is essential to obtaining a good system. It is
always hard to know when to stop a useful activity. It is especially hard to know when to stop
when we know that we must return later to complete the task.
How many cases are enough? In general it is impossible to answer that question. However, in
a given project, there comes a time when it is clear that most of the ordinary functioning of the system has been covered and a fair bit of the more unusual and error handling issues have been
touched upon. Then it is time to get on with the next round of design and programming.
When you are trying to estimate the coverage of the system by a set of use cases, it can be useful to separate the cases into primary and secondary use cases. The primary ones describe the
system’s most common and ‘‘normal’’ actions, and the secondary describe the more unusual and
error-handling scenarios. An example of a secondary use case would be a variant of the ‘‘make a
phone call’’ case, in which the called phone is off hook, dialing its caller. It is often said that when
80% of the primary use cases and some of the secondary ones have been covered, it is time to proceed, but since we cannot know what constitutes ‘‘all of the cases’’ in advance, this is simply a rule
of thumb. Experience and good sense matter here.
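As a rough sketch of how use cases become test cases, the telephony examples above can be written against a deliberately tiny model of a phone. The class Phone, its states, and the two test functions are hypothetical; a real telephony system would of course be far richer:

```cpp
#include <cassert>

// A toy model, just large enough to trace the two use cases.
class Phone {
public:
    enum State { on_hook, off_hook, ringing, connected };

    Phone() : st(on_hook), peer(0) {}
    State state() const { return st; }

    // Take the phone off hook; answering a ringing phone connects the call.
    void lift()
    {
        if (st == ringing) {
            st = connected;
            if (peer) peer->st = connected;
        }
        else if (st == on_hook)
            st = off_hook;
    }

    // Dial another phone; fails unless we are off hook and it is on hook.
    bool dial(Phone& other)
    {
        if (st != off_hook || other.st != on_hook) return false;
        peer = &other;
        other.peer = this;
        other.st = ringing;
        return true;
    }
private:
    State st;
    Phone* peer;
};

// Primary use case: off hook, dial, the other phone rings and is answered.
bool primary_call_succeeds()
{
    Phone a, b;
    a.lift();
    if (!a.dial(b)) return false;
    if (b.state() != Phone::ringing) return false;
    b.lift();
    return a.state() == Phone::connected && b.state() == Phone::connected;
}

// Secondary use case: the called phone is already off hook.
bool busy_call_is_rejected()
{
    Phone a, b;
    b.lift();
    a.lift();
    return !a.dial(b) && a.state() == Phone::off_hook;
}
```

Each function traces one path through the system, which is exactly the "reality check" role described above. Note that the tests say nothing about whether Phone is a well-chosen class.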
The concepts, operations, and relationships mentioned here are the ones that come naturally
from our understanding of the application area or that arise from further work on the class structure.
They represent our fundamental understanding of the application. Often, they are classifications of
the fundamental concepts. For example, a hook-and-ladder is a fire engine, which is a truck, which
is a vehicle. Sections §23.4.3.2 and §23.4.5 explain a few ways of looking at classes and class hierarchies with the view of making improvements.
Beware of viewgraph engineering! At some stage, you will be asked to present the design to
someone and you will produce a set of diagrams explaining the structure of the system being built.
This can be a very useful exercise because it helps focus your attention on what is important about
the system and forces you to express your ideas in terms that others can understand. A presentation
is an invaluable design tool. Preparing a presentation with the aim of conveying real understanding
to people with the interest and ability to produce constructive criticism is an exercise in conceptualization and clean expression of ideas.
However, a formal presentation of a design is also a very dangerous activity because there is a
strong temptation to present an ideal system – a system you wished you could build, a system your
high management wish they had – rather than what you have and what you might possibly produce
in a reasonable time. When different approaches compete and executives don’t really understand or
care about ‘‘the details,’’ presentations can become lying competitions, in which the team that presents the most grandiose system gets to keep its job. In such cases, clear expression of ideas is
often replaced by heavy jargon and acronyms. If you are a listener to such a presentation – and
especially if you are a decision maker and you control development resources – it is desperately
important that you distinguish wishful thinking from realistic planning. High-quality presentation
materials are no guarantee of quality of the system described. In fact, I have often found that organizations that focus on the real problems get caught short when it comes to presenting their results
compared to organizations that are less concerned with the production of real systems.
When looking for concepts to represent as classes, note that there are important properties of a
system that cannot be represented as classes. For example, reliability, performance, and testability
are important measurable properties of a system. However, even the most thoroughly object-oriented system will not have its reliability localized in a reliability object. Pervasive properties of
a system can be specified, designed for, and eventually verified through measurement. Concern for
such properties must be applied across all classes and may be reflected in rules for the design and
implementation of individual classes and components (§23.4.3).
23.4.3.2 Step 2: Specify Operations [design.operations]
Refine the classes by specifying the sets of operations on them. Naturally, it is not possible to separate finding the classes from figuring out what operations are needed on them. However, there is a
practical difference in that finding the classes focusses on the key concepts and deliberately de-emphasizes the computational aspects of the classes, whereas specifying the operations focusses on
finding a complete and usable set of operations. It is most often too hard to consider both at the
same time, especially since related classes should be designed together. When it is time to consider
both together, CRC cards (§23.4.3.1) are often helpful.
In considering what functions are to be provided, several philosophies are possible. I suggest
the following strategy:
[1] Consider how an object of the class is to be constructed, copied (if at all), and destroyed.
[2] Define the minimal set of operations required by the concept the class is representing. Typically, these operations become the member functions (§10.3).
[3] Consider which operations could be added for notational convenience. Include only a few
really important ones. Often, these operations become the nonmember ‘‘helper functions’’
(§10.3.2).
[4] Consider which operations are to be virtual, that is, operations for which the class can act as
an interface for an implementation supplied by a derived class.
[5] Consider what commonality of naming and functionality can be achieved across all the
classes of the component.
This is clearly a statement of minimalism. It is far easier to add every function that could conceivably be useful and to make all operations virtual. However, the more functions, the more likely
they are to remain unused and the more likely they are to constrain the implementation and the further evolution of the system. In particular, functions that directly read or write part of the state of
an object of a class often constrain the class to a single implementation strategy and severely limit
the potential for redesign. Such functions lower the level of abstraction from a concept to one
implementation of it. Adding functions also causes more work for the implementer – and for the
designer in the next redesign. It is much easier to add a function once the need for it has been
clearly established than to remove it once it has become a liability.
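A small sketch of this strategy, using a hypothetical class Money, might look as follows. The minimal operations are members; the conveniences are nonmember helpers written in terms of the public interface:

```cpp
#include <cassert>

class Money {
public:
    // Construction; copying and destruction use the generated defaults.
    explicit Money(long cents = 0) : c(cents) {}

    // Minimal set of operations required by the concept.
    long in_cents() const { return c; }
    Money& operator+=(Money m) { c += m.c; return *this; }
private:
    long c;   // one possible representation; users cannot depend on it
};

// Helper functions: notational convenience only. They need no
// access to the representation.
inline Money operator+(Money a, Money b) { return a += b; }
inline bool operator==(Money a, Money b) { return a.in_cents() == b.in_cents(); }
```

Because the helpers use only in_cents() and +=, a change of representation (say, to separate units and cents) would leave them untouched. Further helpers can be added when a need is established, without reopening the class.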
The reason for requiring that the decision to make a function virtual be explicit rather than a
default or an implementation detail is that making a function virtual critically affects the use of its
class and the relationships between that class and other classes. Objects of a class with even a single virtual function have a nontrivial layout compared to objects in languages such as C and Fortran. A class with even a single virtual function potentially acts as the interface to yet-to-be-defined
classes, and a virtual function implies a dependency on yet-to-be-defined classes (§24.3.2.1).
Note that minimalism requires more work from the designer, rather than less.
When choosing operations, it is important to focus on what is to be done rather than how it is to
be done. That is, we should focus more on desired behavior than on implementation issues.
It is sometimes useful to classify operations on a class in terms of their use of the internal state
of objects:
– Foundation operators: constructors, destructors and copy operators
– Inspectors: operations that do not modify the state of an object
– Modifiers: operations that do modify the state of an object
– Conversions: operations that produce an object of another type based on the value (state) of
the object to which they are applied
– Iterators: operations that allow access to or use of a sequence of contained objects
These categories are not orthogonal. For example, an iterator can be designed to be either an
inspector or a modifier. These categories are simply a classification that has helped people
approach the design of class interfaces. Naturally, other classifications are possible. Such classifications are especially useful for maintaining consistency across a set of classes within a component.
C++ provides support for the distinction between inspectors and modifiers in the form of const
and non-const member functions. Similarly, the notions of constructors, destructors, copy operations, and conversion functions are directly supported.
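The classification can be made visible in a class declaration. The following sketch uses a hypothetical class Name_list; the grouping comments follow the categories listed above:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class Name_list {
public:
    typedef std::vector<std::string>::const_iterator const_iterator;

    // Foundation operations: the generated default constructor,
    // copy operations, and destructor are sufficient here.

    // Inspectors: const member functions.
    std::size_t size() const { return names.size(); }
    bool contains(const std::string& n) const
    {
        return std::find(names.begin(), names.end(), n) != names.end();
    }

    // Modifier: a non-const member function.
    void add(const std::string& n)
    {
        if (!contains(n)) names.push_back(n);
    }

    // Conversion: produces a value of another type from the state.
    std::string joined() const
    {
        std::string s;
        for (const_iterator p = begin(); p != end(); ++p) {
            if (p != begin()) s += ",";
            s += *p;
        }
        return s;
    }

    // Iterators: here designed as inspectors (read-only access).
    const_iterator begin() const { return names.begin(); }
    const_iterator end() const { return names.end(); }
private:
    std::vector<std::string> names;
};
```

Through a const Name_list& only the inspectors, the conversion, and the iterators are available; a call of add() would be rejected by the compiler.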
23.4.3.3 Step 3: Specify Dependencies [design.dependencies]
Refine the classes by specifying their dependencies. The various dependencies are discussed in
§24.3. The key ones to consider in the context of design are parameterization, inheritance, and use
relationships. Each involves consideration of what it means for a class to be responsible for a single property of a system. To be responsible certainly doesn’t mean that the class has to hold all the
data itself or that its member functions have to perform all the necessary operations directly. On
the contrary, each class having a single area of responsibility ensures that much of the work of a
class is done by directing requests ‘‘elsewhere’’ for handling by some other class that has that particular subtask as its responsibility. However, be warned that overuse of this technique can lead to
inefficient and incomprehensible designs by proliferating classes and objects to the point where no
work is done except by a cascade of forwarded requests for service. What can be done here and
now, should be.
The need to consider inheritance and use relationships at the design stage (and not just during
implementation) follows directly from the use of classes to represent concepts. It also implies that
the component (§23.4.3, §24.4), and not the individual class, is the unit of design.
Parameterization – often leading to the use of templates – is a way of making implicit dependencies explicit so that several alternatives can be represented without adding new concepts. Often,
there is a choice between leaving something as a dependency on a context, representing it as a
branch of an inheritance tree, or using a parameter (§24.4.1).
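As an illustration of the parameter alternative, consider a comparison of strings that would otherwise depend implicitly on its context (for example, on a global notion of what "equal" means). In this sketch the criterion becomes an explicit template argument; the names Case_sensitive, Case_insensitive, and same are hypothetical:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Two alternative comparison criteria. No new concept is added
// to the design; each is just an argument.
struct Case_sensitive {
    static bool eq(char a, char b) { return a == b; }
};

struct Case_insensitive {
    static bool eq(char a, char b)
    {
        return std::tolower(static_cast<unsigned char>(a))
            == std::tolower(static_cast<unsigned char>(b));
    }
};

// The dependency on a comparison criterion is stated explicitly.
template<class Cmp>
bool same(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    for (std::string::size_type i = 0; i < a.size(); ++i)
        if (!Cmp::eq(a[i], b[i])) return false;
    return true;
}
```

The same choice could have been expressed as a base class with a virtual eq() and two derived classes, at the cost of a run-time dispatch and a small hierarchy. Which representation fits best depends on whether the choice must be made at run time.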
23.4.3.4 Step 4: Specify Interfaces [design.interfaces]
Specify the interfaces. Private functions don’t usually need to be considered at the design stage.
Any implementation issues that must be considered at the design stage are best dealt with as part of the
consideration of dependencies in Step 3. Stronger: I use as a rule of thumb that unless at least two
significantly different implementations of a class are possible, then there is probably something
wrong with the class. That is, it is simply an implementation in disguise and not a representation of
a proper concept. In many cases, considering if some form of lazy evaluation is feasible for a class
is a good way of approaching the question, ‘‘Is the interface to this class sufficiently
implementation-independent?’’
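The lazy-evaluation test can be illustrated with a small sketch. Word_count is a hypothetical class, not one from the text; its interface says nothing about when the counting happens, so an eager implementation (count in the constructor) and the lazy one shown here are interchangeable from a client's point of view:

```cpp
#include <cstddef>
#include <string>

// Hypothetical class: clients see only count(). Whether the words are counted
// in the constructor (eager) or on first use (lazy, as here) is not visible
// in the interface.
class Word_count {
public:
    explicit Word_count(const std::string& text)
        : text_(text), count_(0), known_(false) {}

    std::size_t count() const
    {
        if (!known_) {              // compute on the first request only
            count_ = scan();
            known_ = true;
        }
        return count_;
    }
private:
    std::size_t scan() const        // counts maximal runs of non-space characters
    {
        std::size_t n = 0;
        bool in_word = false;
        for (std::string::size_type i = 0; i < text_.size(); ++i) {
            bool space = text_[i] == ' ' || text_[i] == '\t' || text_[i] == '\n';
            if (!space && !in_word) ++n;
            in_word = !space;
        }
        return n;
    }

    std::string text_;
    mutable std::size_t count_;     // cache; mutable so count() can stay const
    mutable bool known_;
};
```

Had the interface instead exposed, say, a public data member holding the count, only the eager implementation would have been possible, which is the kind of "implementation in disguise" the rule of thumb is meant to catch.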
Note that public bases and friends are part of the public interface of a class; see also §11.5 and
§24.4.2. Providing separate interfaces for inheriting and general clients by defining separate protected and public interfaces can be a rewarding exercise.
This is the step where the exact types of arguments are considered and specified. The ideal is to
have as many interfaces as possible statically typed with application-level types; see §24.2.3 and
§24.4.2.
When specifying the interfaces, look out for classes where the operations seem to support more
than one level of abstraction. For example, some member functions of a class File may take
arguments of type File_descriptor and others string arguments that are meant to be file names. The
File_descriptor operations operate on a different level of abstraction than do the file name operations, so one must wonder whether they belong in the same class. Maybe it would be better to have
two file classes, one supporting the notion of a file descriptor and another supporting the notion of a
file name. Typically, all operations on a class should support the same level of abstraction. When
they don’t, a reorganization of the class and related classes should be considered.
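One possible shape of such a split is sketched below. The class Named_file, the member functions, and the use of a plain int as a stand-in for an operating-system handle are all assumptions made for illustration; the text names only File and File_descriptor:

```cpp
#include <string>

// Low-level notion: a handle issued by the operating system.
// The int stands in for a real descriptor (an assumption of this sketch).
class File_descriptor {
public:
    explicit File_descriptor(int fd) : fd_(fd) {}
    int value() const { return fd_; }
    bool valid() const { return fd_ >= 0; }
private:
    int fd_;
};

// Higher-level notion: a file identified by its name.
class Named_file {
public:
    explicit Named_file(const std::string& name) : name_(name) {}
    const std::string& name() const { return name_; }

    // Text after the last '.', or an empty string if there is no '.'.
    std::string extension() const
    {
        std::string::size_type dot = name_.rfind('.');
        return dot == std::string::npos ? std::string() : name_.substr(dot + 1);
    }
private:
    std::string name_;
};
```

Each class now supports a single level of abstraction: no operation on Named_file mentions a descriptor, and no operation on File_descriptor mentions a name.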
23.4.3.5 Reorganization of Class Hierarchies [design.hier]
In Step 1 and again in Step 3, we examine the classes and class hierarchies to see if they adequately
serve our needs. Typically they don’t, and we have to reorganize to improve the structure of a
design and/or an implementation.
The most common reorganizations of a class hierarchy are factoring the common part of two
classes into a new class and splitting a class into two new ones. In both cases, the result is three
classes: a base class and two derived classes. When should such reorganizations be done? What
are common indicators that such a reorganization might be useful?
Unfortunately, there are no simple, general answers to such questions. This is not really surprising because what we are talking about are not minor implementation details, but changes to the
basic concepts of a system. The fundamental – and nontrivial – operation is to look for commonality between classes and factor out the common part. The exact criteria for commonality are undefined but should reflect commonality in the concepts of the system, not just implementation conveniences. Clues that two or more classes have commonality that might be factored out into a common base class are common patterns of use, similarity of sets of operations, similarity of implementations, and simply that these classes often turn up together in design discussions. Conversely, a
class might be a good candidate for splitting into two if subsets of the operations of that class have
distinct usage patterns, if such subsets access separate subsets of the representation, and if the class
turns up in apparently unrelated design discussions. Sometimes, making a set of related classes
into a template is a way of providing necessary alternatives in a systematic manner (§24.4.1).
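As a sketch of the first kind of reorganization, suppose two hypothetical classes, Dial and Slider, had each stored a name, a range, and a current value and had each offered the same get and set operations. Factoring out the common part gives three classes, a base and two derived classes (all names here are illustrative assumptions):

```cpp
#include <string>

// The part that Dial and Slider had in common, factored into a base class.
class Input_control {
public:
    Input_control(const std::string& name, int low, int high)
        : name_(name), low_(low), high_(high), value_(low) {}
    virtual ~Input_control() {}

    const std::string& name() const { return name_; }
    int value() const { return value_; }
    void set_value(int v)                   // clamps v into [low, high]
    {
        if (v < low_) v = low_;
        if (high_ < v) v = high_;
        value_ = v;
    }
    virtual std::string kind() const = 0;   // the part that differs
private:
    std::string name_;
    int low_, high_, value_;
};

class Dial : public Input_control {
public:
    Dial(const std::string& n, int lo, int hi) : Input_control(n, lo, hi) {}
    std::string kind() const { return "dial"; }
};

class Slider : public Input_control {
public:
    Slider(const std::string& n, int lo, int hi) : Input_control(n, lo, hi) {}
    std::string kind() const { return "slider"; }
};
```

Whether this factoring is justified depends, as the text says, on whether "an input control" is a concept of the system and not merely a convenient place to put shared code.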
Because of the close relationship between classes and concepts, problems with the organization
of a class hierarchy often surface as problems with the naming of classes and the use of class names
in design discussions. If design discussion using class names and the classification implied by the
class hierarchies sounds awkward, then there is probably an opportunity to improve the hierarchies.
Note that I’m implying that two people are much better at analyzing a class hierarchy than is one.
Should you happen to be without someone with whom to discuss a design, then writing a tutorial
description of the design using the class names can be a useful alternative.
One of the most important aims of a design is to provide interfaces that can remain stable in the
face of changes (§23.4.2). Often, this is best achieved by making a class on which many classes
and functions depend into an abstract class presenting very general operations. Details are best relegated to more specialized derived classes on which fewer classes and functions directly depend.
Stronger: the more classes that depend on a class, the more general that class should be and the
fewer details it should reveal.
There is a strong temptation to add operations (and data) to a class used by many. This is often
seen as a way of making that class more useful and less likely to need (further) change. The effect
of such thinking is a class with a fat interface (§24.4.3) and with data members supporting several
weakly related functions. This again implies that the class must be modified whenever there is a
significant change to one of the many classes it supports. This, in turn, implies changes to apparently unrelated user classes and derived classes. Instead of complicating a class that is central to a
design, we should usually keep it general and abstract. When necessary, specialized facilities
should be presented as derived classes. See [Martin,1995] for examples.
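A small sketch of this arrangement, using hypothetical Shape classes that are not taken from the text: the abstract class on which much code depends reveals only one general operation, and details appear only in the derived classes on which little code depends.

```cpp
#include <cstddef>
#include <vector>

// General abstract class: many functions depend on it, so it reveals little.
class Shape {
public:
    virtual double area() const = 0;
    virtual ~Shape() {}
};

// Specialized classes: details such as width() appear only here, where few
// pieces of code depend on them.
class Rectangle : public Shape {
public:
    Rectangle(double w, double h) : w_(w), h_(h) {}
    double area() const { return w_ * h_; }
    double width() const { return w_; }
    double height() const { return h_; }
private:
    double w_, h_;
};

class Right_triangle : public Shape {
public:
    Right_triangle(double a, double b) : a_(a), b_(b) {}   // the two legs
    double area() const { return a_ * b_ / 2; }
private:
    double a_, b_;
};

// Depends only on Shape, so it is unaffected when a derived class gains or
// loses details.
double total_area(const std::vector<const Shape*>& shapes)
{
    double sum = 0;
    for (std::size_t i = 0; i < shapes.size(); ++i) sum += shapes[i]->area();
    return sum;
}
```

Adding, say, a diagonal() operation to Rectangle touches only the code that uses Rectangle directly; total_area() and every other function written against Shape need not even be recompiled.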
This line of thought leads to hierarchies of abstract classes, with the classes near the roots being
the most general and having the most other classes and functions dependent on them. The leaf
classes are the most specialized and have only very few pieces of code depending directly on them.
As an example, consider the final version of the Ival_box hierarchy (§12.4.3, §12.4.4).
23.4.3.6 Use of Models [design.model]
When I write an article, I try to find a suitable model to follow. That is, rather than immediately
starting to type I look for papers on a similar topic to see if I can find one that can be an initial pattern for my paper. If the model I choose is a paper I wrote myself on a related topic, I might even
be able to leave parts of the text in place, modify other parts as needed, and add new information
only where the logic of the information I’m trying to convey requires it. For example, this book is
written that way based on its first and second editions. An extreme form of this writing technique
is the form letter. In that case, I simply fill in a name and maybe add a few lines to ‘‘personalize’’
the letter. In essence, I’m writing such letters by specifying the differences from a basic model.
Such use of existing systems as models for new designs is the norm rather than the exception in
all forms of creative endeavors. Whenever possible, design and programming should be based on
previous work. This limits the degrees of freedom that the designer has to deal with and allows
attention to be focussed on a few issues at a time. Starting a major project ‘‘completely from
scratch’’ can be exhilarating. However, often a more accurate description is ‘‘intoxicating’’ and the
result is a drunkard’s walk through the design alternatives. Having a model is not constraining and
does not require that the model should be slavishly followed; it simply frees the designer to consider one aspect of a design at a time.
Note that the use of models is inevitable because any design will be synthesized from the experiences of its designers. Having an explicit model makes the choice of a model a conscious decision, makes assumptions explicit, defines a common vocabulary, provides an initial framework for
the design, and increases the likelihood that the designers have a common approach.
Naturally, the choice of an initial model is in itself an important design decision and often can
be made only after a search for potential models and careful evaluation of alternatives. Furthermore, in many cases a model is suitable only with the understanding that major modification is necessary to adapt the ideas to a particular new application. Software design is hard, and we need all
the help we can get. We should not reject the use of models out of misplaced disdain for ‘‘imitation.’’ Imitation is the sincerest form of flattery, and the use of models and previous work as inspiration is – within the bounds of propriety and copyright law – an acceptable technique for innovative
work in all fields: what was good enough for Shakespeare is good enough for us. Some people
refer to such use of models in design as ‘‘design reuse.’’
Documenting general elements that turn up in many designs together with some description of
the design problem they solve and the conditions under which they can be used is an obvious idea
– at least once you think of it. The word pattern is often used to describe such a general and useful
design element, and a literature exists documenting patterns and their use (for example,
[Gamma,1994] and [Coplien,1995]).
It is a good idea for a designer to be acquainted with popular patterns in a given application
domain. As a programmer, I prefer patterns that have some code associated with them as concrete
examples. Like most people, I understand a general idea (in this case, a pattern) best when I have a
concrete example (in this case, a piece of code illustrating a use of the pattern) to help me. People
who use patterns heavily have a specialized vocabulary to ease communication among themselves.
Unfortunately, this can become a private language that effectively excludes outsiders from understanding. As always, it is essential to ensure proper communication among people involved in different parts of a project (§23.3) and also with the design and programming communities at large.
Every successful large system is a redesign of a somewhat smaller working system. I know of
no exceptions to this rule. The closest I can think of are projects that failed, muddled on for years
at great cost, and then eventually became successes years after their intended completion date.
Such projects unintentionally – and often unacknowledged – simply first built a nonworking system, then transformed that into a working system, and finally redesigned that into a system that
approximated the original aims. This implies that it is a folly to set out to build a large system from
scratch exactly right according to the latest principles. The larger and the more ambitious a system
we aim for, the more important it is to have a model from which to work. For a large system, the
only really acceptable model is a somewhat smaller, related working system.
23.4.4 Experimentation and Analysis [design.experiment]
At the start of an ambitious development project, we do not know the best way to structure the system. Often, we don’t even know precisely what the system should do because particulars will
become clear only through the effort of building, testing, and using the system. How – short of
building the complete system – do we get the information necessary to understand what design
decisions are significant and to estimate their ramifications?
We conduct experiments. Also, we analyze the design and implementation as soon as we have
something to analyze. Most frequently and importantly, we discuss the design and implementation
alternatives. In all but the rarest cases, design is a social activity in which designs are developed
through presentations and discussions. Often, the most important design tool is a blackboard; without it, the embryonic concepts of a design cannot be developed and shared among designers and
programmers.
The most popular form of experiment seems to be to build a prototype, that is, a scaled-down
version of the system or a part of the system. A prototype doesn’t have stringent performance criteria, machine and programming-environment resources are typically ample, and the designers and
programmers tend to be uncommonly well educated, experienced, and motivated. The idea is to get
a version running as fast as possible to enable exploration of design and implementation choices.
This approach can be very successful when done well. It can also be an excuse for sloppiness.
The problem is that the emphasis of a prototype can easily shift from ‘‘exploring design alternatives’’ to ‘‘getting some sort of system running as soon as possible.’’ This easily leads to a disinterest in the internal structure of the prototype (‘‘after all, it is only a prototype’’) and a neglect of
the design effort in favor of playing around with the prototype implementation. The snag is that
such an implementation can degenerate into the worst kind of resource hog and maintenance nightmare while giving the illusion of an ‘‘almost complete’’ system. Almost by definition, a prototype
does not have the internal structure, the efficiency, and the maintenance infrastructure that allow it
to scale to real use. Consequently, a ‘‘prototype’’ that becomes an ‘‘almost product’’ soaks up time
and energy that could have been better spent on the product. The temptation for both developers
and managers is to make the prototype into a product and postpone ‘‘performance engineering’’
until the next release. Misused this way, prototyping is the negation of all that design stands for.
A related problem is that the prototype developers can fall in love with their tools. They can
forget that the expense of their (necessary) convenience cannot always be afforded by a production
system and that the freedom from constraints and formalities offered by their small research group
cannot easily be maintained for a larger group working toward a set of interlocking deadlines.
On the other hand, prototypes can be invaluable. Consider designing a user interface. In this
case, the internal structure of the part of the system that doesn’t interact directly with the user often
is irrelevant and there are no other feasible ways of getting experience with users’ reactions to the
look and feel of a system. Another example is a prototype designed strictly for studying the internal workings of a system. Here, the user interface can be rudimentary – possibly with simulated
users instead of real ones.
Prototyping is a way of experimenting. The desired results from building a prototype are the
insights that building it brings, not the prototype itself. Maybe the most important criterion for a
prototype is that it has to be so incomplete that it is obviously an experimental vehicle and cannot
be turned into a product without a major redesign and reimplementation. Having a prototype
‘‘incomplete’’ helps keep the focus on the experiment and minimizes the danger of having the prototype become a product. It also minimizes the temptation to try to base the design of the product
too closely on the design of the prototype – thus forgetting or ignoring the inherent limitations of
the prototype. After use, a prototype should be thrown away.
It should be remembered that in many cases, there are experimental techniques that can be used
as alternatives to prototyping. Where those can be used, they are often preferable because of their
greater rigor and lower demands on designer time and system resources. Examples are mathematical models and various forms of simulators. In fact, one can see a continuum from mathematical
models, through more and more detailed simulations, through prototypes, through partial implementations, to a complete system.
This leads to the idea of growing a system from an initial design and implementation through
repeated redesign and reimplementation. This is the ideal strategy, but it can be very demanding on
design and implementation tools. Also, the approach suffers from the risk of getting burdened with
so much code reflecting initial design decisions that a better design cannot be implemented. At
least for now, this strategy seems limited to small-to-medium-scale projects, in which major
changes to the overall design are unlikely, and for redesigns and reimplementations after the initial
release of the system, where such a strategy is inevitable.
In addition to experiments designed to provide insights into design choices, analysis of a design
and/or an implementation itself can be an important source of further insights. For example, studies of the various dependencies between classes (§24.3) can be most helpful, and traditional
implementer’s tools such as call graphs, performance measurements, etc., must not be ignored.
Note that specifications (the output of the analysis phase) and designs are as prone to errors as is
the implementation. In fact, they may be more so because they are even less concrete, are often
specified less precisely, are not executable, and typically are not supported by tools of a sophistication comparable to what is available for checking and analyzing the implementation. Increasing the
formality of the language/notation used to express a design can go some way toward enabling the
application of tools to help the designer. This must not be done at the cost of impoverishing the
programming language used for implementation (§24.3.1). Also, a formal notation can itself be a
source of complexity and problems. This happens when the formalism is ill suited to the practical
problem to which it is applied, when the rigor of the formalism exceeds the mathematical background and maturity of the designers and programmers involved, and when the formal description
of a system gets out of touch with the system it is supposedly describing.
Design is inherently error-prone and hard to support with effective tools. This makes experience and feedback essential. Consequently, it is fundamentally flawed to consider the software-development process a linear process starting with analysis and ending with testing. An emphasis
on iterative design and implementation is needed to gain sufficient feedback from experience during the various stages of development.
23.4.5 Testing [design.test]
A program that has not been tested does not work. The ideal of designing and/or verifying a program so that it works the first time is unattainable for all but the most trivial programs. We should
strive toward that ideal, but we should not be fooled into thinking that testing is easy.
‘‘How to test?’’ is a question that cannot be answered in general. ‘‘When to test?’’ however,
does have a general answer: as early and as often as possible. Test strategies should be generated
as part of the design and implementation efforts or at least should be developed in parallel with
them. As soon as there is a running system, testing should begin. Postponing serious testing until
‘‘after the implementation is complete’’ is a prescription for slipped schedules and/or flawed
releases.
Wherever possible, a system should be designed specifically so that it is relatively easy to test.
In particular, mechanisms for testing can often be designed right into the system. Sometimes this is
not done out of fear of causing expensive run-time testing or for fear that the redundancy necessary
for consistency checks will unduly enlarge data structures. Such fear is usually misplaced because
most actual testing code and redundancy can, if necessary, be stripped out of the code before the
system is shipped. Assertions (§24.3.7.2) are sometimes useful here.
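As a sketch of checks that are designed in but can be stripped out, the hypothetical class below uses the standard assert() macro, which expands to nothing when NDEBUG is defined at compile time. This is only one way of doing it and is not the mechanism of §24.3.7.2; the class name and its operations are assumptions made for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class with a redundant consistency check designed in.
// The check costs time, but assert() evaluates it only when NDEBUG is not
// defined, so a build compiled with NDEBUG defined does not pay for it.
class Sorted_ints {
public:
    void insert(int x)
    {
        std::vector<int>::iterator p = v_.begin();
        while (p != v_.end() && *p < x) ++p;   // find the first element >= x
        v_.insert(p, x);
        assert(is_sorted());                   // invariant check
    }
    std::size_t size() const { return v_.size(); }
    int at(std::size_t i) const
    {
        assert(i < v_.size());                 // precondition check
        return v_[i];
    }
private:
    bool is_sorted() const
    {
        for (std::size_t i = 1; i < v_.size(); ++i)
            if (v_[i] < v_[i - 1]) return false;
        return true;
    }
    std::vector<int> v_;
};
```

The invariant check makes insert() linear-time twice over, which would be hard to justify in a shipped system; because it is conditional, that cost is no argument against having the check during development and testing.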
More important than specific tests is the idea that the structure of the system should be such that
we have a reasonable chance of convincing ourselves and our users/customers that we can eliminate
errors by a combination of static checking, static analysis, and testing. Where a strategy for fault
tolerance is developed (§14.9), a testing strategy can usually be designed as a complementary and
closely related aspect of the total design.
If testing issues are completely discounted in the design phase, then testing, delivery date, and
maintenance problems will result. The class interfaces and the class dependencies (as described in
§24.3 and §24.4.2) are usually a good place to start work on a testing strategy.
Determining how much testing is enough is usually hard. However, too little testing is a more
common problem than too much. Exactly how many resources should be allocated to testing compared to design and implementation naturally depends on the nature of the system and the methods
used to construct it. However, as a rule of thumb, I can suggest that more resources in time, effort,
and talent should be spent testing a system than on constructing the initial implementation. Testing
should focus on problems that would have disastrous consequences and on problems that would
occur frequently.
23.4.6 Software Maintenance [design.maintain]
‘‘Software maintenance’’ is a misnomer. The word ‘‘maintenance’’ suggests a misleading analogy
to hardware. Software doesn’t need oiling, doesn’t have moving parts that wear down, and doesn’t
have crevices in which water can collect and cause rust. Software can be replicated exactly and
transported over long distances at minute costs. Software is not hardware.
The activities that go under the name of software maintenance are really redesign and reimplementation and thus belong under the usual program development cycle. When flexibility, extensibility, and portability are emphasized in the design, the traditional sources of maintenance problems
are addressed directly.
Like testing, maintenance must not be an afterthought or an activity segregated from the mainstream of development. In particular, it is important to have some continuity in the group of people
involved in a project. It is not easy to successfully transfer maintenance to a new (and typically
less-experienced) group of people with no links to the original designers and implementers. When
a major change of people is necessary, there must be an emphasis on transferring an understanding
of the system’s structure and of the system’s aims to the new people. If a ‘‘maintenance crew’’ is
left guessing about the architecture of the system or must deduce the purpose of system components from their implementation, the structure of a system can deteriorate rapidly under the impact
of local patches. Documentation is typically much better at conveying details than at helping new
people to understand key ideas and principles.
23.4.7 Efficiency [design.efficiency]
Donald Knuth observed that ‘‘premature optimization is the root of all evil.’’ Some people have
learned that lesson all too well and consider all concern for efficiency evil. On the contrary, efficiency must be kept in mind throughout the design and implementation effort. However, that does
not mean the designer should be concerned with micro-efficiencies, but that first-order efficiency
issues must be considered.
The best strategy for efficiency is to produce a clean and simple design. Only such a design can
remain relatively stable over the lifetime of the project and serve as a base for performance tuning.
Avoiding the gargantuanism that plagues large projects is essential. Far too often people add features ‘‘just in case’’ (§23.4.3.2, §23.5.3) and end up doubling and quadrupling the size and runtime of systems to support frills. Worse, such overelaborate systems are often unnecessarily hard to
analyze so that it becomes difficult to distinguish the avoidable overheads from the unavoidable.
Thus, even basic analysis and optimization is discouraged. Optimization should be the result of
analysis and performance measurement, not random fiddling with the code. Especially in larger
systems, a designer’s or programmer’s ‘‘intuition’’ is an unreliable guide in matters of efficiency.
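Measurement need not be elaborate. The sketch below times a piece of code with std::chrono::steady_clock; it assumes a C++11 or later compiler (the <chrono> library postdates this text), and the names sum() and average_microseconds() are hypothetical. A profiler will usually say more, but even this much replaces a guess with a number:

```cpp
#include <chrono>
#include <cstddef>
#include <ratio>
#include <vector>

// Sums a vector; stands in for whatever code is suspected of being slow.
long sum(const std::vector<int>& v)
{
    long s = 0;
    for (std::size_t i = 0; i < v.size(); ++i) s += v[i];
    return s;
}

// Calls f() `repeat` times and returns the average wall-clock time per call
// in microseconds. `repeat` must be positive.
template<class F>
double average_microseconds(F f, int repeat)
{
    typedef std::chrono::steady_clock Clock;
    Clock::time_point start = Clock::now();
    for (int i = 0; i < repeat; ++i) f();
    Clock::time_point stop = Clock::now();
    std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / repeat;
}
```

For example, with a vector v and a variable total declared by the caller, average_microseconds([&] { total += sum(v); }, 1000) gives a per-call figure. The result of the timed code should be used afterward (here, through total), because an optimizer may otherwise remove the very work being measured.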
It is important to avoid inherently inefficient constructs and constructs that will take much time
and cleverness to optimize to an acceptable performance level. Similarly, it is important to minimize the use of inherently nonportable constructs and tools because using such tools and constructs
condemns the project to run on older (less powerful and/or more expensive) computers.
23.5 Management [design.management]
Provided it makes some minimum of sense, most people do what they are encouraged to do. In
particular, if in the context of a software project you reward certain ways of operating and penalize
others, only exceptional programmers and designers will risk their careers to do what they consider
right in the face of management opposition, indifference, and red tape†. It follows that an organization should have a reward structure that matches its stated aims of design and programming. However, all too often this is not the case: a major change of programming style can be achieved only
through a matching change of design style, and both typically require changes in management style
to be effective. Mental and organizational inertia all too easily leads to a local change that is not
supported by global changes required to ensure its success. A fairly typical example is a change to
a language that supports object-oriented programming, such as C++, without a matching change in
the design strategies to take advantage of its facilities (see also §24.2). Another is a change to
‘‘object-oriented design’’ without the introduction of a programming language to support it.
__________________
† An organization that treats its programmers as morons will soon have programmers that are willing and able to act like morons only.
23.5.1 Reuse [design.reuse]
Increased reuse of code and design is often cited as a major reason for adopting a new programming language or design strategy. However, most organizations reward individuals and groups that
choose to re-invent the wheel. For example, a programmer may have his productivity measured in
lines of code; will he produce small programs relying on standard libraries at the cost of income
and, possibly, status? A manager may be paid somewhat proportionally to the number of people in
her group; is she going to use software produced in another group when she can hire another couple
of programmers for her own group instead? A company can be awarded a government contract,
where the profit is a fixed percentage of the development cost; is that company going to minimize
its profits by using the most effective development tools? Rewarding reuse is hard, but unless management finds ways to encourage and reward it, reuse will not happen.
Reuse is primarily a social phenomenon. I can use someone else’s software provided that:
[1] It works: to be reusable, software must first be usable.
[2] It is comprehensible: program structure, comments, documentation, and tutorial material are
important.
[3] It can coexist with software not specifically written to coexist with it.
[4] It is supported (or I’m willing to support it myself; typically, I’m not).
[5] It is economical (can I share the development and maintenance costs with other users?).
[6] I can find it.
To this, we may add that a component is not reusable until someone has ‘‘reused’’ it. The task of
fitting a component into an environment typically leads to refinements in its operation, generalizations of its behavior, and improvements in its ability to coexist with other software. Until this exercise has been done at least once, even components that have been designed and implemented with
the greatest care tend to have unintended and unexpected rough corners.
My experience is that the conditions necessary for reuse will exist only if someone makes it
their business to make such sharing work. In a small group, this typically means that an individual,
by design or by accident, becomes the keeper of common libraries and documentation. In a larger
organization, this means that a group or department is chartered to gather, build, document, popularize, and maintain software for use by many groups.
The importance of such a ‘‘standard components’’ group cannot be overestimated. Note that as
a first approximation, a system reflects the organization that produced it. If an organization has no
mechanism for promoting and rewarding cooperation and sharing, cooperation and sharing will be
rare. A standard components group must actively promote its components. This implies that good
traditional documentation is essential but insufficient. In addition, the components group must provide tutorials and other information that allow a potential user to find a component and understand
why it might be of help. This implies that activities that traditionally are associated with marketing
and education must be undertaken by the components group.
Whenever possible, the members of this group should work in close cooperation with applications builders. Only then can they be sufficiently aware of the needs of users and alert to the opportunities for sharing components among different applications. This argues for giving such an organization a consultancy role and for using internships to transfer information into and
out of the components group.
The success of a ‘‘components group’’ must be measured in terms of the success of its clients.
If its success is measured simply in terms of the number of tools and services it can convince development organizations to accept, such a group can become corrupted into a mere peddler of commercial software and a proponent of ever-changing fads.
Not all code needs to be reusable, and reusability is not a universal property. Saying that a
component is ‘‘reusable’’ means that its reuse within a certain framework requires little or no work.
In most cases, moving to a different framework will require significant work. In this respect, reuse
strongly resembles portability. It is important to note that reuse is the result of design aimed at
reuse, refinement of components based on experience, and deliberate effort to search out existing
components to (re)use. Reuse does not magically arise from mindless use of specific language features or coding techniques. C++ features such as classes, virtual functions, and templates allow
designs to be expressed so that reuse is made easier (and thus more likely), but in themselves such
features do not ensure reusability.
23.5.2 Scale [design.scale]
It is easy for an individual or an organization to get excited about ‘‘doing things right.’’ In an institutional setting, this often translates into ‘‘developing and strictly following proper procedures.’’
In both cases, common sense can be the first victim of a genuine and often ardent desire to improve
the way things are done. Unfortunately, once common sense is missing there is no limit to the
damage that can unwittingly be done.
Consider the stages of the development process listed in §23.4 and the design steps
listed in §23.4.3. It is relatively easy to elaborate these stages into a proper design method where
each stage is more precisely defined and has well-defined inputs and outputs and a semiformal
notation for expressing these inputs and outputs. Checklists can be developed to ensure that the
design method is adhered to, and tools can be developed to enforce a large number of the procedural and notational conventions. Further, looking at the classification of dependencies presented in
§24.3 one could decree that certain dependencies were good and others bad and provide analysis
tools to ensure that these value judgements were applied uniformly across a project. To complete
this ‘‘firming up’’ of the software-production process, one would define standards for documentation (including rules for spelling and grammar and typesetting conventions) and for the general
look of the code (including specifications of which language features can and cannot be used, specifications of what kinds of libraries can and cannot be used, conventions for indentation and the
naming of functions, variables, and types, etc.).
Much of this can be helpful for the success of a project. At least, it would be folly to set out
to design a system that will eventually contain ten million lines of code that will be developed by
hundreds of people and maintained and supported by thousands more over a decade or more without a fairly well-defined and somewhat rigid framework along the lines described previously.
Fortunately, most systems do not fall into this category. However, once the idea is accepted
that such a design method or adherence to such a set of coding and documentation standards is ‘‘the
right way,’’ pressure builds to apply it universally and in every detail. This can lead to ludicrous
constraints and overheads on small projects. In particular, it can lead to paper shuffling and forms
filling replacing productive work as the measure of progress and success. If that happens, real
designers and programmers will leave the project and be replaced with bureaucrats.
Once such a ridiculous misapplication of a (hopefully perfectly reasonable) design method has
occurred in a community, its failure becomes the excuse for avoiding almost all formality in the
development process. This in turn naturally leads to the kind of messes and failures that the design
method was designed to prevent in the first place.
The real problem is to find an appropriate degree of formality for the development of a particular project. Don’t expect to find an easy answer to this problem. Essentially every approach works
for a small project. Worse, it seems that essentially every approach – however ill conceived and
however cruel to the individuals involved – also works for a large project, provided you are willing
to throw indecent amounts of time and money at the problem.
A key problem in every software project is how to maintain the integrity of the design. This
problem increases more than linearly with scale. Only an individual or a small group of people can
grasp and keep sight of the overall aims of a major project. Most people must spend so much of
their time on subprojects, technical details, day-to-day administration, etc., that the overall design
aims are easily forgotten or subordinated to more local and immediate goals. It is a recipe for
failure not to have an individual or group with the explicit task of maintaining the integrity of the
design. It is also a recipe for failure not to enable such an individual or group to have an effect on the
project as a whole.
Lack of a consistent long-term aim is much more damaging to a project and an organization
than the lack of any individual feature. It should be the job of some small number of individuals to
formulate such an overall aim, to keep that aim in mind, to write the key overall design documents,
to write the introductions to the key concepts, and generally to help others to keep the overall aim
in mind.
23.5.3 Individuals [design.people]
Use of design as described here places a premium on skillful designers and programmers. Thus, it
makes the choice of designers and programmers critical to the success of an organization.
Managers often forget that organizations consist of individuals. A popular notion is that programmers are equal and interchangeable. This is a fallacy that can destroy an organization by driving out many of the most effective individuals and condemning the remaining people to work at
levels well below their potential. Individuals are interchangeable only if they are not allowed to
take advantage of skills that raise them above the absolute minimum required for the task in question. Thus, the fiction of interchangeability is inhumane and inherently wasteful.
Most programming performance measures encourage wasteful practices and fail to take critical
individual contributions into account. The most obvious example is the relatively widespread practice of measuring progress in terms of number of lines of code produced, number of pages of documentation produced, number of tests passed, etc. Such figures look good on management charts
but bear only the most tenuous relation to reality. For example, if productivity is measured in terms
of number of lines of code produced, a successful application of reuse will appear to cause negative
performance of programmers. A successful application of the best principles in the redesign of a
major piece of software typically has the same effect.
Quality of work produced is far harder to measure than quantity of output, yet individuals and
groups must be rewarded based on the quality of their output rather than by crude quantity measures. Unfortunately, the design of practical quality measures has – to the best of my knowledge –
hardly begun. In addition, measures that incompletely describe the state of a project tend to warp
development. People adapt to meet local deadlines and to optimize individual and group performance as defined by the measures. As a direct result, overall system integrity and performance suffer. For example, if a deadline is defined in terms of bugs removed or known bugs remaining, we
may see that deadline met at the expense of run-time performance or hardware resources needed to
run the system. Conversely, if only run-time performance is measured the error rate will surely rise
when the developers struggle to optimize the system for benchmarks. The lack of good and comprehensive quality measures places great demands on the technical expertise of managers, but the
alternative is a systematic tendency to reward random activity rather than progress. Don’t forget
that managers are also individuals. Managers need at least as much education on new techniques
as do the people they manage.
As in other areas of software development, we must consider the longer term. It is essentially
impossible to judge the performance of an individual on the basis of a single year’s work. Most
individuals do, however, have consistent long-term track records that can be reliable predictors of
technical judgement and a useful help in evaluating immediate past performance. Disregard of
such records – as is done when individuals are considered merely as interchangeable cogs in the
wheels of an organization – leaves managers at the mercy of misleading quantity measurements.
One consequence of taking a long-term view and avoiding the ‘‘interchangeable morons school
of management’’ is that individuals (both developers and managers) need longer to grow into the
more demanding and interesting jobs. This discourages job hopping as well as job rotation for
‘‘career development.’’ A low turnover of both key technical people and key managers must be a
goal. No manager can succeed without a rapport with key designers and programmers and some
recent and relevant technical knowledge. Conversely, no group of designers and developers can
succeed in the long run without support from competent managers and a minimum of understanding of the larger nontechnical context in which they work.
Where innovation is needed, senior technical people, analysts, designers, programmers, etc.,
have a critical and difficult role to play in the introduction of new techniques. These are the people
who must learn new techniques and in many cases unlearn old habits. This is not easy. These individuals have typically made great personal investments in the old ways of doing things and rely on
successes achieved using these ways of operating for their technical reputation. So do many technical managers.
Naturally, there is often a fear of change among such individuals. This can lead to an overestimation of the problems involved in a change and a reluctance to acknowledge problems with the
old ways of doing things. Equally naturally, people arguing for change tend to overestimate the
beneficial effects of new ways of doing things and to underestimate the problems involved in a
change. These two groups of individuals must communicate, they must learn to talk the same language, they must help each other hammer out a model for transition. The alternative is organizational paralysis and the departure of the most capable individuals from both groups. Both groups
should remember that the most successful ‘‘old timers’’ are often the ‘‘young turks’’ of yesteryear.
Given a chance to learn without humiliation, more experienced programmers and designers can
become the most successful and insightful proponents of change. Their healthy skepticism, knowledge of users, and acquaintance with the organizational hurdles can be invaluable. Proponents of
immediate and radical change must realize that a transition, often involving a gradual adoption of
new techniques, is more often than not necessary. Conversely, individuals who have no desire to
change should search out areas in which no change is needed rather than fight vicious rear-guard
battles in areas in which new demands have already significantly altered the conditions for success.
23.5.4 Hybrid Design [design.hybrid]
Introducing new ways of doing things into an organization can be painful. The disruption to the
organization and the individuals in the organization can be significant. In particular, an abrupt
change that overnight turns productive and proficient members of ‘‘the old school’’ into ineffective
novices in ‘‘the new school’’ is typically unacceptable. However, it is rare to achieve major gains
without changes, and significant changes typically involve risks.
C++ was designed to minimize such risks by allowing a gradual adoption of techniques.
Although it is clear that the largest benefits from using C++ are achieved through data abstraction,
object-oriented programming, and object-oriented design, it is not clear that the fastest way to
achieve these gains is a radical break with the past. Occasionally, such a clean break is feasible.
More often, the desire for improvement is – or should be – tempered by concerns about how to
manage the transition. Consider:
– Designers and programmers need time to acquire new skills.
– New code needs to cooperate with old code.
– Old code needs to be maintained (often indefinitely).
– Work on existing designs and programs needs to be completed (on time).
– Tools supporting the new techniques need to be introduced into the local environment.
These factors lead naturally to a hybrid style of design – even where that isn’t the intention of some
designers. It is easy to underestimate the first two points.
By supporting several programming paradigms, C++ supports the notion of a gradual introduction into an organization in several ways:
– Programmers can remain productive while learning C++.
– C++ can yield significant benefits in a tool-poor environment.
– C++ program fragments can cooperate well with code written in C and other traditional languages.
– C++ has a large C-compatible subset.
The idea is that programmers can make the move to C++ from a traditional language by first adopting C++ while retaining a traditional (procedural) style of programming. Then they use the data
abstraction techniques. Finally – when the language and its associated tools have been mastered –
they move on to object-oriented programming and generic programming. Note that a well-designed library is much easier to use than it was to design and implement, so a novice can benefit
from the more advanced uses of abstraction even during the early stages of this progress.
The idea of learning object-oriented design, object-oriented programming, and C++ in stages is
supported by facilities for mixing C++ code with code written in languages that do not support
C++’s notions of data abstraction and object-oriented programming (§24.2.1). Many interfaces can
simply be left procedural because there will be no immediate benefits in doing anything more complicated. For many key libraries, this will already have been done by the library provider so that
the C++ programmer can stay ignorant of the actual implementation language. Using libraries written in languages such as C is the first, and initially most important, form of reuse in C++.
The next stage – to be used only where a more elaborate technique is actually needed – is to
present facilities written in languages such as C and Fortran as classes by encapsulating the data
structures and functions in C++ interface classes. A simple example of lifting the semantics from
the procedure plus data structure level to the data abstraction level is the string class from §11.12.
There, encapsulation of the C character string representation and the standard C string functions is
used to produce a string type that is much simpler to use.
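The step from ‘‘procedure plus data structure’’ to data abstraction can be sketched as follows. This is a minimal illustration only and not the string class of §11.12: the class name Text and all of its members are assumptions made for this example. Only the standard C functions strlen(), strcpy(), strcat(), and strcmp() are taken as given.

```cpp
#include <cstring>   // std::strlen, std::strcpy, std::strcat, std::strcmp

// Hypothetical interface class (name and members are assumptions).
// It owns a zero-terminated C string and hides both the memory
// management and the calls to the C string functions.
class Text {
public:
    explicit Text(const char* s = "")
        : p(new char[std::strlen(s) + 1])
    {
        std::strcpy(p, s);
    }

    Text(const Text& t)
        : p(new char[std::strlen(t.p) + 1])
    {
        std::strcpy(p, t.p);
    }

    Text& operator=(const Text& t)
    {
        if (this != &t) {
            char* q = new char[std::strlen(t.p) + 1];   // allocate before releasing
            std::strcpy(q, t.p);
            delete[] p;
            p = q;
        }
        return *this;
    }

    ~Text() { delete[] p; }

    // Concatenation: the user never computes buffer sizes by hand.
    Text& operator+=(const Text& t)
    {
        char* q = new char[size() + t.size() + 1];
        std::strcpy(q, p);
        std::strcat(q, t.p);   // t.p is still valid here, even if &t == this
        delete[] p;
        p = q;
        return *this;
    }

    std::size_t size() const { return std::strlen(p); }
    const char* c_str() const { return p; }   // for passing to C functions

private:
    char* p;   // representation: a C-style string on the free store
};

inline bool operator==(const Text& a, const Text& b)
{
    return std::strcmp(a.c_str(), b.c_str()) == 0;
}
```

A user of such a class writes a+=b and a==b and never needs to allocate a buffer or remember the terminating zero. The C representation remains available through c_str() where an existing C function must be called.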
A similar technique can be used to fit a built-in or stand-alone type into a class hierarchy
(§23.5.1). This allows designs for C++ to evolve to use data abstraction and class hierarchies in the
presence of code written in languages in which these concepts are missing and even under the constraint that the resulting code must be callable from procedural languages.
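As a sketch of this technique, assume an existing C data structure c_rect with a function c_rect_area(). Both are hypothetical stand-ins for a real C library, as are the names Shape, Rect, Square, total_area(), and rect_plus_square_area(). The stand-alone C type is fitted into a class hierarchy by a small adapter class, and the code written in terms of the hierarchy is made callable from C through a function with C linkage.

```cpp
// Stand-in for an existing C data structure and function
// (hypothetical names; a real project would get these from a C header).
extern "C" {
    struct c_rect { double w, h; };

    double c_rect_area(const c_rect* r)
    {
        return r->w * r->h;
    }
}

// The hierarchy used by the C++ part of the design.
class Shape {
public:
    virtual double area() const = 0;
    virtual ~Shape() {}
};

// Adapter: presents the stand-alone C type as a member of the hierarchy.
class Rect : public Shape {
public:
    Rect(double w, double h) { rep.w = w; rep.h = h; }
    double area() const { return c_rect_area(&rep); }
private:
    c_rect rep;   // the C representation, now encapsulated
};

// A class written directly in C++ can sit next to the adapter.
class Square : public Shape {
public:
    explicit Square(double side) : s(side) {}
    double area() const { return s * s; }
private:
    double s;
};

// Code written against the hierarchy only.
double total_area(const Shape* const* v, int n)
{
    double sum = 0;
    for (int i = 0; i < n; ++i) sum += v[i]->area();
    return sum;
}

// C-callable entry point: only built-in types cross the boundary,
// so a C (or Fortran) caller never sees a class.
extern "C" double rect_plus_square_area(double w, double h, double side)
{
    Rect r(w, h);
    Square s(side);
    const Shape* v[] = { &r, &s };
    return total_area(v, 2);
}
```

A production interface would also have to ensure that no exception escapes from a function with C linkage. That concern is left out here to keep the sketch short.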
23.6 Annotated Bibliography [design.ref]
This chapter only scratches the surface of the issues of design and of the management of programming projects. For that reason, a short annotated bibliography is provided. An extensive annotated
bibliography can be found in [Booch,1994].
[Anderson,1990]
Bruce Anderson and Sanjiv Gossain: An Iterative Design Model for Reusable Object-Oriented Software. Proc. OOPSLA’90. Ottawa, Canada. A
description of an iterative design and redesign model with a specific example and a discussion of experience.
[Booch,1994]
Grady Booch: Object-Oriented Analysis and Design with Applications.
Benjamin/Cummings. 1994. ISBN 0-8053-5340-2. Contains a detailed
description of design, a specific design method with a graphical notation,
and several large examples of designs expressed in C++. It is an excellent
book to which this chapter owes much. It provides a more in-depth treatment of many of the issues in this chapter.
[Booch,1996]
Grady Booch: Object Solutions. Benjamin/Cummings. 1996. ISBN 0-8053-0594-7. Describes the development of object-oriented systems from
a management perspective. Contains extensive C++ code examples.
[Brooks,1982]
Fred Brooks: The Mythical Man Month. Addison-Wesley. 1982. Everyone should read this book every couple of years. A warning against
hubris. It is a bit dated on technical matters, but it is not at all dated in
matters related to individuals, organizations, and scale. Republished with
additions in 1997. ISBN 1-201-83595-9.
[Brooks,1987]
Fred Brooks: No Silver Bullet. IEEE Computer, Vol. 20, No. 4. April
1987. A summary of approaches to large-scale software development,
with a much-needed warning against belief in miracle cures (‘‘silver bullets’’).
[Coplien,1995]
James O. Coplien and Douglas C. Schmidt (editors): Pattern Languages of
Program Design. Addison-Wesley. 1995. ISBN 1-201-60734-4.
[Gamma,1994]
Eric Gamma, et al.: Design Patterns. Addison-Wesley. 1994. ISBN 0-201-63361-2.
A practical catalog of techniques for creating flexible and
reusable software, with a nontrivial, well-explained example. Contains
extensive C++ code examples.
[DeMarco,1987]
T. DeMarco and T. Lister: Peopleware. Dorset House Publishing Co.
1987. One of the few books that focusses on the role of people in the production of software. A must for every manager. Smooth enough for bedside reading. An antidote for much silliness.
[Jacobson,1992]
Ivar Jacobson et al.: Object-Oriented Software Engineering. Addison-Wesley.
1992. ISBN 0-201-54435-0. A thorough and practical description of software development in an industrial setting with an emphasis on
use cases (§23.4.3.1). Miscasts C++ by describing it as it was ten years
ago.
[Kerr,1987]
Ron Kerr: A Materialistic View of the Software ‘‘Engineering’’ Analogy.
In SIGPLAN Notices, March 1987. The use of analogy in this chapter and
the next owes much to the observations in this paper and to the presentations by and discussions with Ron that preceded it.
[Liskov,1987]
Barbara Liskov: Data Abstraction and Hierarchy. Proc. OOPSLA’87
(Addendum). Orlando, Florida. A discussion of how the use of inheritance can compromise data abstraction. Note, C++ has specific language
support to help avoid most of the problems mentioned (§24.3.4).
[Martin,1995]
Robert C. Martin: Designing Object-Oriented C++ Applications Using the
Booch Method. Prentice-Hall. 1995. ISBN 0-13-203837-4. Shows how
to go from a problem to C++ code in a fairly systematic way. Presents
alternative designs and principles for choosing between them. More practical and more concrete than most books on design. Contains extensive
C++ code examples.
[Parkinson,1957]
C. N. Parkinson: Parkinson’s Law and other Studies in Administration.
Houghton Mifflin. Boston. 1957. One of the funniest and most cutting
descriptions of disasters caused by administrative processes.
[Meyer,1988]
Bertrand Meyer: Object Oriented Software Construction. Prentice Hall.
1988. Pages 1-64 and 323-334 give a good introduction to one view of
object-oriented programming and design with many sound pieces of practical advice. The rest of the book describes the Eiffel language. Tends to
confuse Eiffel with universal principles.
[Shlaer,1988]
S. Shlaer and S. J. Mellor: Object-Oriented Systems Analysis and Object
Lifecycles. Yourdon Press. ISBN 0-13-629023-X and 0-13-629940-7.
Presents a view of analysis, design, and programming that differs strongly
from the one presented here and embodied in C++ and does so using a
vocabulary that makes it sound rather similar.
[Snyder,1986]
Alan Snyder: Encapsulation and Inheritance in Object-Oriented Programming Languages. Proc. OOPSLA’86. Portland, Oregon. Probably the
first good description of the interaction between encapsulation and inheritance. Also provides a nice discussion of some notions of multiple inheritance.
[Wirfs-Brock,1990]
Rebecca Wirfs-Brock, Brian Wilkerson, and Lauren Wiener: Designing
Object-Oriented Software. Prentice Hall. 1990. Describes an anthropomorphic design method based on role playing using CRC (Classes,
Responsibilities, and Collaboration) cards. The text, if not the method
itself, is biased toward Smalltalk.
23.7 Advice [design.advice]
[1] Know what you are trying to achieve; §23.3.
[2] Keep in mind that software development is a human activity; §23.2, §23.5.3.
[3] Proof by analogy is fraud; §23.2.
[4] Have specific and tangible aims; §23.4.
[5] Don’t try technological fixes for sociological problems; §23.4.
[6] Consider the longer term in design and in the treatment of people; §23.4.1, §23.5.3.
[7] There is no lower limit to the size of programs for which it is sensible to design before starting to code; §23.2.
[8] Design processes to encourage feedback; §23.4.
[9] Don’t confuse activity for progress; §23.3, §23.4.
[10] Don’t generalize beyond what is needed, what you have direct experience with, and what can be tested; §23.4.1, §23.4.2.
[11] Represent concepts as classes; §23.4.2, §23.4.3.1.
[12] There are properties of a system that should not be represented as a class; §23.4.3.1.
[13] Represent hierarchical relationships between concepts as class hierarchies; §23.4.3.1.
[14] Actively search for commonality in the concepts of the application and implementation and represent the resulting more general concepts as base classes; §23.4.3.1, §23.4.3.5.
[15] Classifications in other domains are not necessarily useful classifications in an inheritance model for an application; §23.4.3.1.
[16] Design class hierarchies based on behavior and invariants; §23.4.3.1, §23.4.3.5, §24.3.7.1.
[17] Consider use cases; §23.4.3.1.
[18] Consider using CRC cards; §23.4.3.1.
[19] Use existing systems as models, as inspiration, and as starting points; §23.4.3.6.
[20] Beware of viewgraph engineering; §23.4.3.1.
[21] Throw a prototype away before it becomes a burden; §23.4.4.
[22] Design for change, focusing on flexibility, extensibility, portability, and reuse; §23.4.2.
[23] Focus on component design; §23.4.3.
[24] Let each interface represent a concept at a single level of abstraction; §23.4.3.1.
[25] Design for stability in the face of change; §23.4.2.
[26] Make designs stable by making heavily-used interfaces minimal, general, and abstract; §23.4.3.2, §23.4.3.5.
[27] Keep it small. Don’t add features ‘‘just in case;’’ §23.4.3.2.
[28] Always consider alternative representations for a class. If no alternative representation is plausible, the class is probably not representing a clean concept; §23.4.3.4.
[29] Repeatedly review and refine both the design and the implementation; §23.4, §23.4.3.
[30] Use the best tools available for testing and for analyzing the problem, the design, and the
implementation; §23.3, §23.4.1, §23.4.4.
[31] Experiment, analyze, and test as early as possible and as often as possible; §23.4.4, §23.4.5.
[32] Don’t forget about efficiency; §23.4.7.
[33] Keep the level of formality appropriate to the scale of the project; §23.5.2.
[34] Make sure that someone is in charge of the overall design; §23.5.2.
[35] Document, market, and support reusable components; §23.5.1.
[36] Document aims and principles as well as details; §23.4.6.
[37] Provide tutorials for new developers as part of the documentation; §23.4.6.
[38] Reward and encourage reuse of designs, libraries, and classes; §23.5.1.
24
Design and Programming
Keep it simple:
as simple as possible,
but no simpler.
– A. Einstein
Design and programming language — classes — inheritance — type checking — programming — what do classes represent? — class hierarchies — dependencies — containment — containment and inheritance — design tradeoffs — use relationships —
programmed-in relationships — invariants — assertions — encapsulation — components — templates — interfaces and implementations — advice.
24.1 Overview [lang.overview]
This chapter considers the ways programming languages in general and C++ in particular can support design:
§24.2 The fundamental role of classes, class hierarchies, type checking, and programming itself
§24.3 Uses of classes and class hierarchies, focussing on dependencies between different parts
of a program
§24.4 The notion of a component, which is the basic unit of design, and some practical observations about how to express interfaces
More general design issues are found in Chapter 23, and the various uses of classes are discussed in
more detail in Chapter 25.
24.2 Design and Programming Language [lang.intro]
If I were to build a bridge, I would seriously consider what material to build it out of. Also, the
design of the bridge would be heavily influenced by the choice of material and vice versa. Reasonable designs for stone bridges differ from reasonable designs for steel bridges, from reasonable
designs for wooden bridges, etc. I would not expect to be able to select the proper material for a
bridge without knowing a bit about the various materials and their uses. Naturally, you don’t have
to be an expert carpenter to design a wooden bridge, but you do have to know the fundamentals of
wooden constructions to choose between wood and iron as the material for a bridge. Furthermore,
even though you don’t personally have to be an expert carpenter to design a wooden bridge, you do
need quite a detailed knowledge of the properties of wood and the mores of carpenters.
The analogy is that to choose a language for some software, you need knowledge of several languages, and to design a piece of software successfully, you need a fairly detailed knowledge of the
chosen implementation language – even if you never personally write a single line of that software.
The good bridge designer respects the properties of materials and uses them to enhance the design.
Similarly, the good software designer builds on the strengths of the implementation language and –
as far as possible – avoids using it in ways that cause problems for implementers.
One might think that this sensitivity to language issues comes naturally when only a single
designer/programmer is involved. However, even in such cases the programmer can be seduced
into misusing the language due to inadequate experience or undue respect for styles of programming established for radically different languages. When the designer is different from the programmer – and especially if they do not share a common culture – the likelihood of introducing
error, inelegance, and inefficiencies into the resulting system approaches certainty.
So what can a programming language do for a designer? It can provide features that allow the
fundamental notions of the design to be represented directly in the programming language. This
eases the implementation, makes it easier to maintain the correspondence between the design and
the implementation, enables better communication between designers and implementers, and
allows better tools to be built to support both designers and implementers.
For example, most design methods are concerned about dependencies between different parts of
a program (usually to minimize them and to ensure that they are well defined and understood). A
language that supports explicit interfaces between parts of a program can support such design
notions. It can guarantee that only the expected dependencies actually exist. Because many dependencies are explicit in code written in such a language, tools that read a program to produce charts
of dependencies can be provided. This eases the job of designers and others that need to understand the structure of a program. A programming language such as C++ can be used to decrease the
gap between design and program and consequently reduce the scope for confusion and misunderstandings.
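To make the point about explicit interfaces concrete, here is a small sketch. All names in it (Dictionary, Map_dictionary, count_known()) are assumptions made for this illustration. The client function mentions only the interface, so its one dependency is stated in its declaration and can be found by a tool, or a reader, without examining any implementation.

```cpp
#include <map>
#include <string>

// Interface: the only thing client code is meant to depend on
// (a hypothetical example, not a standard library facility).
class Dictionary {
public:
    virtual void insert(const std::string& key, int value) = 0;
    virtual bool lookup(const std::string& key, int& value) const = 0;
    virtual ~Dictionary() {}
};

// One possible implementation. Clients never need to name this type.
class Map_dictionary : public Dictionary {
public:
    void insert(const std::string& key, int value) { m[key] = value; }

    bool lookup(const std::string& key, int& value) const
    {
        std::map<std::string, int>::const_iterator p = m.find(key);
        if (p == m.end()) return false;
        value = p->second;
        return true;
    }
private:
    std::map<std::string, int> m;
};

// Client code: it depends on Dictionary and on nothing else,
// and the compiler rejects any use of members not in the interface.
int count_known(const Dictionary& d, const char* const* words, int n)
{
    int known = 0;
    int ignored = 0;   // the value found is not needed here
    for (int i = 0; i < n; ++i)
        if (d.lookup(words[i], ignored)) ++known;
    return known;
}
```

Replacing Map_dictionary with a different implementation requires no change to count_known(), which is the kind of guarantee about dependencies that the design notion asks for.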
The key notion of C++ is that of a class. A C++ class is a type. Together with namespaces,
classes are also a primary mechanism for information hiding. Programs can be specified in terms
of user-defined types and hierarchies of such user-defined types. Both built-in and user-defined
types obey statically checked type rules. Virtual functions provide a mechanism for run-time binding without breaking the static type rules. Templates support the design of parameterized types.
Exceptions provide a way of making error handling more regular. These C++ features can be used
without incurring overhead compared to C programs. These are the first-order properties of C++
that must be understood and considered by a designer. In addition, generally available major
libraries – such as matrix libraries, database interfaces, graphical user interface libraries, and concurrency support libraries – can strongly affect design choices.
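The following sketch shows several of these first-order properties together in a few lines. The names Shape, Circle, Square, and Stack are assumptions made for this illustration. The choice of the standard exceptions length_error and out_of_range is also just one reasonable option.

```cpp
#include <cstddef>     // std::size_t
#include <stdexcept>   // std::length_error, std::out_of_range

// A user-defined type with a virtual function: which name() is called
// is decided at run time, yet every call is type checked at compile time.
class Shape {
public:
    virtual const char* name() const = 0;
    virtual ~Shape() {}
};

class Circle : public Shape {
public:
    const char* name() const { return "circle"; }
};

class Square : public Shape {
public:
    const char* name() const { return "square"; }
};

// A parameterized type: one definition serves every element type T
// and every capacity N, with no loss of static type checking.
template<class T, std::size_t N>
class Stack {
public:
    Stack() : top(0) {}

    void push(const T& x)
    {
        // Errors are reported by exceptions rather than by return codes.
        if (top == N) throw std::length_error("Stack::push: stack is full");
        v[top++] = x;
    }

    T pop()
    {
        if (top == 0) throw std::out_of_range("Stack::pop: stack is empty");
        return v[--top];
    }

    std::size_t size() const { return top; }

private:
    T v[N];
    std::size_t top;   // number of elements currently held
};
```

For example, a Stack<const Shape*,2> accepts pointers to Circle and Square objects, rejects an int at compile time, and throws length_error on a third push().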
Fear of novelty sometimes leads to sub-optimal use of C++. So does misapplication of lessons
from other languages, systems, and application areas. Poor design tools can also warp designs.
Five ways designers fail to take advantage of language features and fail to respect limitations are
worth mentioning:
[1] Ignore classes and express the design in a way that constrains implementers to use the C
subset only.
[2] Ignore derived classes and virtual functions and use only the data abstraction subset.
[3] Ignore the static type checking and express the design in such a way that implementers are
constrained to simulate dynamic type checking.
[4] Ignore programming and express systems in a way that aims to eliminate programmers.
[5] Ignore everything except class hierarchies.
These variants are typical for designers with
[1] a C, traditional CASE, or structured design background,
[2] an Ada83, Visual Basic, or data abstraction background,
[3] a Smalltalk or Lisp background,
[4] a nontechnical or very specialized background,
[5] a background with heavy emphasis on ‘‘pure’’ object-oriented programming,
respectively. In each case, one must wonder if the implementation language was well chosen, if the
design method was well chosen, or if the designer had failed to adapt to the tool in hand.
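Variant [3] may be the least obvious of the five, so a small sketch of what ‘‘simulating dynamic type checking’’ looks like in code may help. The names Object, Ship, and Apple and both functions are assumptions made for this illustration.

```cpp
#include <cstddef>   // std::size_t
#include <vector>

class Object {                 // hypothetical ‘‘universal’’ base class
public:
    virtual ~Object() {}
};

class Ship : public Object {
public:
    explicit Ship(int crew) : n(crew) {}
    int crew() const { return n; }
private:
    int n;
};

class Apple : public Object {};

// A design that ignores static typing: a container may hold anything,
// so every use must test the type at run time and decide what to do
// when the test fails.
int total_crew_dynamic(const std::vector<Object*>& v)
{
    int sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (const Ship* s = dynamic_cast<const Ship*>(v[i]))
            sum += s->crew();
        // else: not a Ship; here it is silently skipped
    }
    return sum;
}

// A design that uses static typing: the container can hold only Ships,
// so a misplaced Apple is a compile-time error and no test is needed.
int total_crew_static(const std::vector<Ship*>& v)
{
    int sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        sum += v[i]->crew();
    return sum;
}
```

Both functions compute the same result for well-formed input. The difference is where a mistake is detected: by the compiler in the second version, and by the user, if at all, in the first.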
There is nothing unusual or shameful in such a mismatch. It is simply a mismatch that delivers
sub-optimal designs and imposes unnecessary burdens on programmers. It does the same to
designers when the conceptual framework of the design method is noticeably poorer than C++’s
conceptual framework. Therefore, we avoid such mismatches wherever possible.
The following discussion is phrased as answers to objections because that is the way it often
occurs in real life.
24.2.1 Ignoring Classes [lang.ignore.class]
Consider design that ignores classes. The resulting C++ program will be roughly equivalent to the
C program that would have resulted from the same design process – and this program would again
be roughly equivalent to the COBOL program that would have resulted from the same design process. In essence, the design has been made ‘‘programming language independent’’ at the cost of
forcing the programmer to code in the common subset of C and COBOL. This approach does have
advantages. For example, the strict separation of data and code that results makes it easy to use traditional databases that are designed for such programs. Because a minimal programming language
is used, it would appear that less skill – or at least different skills – would be required from programmers. For many applications – say, a traditional sequential database update program – this
way of thinking is quite reasonable, and the traditional techniques developed over decades are adequate for the job.
However, suppose the application differs sufficiently from traditional sequential processing of
records (or characters) or the complexity involved is higher – say, in an interactive CASE system.
The lack of language support for data abstraction implied by the decision to ignore classes will
hurt. The inherent complexity will show up in the application somewhere, and if the system is
implemented in an impoverished language, the code will not reflect the design directly. The program will have too many lines of source code, will lack type checking, and will in general not be amenable to tools. This is the prescription for a maintenance nightmare.
A common band-aid for this problem is to build specific tools to support the notions of the
design method. These tools then provide higher-level constructs and checking to compensate for
deficiencies of the (deliberately impoverished) implementation language. Thus, the design method
becomes a special-purpose and typically corporate-owned programming language. Such programming languages are in most contexts poor substitutes for a widely available, general-purpose programming language supported by suitable design tools.
The most common reason for ignoring classes in design is simple inertia. Traditional programming languages don’t support the notion of a class, and traditional design techniques reflect this
deficiency. The most common focus of design has been the decomposition of the problems into a
set of procedures performing required actions. This notion, called procedural programming in
Chapter 2, is in the context of design often called functional decomposition. A common question
is, ‘‘Can we use C++ together with a design method based on functional decomposition?’’ You
can, but you will most likely end up using C++ as simply a better C and will suffer the problems
mentioned previously. This may be acceptable in a transition period, for already completed
designs, and for subsystems in which classes do not appear to offer significant benefits (given the
experience of the individuals involved at this time). For the longer term and in general, however,
the policy against large-scale use of classes implied by functional decomposition is not compatible
with effective use of C++ or any other language that has support for abstraction.
The procedure-oriented and object-oriented views of programming are fundamentally different
and typically lead to radically different solutions to the same problem. This observation is as true
for the design phase as it is for the implementation phase: you can focus the design on the actions
taken or on the entities represented, but not simultaneously on both.
So why prefer ‘‘object-oriented design’’ over the traditional design methods based on functional decomposition? A first-order answer is that functional decomposition leads to insufficient
data abstraction. From this, it follows that the resulting design is
– less resilient to change,
– less amenable to tools,
– less suited for parallel development, and
– less suited for concurrent execution.
The problem is that functional decomposition causes interesting data to become global because
when a system is structured as a tree of functions, any data accessed by two functions must be global to both. This ensures that ‘‘interesting’’ data bubbles up toward the root of the tree as more and
more functions require access to it (as ever in computing, trees grow from the root down). Exactly
the same process can be seen in single-rooted class hierarchies, in which ‘‘interesting’’ data and
functions tend to bubble up toward a root class (§24.4). Focussing on the specification of classes
and the encapsulation of data addresses this problem by making the dependencies between different
parts of a program explicit and tractable. More important, though, it reduces the number of dependencies in a system by improving locality of reference to data.
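The contrast can be sketched in a few lines. The example below is only an illustration under assumed names (an account balance, `procedural::deposit()`, and a class `Account` are hypothetical and do not come from the text): in the first version, data shared by two functions must be visible to both; in the second, every dependency on the data goes through a small, explicit interface.

```cpp
#include <stdexcept>

// Functional decomposition: data accessed by two functions must be
// visible to both, so it drifts toward global scope.
namespace procedural {
    long balance = 0;                                   // any function can modify this
    void deposit(long amount)  { balance += amount; }
    void withdraw(long amount) { balance -= amount; }   // nothing prevents an overdraft
}

// Class-based design: the data is private, and the only dependencies
// on it are the member functions listed in the interface.
class Account {
public:
    Account() : bal(0) {}
    void deposit(long amount)
    {
        if (amount < 0) throw std::invalid_argument("negative deposit");
        bal += amount;
    }
    void withdraw(long amount)
    {
        if (amount < 0 || amount > bal) throw std::invalid_argument("bad withdrawal");
        bal -= amount;
    }
    long balance() const { return bal; }
private:
    long bal;   // only members of Account can touch this
};
```

With `Account`, a change to the representation of the balance affects three member functions; with the procedural version, it potentially affects every function in the program.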
However, some problems are best solved by writing a set of procedures. The point of an
‘‘object-oriented’’ approach to design is not that there should never be any nonmember functions in
a program or that no part of a system may be procedure-oriented. Rather, the key point is to decouple different parts of a program to better reflect the concepts of the application. Typically, that is
best done when classes, not functions, are the primary focus of the design effort. The use of a procedural style should be a conscious decision and not simply a default. Both classes and procedures
should be used appropriately relative to the application and not just as artifacts of an inflexible
design method.
24.2.2 Avoiding Inheritance [lang.avoid.hier]
Consider design that avoids inheritance. The resulting programs simply fail to take advantage of a
key C++ feature, while still reaping many benefits of C++ compared to C, Pascal, Fortran, COBOL,
etc. Common reasons for doing this – apart from inertia – are claims that ‘‘inheritance is an implementation detail,’’ ‘‘inheritance violates information hiding,’’ and ‘‘inheritance makes cooperation
with other software harder.’’
Considering inheritance merely an implementation detail ignores the way that class hierarchies
can directly model key relationships between concepts in the application domain. Such relationships should be explicit in the design to allow designers to reason about them.
A strong case can be made for excluding inheritance from the parts of a C++ program that must
interface directly with code written in other languages. This is, however, not a sufficient reason for
avoiding the use of inheritance throughout a system; it is simply a reason for carefully specifying
and encapsulating a program’s interface to ‘‘the outer world.’’ Similarly, worries about compromising information hiding through the use of inheritance (§24.3.2.1) are a reason to be careful with
the use of virtual functions and protected members (§15.3). They are not a reason for general
avoidance.
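One way of encapsulating a program's interface to ‘‘the outer world’’ is sketched below. It is an assumption-laden illustration, not a prescribed technique: the names `Shape`, `Rectangle`, `shape_handle`, and the `shape_*` functions are hypothetical. Inside the program, an ordinary class hierarchy is used; callers written in C or another language see only plain functions and an opaque handle.

```cpp
// Inside the C++ program: an ordinary class hierarchy.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};

class Rectangle : public Shape {
public:
    Rectangle(double w, double h) : width(w), height(h) {}
    double area() const { return width * height; }
private:
    double width, height;
};

// The interface to "the outer world": functions with C linkage and an
// opaque handle. No class, base class, or virtual function is visible
// to the caller.
extern "C" {
    typedef struct shape_handle shape_handle;   // never defined; opaque to callers

    shape_handle* shape_make_rectangle(double w, double h)
    {
        Shape* p = new Rectangle(w, h);
        return reinterpret_cast<shape_handle*>(p);
    }

    double shape_area(const shape_handle* h)
    {
        return reinterpret_cast<const Shape*>(h)->area();   // virtual call stays inside C++
    }

    void shape_destroy(shape_handle* h)
    {
        delete reinterpret_cast<Shape*>(h);
    }
}
```

A production version would also have to keep exceptions (for example, from `new`) from propagating across the language boundary; that is omitted here for brevity.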
In many cases, there is no real advantage to be gained from inheritance. However, in a large
project a policy of ‘‘no inheritance’’ will result in a less comprehensible and less flexible system in
which inheritance is ‘‘faked’’ using more traditional language and design constructs. Further, I
suspect that despite such a policy, inheritance will eventually be used anyway because C++ programmers will find convincing arguments for inheritance-based designs in various parts of the system. Therefore, a ‘‘no inheritance’’ policy will ensure only that a coherent overall architecture will
be missing and will restrict the use of class hierarchies to specific subsystems.
In other words, keep an open mind. Class hierarchies are not an essential part of every good
program, but in many cases they can help in both the understanding of the application and the
expression of a solution. The fact that inheritance can be misused and overused is a reason for caution; it is not a reason for prohibition.
24.2.3 Ignoring Static Type Checking [lang.type]
Consider design that ignores static type checking. Commonly stated reasons to ignore static type
checking in the design phase are that ‘‘types are an artifact of the programming language,’’ that ‘‘it
is more natural to think about objects without bothering about types,’’ and that ‘‘static type checking forces us to think about implementation issues too early.’’ This attitude is fine as far as it goes
and harmless up to a point. It is reasonable to ignore details of type checking in the design stage,
and it is often safe to ignore type issues almost completely in the analysis stage and early design
stages. However, classes and class hierarchies are very useful in the design. In particular, they
allow us to be specific about concepts, allow us to be precise about their relationships, and help us
reason about the concepts. As the design progresses, this precision takes the form of increasingly
precise statements about classes and their interfaces.
It is important to realize that precisely-specified and strongly-typed interfaces are a fundamental
design tool. C++ was designed with this in mind. A strongly-typed interface ensures (up to a
point) that only compatible pieces of software can be compiled and linked together and thus allows
these pieces of software to make relatively strong assumptions about each other. These assumptions are guaranteed by the type system. The effect of this is to minimize the use of run-time tests,
thus promoting efficiency and causing significant reductions in the integration phase of multiperson
projects. In fact, strong positive experience with integrating systems that provide strongly-typed
interfaces is the reason integration isn’t a major topic of this chapter.
Consider an analogy. In the physical world, we plug gadgets together all the time, and a seemingly infinite number of standards for plugs exists. The most obvious thing about these plugs is
that they are specifically designed to make it impossible to plug two gadgets together unless the
gadgets were designed to be plugged together, and then they can be connected only in the right
way. You cannot plug an electric shaver into a high-power socket. Had you been able to, you
would have ended up with a fried shaver or a fried shavee. Much ingenuity is expended on ensuring that incompatible pieces of hardware cannot be plugged together. The alternative to using
many incompatible plugs is gadgets that protect themselves against undesirable behavior from gadgets plugged into their sockets. A surge protector is a good example of this. Because perfect compatibility cannot be guaranteed at the ‘‘plug compatibility level,’’ we occasionally need the more
expensive protection of circuitry that dynamically adapts to and/or protects from a range of inputs.
The analogy is almost exact. Static type checking is equivalent to plug compatibility, and
dynamic checking corresponds to protection/adaptation circuitry. If both checks fail – in either the
physical world or the software world – serious damage can result. In large systems, both forms of
checking are used. In the early stages of a design, it may be reasonable simply to say, ‘‘These two
gadgets should be plugged together.’’ However, it soon becomes relevant exactly how they should
be plugged together. What guarantees does the plug provide about behavior? What error conditions are possible? What are the first-order cost estimates?
The use of ‘‘static typing’’ is not limited to the physical world. The use of units (for example,
meters, kilograms, and seconds) to prevent the mixing of incompatible entities is pervasive in physics and engineering.
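The same idea can be expressed directly in the C++ type system. The following is a minimal sketch under assumed names (`Meters`, `Seconds`, `Meters_per_second`, and `average_speed()` are hypothetical): each unit is a distinct type, and only meaningful combinations are given operators, so mixing incompatible quantities is a compile-time error.

```cpp
// Distinct types for distinct physical quantities.
class Meters {
public:
    explicit Meters(double v) : val(v) {}
    double value() const { return val; }
private:
    double val;
};

class Seconds {
public:
    explicit Seconds(double v) : val(v) {}
    double value() const { return val; }
private:
    double val;
};

class Meters_per_second {
public:
    explicit Meters_per_second(double v) : val(v) {}
    double value() const { return val; }
private:
    double val;
};

// Only meaningful combinations are provided.
inline Meters operator+(Meters a, Meters b)
{
    return Meters(a.value() + b.value());
}

inline Meters_per_second operator/(Meters d, Seconds t)
{
    return Meters_per_second(d.value() / t.value());
}

inline Meters_per_second average_speed(Meters distance, Seconds time)
{
    return distance / time;
}

// average_speed(Seconds(10), Meters(100));   // error: arguments in the wrong order
// Meters(100) + Seconds(10);                 // error: no operator+ for Meters and Seconds
```

The constructors are `explicit` so that a plain `double` cannot silently become a quantity of some unit.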
In the description of the design steps in §23.4.3, type information enters the picture in Step 2
(presumably after being superficially considered in Step 1) and becomes a major issue in Step 4.
Statically-checked interfaces are the prime vehicle for ensuring cooperation between C++ software developed by different groups. The documentation of these interfaces (including the exact
types involved) is the primary means of communication between separate groups of programmers.
These interfaces are one of the most important outputs of the design process and a focus of communication between designers and programmers.
Ignoring type issues when considering interfaces leads to designs that obscure the structure of
the program and postpone error detection until run time. For example, an interface can be specified
in terms of self-identifying objects:
// Example assuming dynamic type checking instead of static checking:
Stack s; // Stack can hold pointers to objects of any type
void f()
{
    s.push(new Saab900);
    s.push(new Saab37B);
    s.pop()->takeoff(); // fine: a Saab 37B is a plane
    s.pop()->takeoff(); // run-time error: car cannot take off
}
This is a severe underspecification of the interface (of Stack::push()) that forces dynamic checking rather than static checking. The stack s is meant to hold Planes, but that was left implicit in the
code, so it becomes the user’s obligation to make sure the requirement is upheld.
A more precise specification – a template plus virtual functions rather than unconstrained
dynamic type checking – moves error detection from run time to compile time:
Stack<Plane*> s; // Stack can hold pointers to Planes
void f()
{
    s.push(new Saab900); // error: a Saab900 is not a Plane
    s.push(new Saab37B);
    s.pop()->takeoff(); // fine: a Saab 37B is a plane
    s.pop()->takeoff();
}
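For readers who want to try the statically checked variant, here is a self-contained sketch. The `Stack` template below is a minimal stand-in built on `std::vector`, and the member `takeoff()` returning a string, the empty class `Saab900`, and the function `fly_one()` are assumptions made for illustration only.

```cpp
#include <string>
#include <vector>

class Plane {
public:
    virtual ~Plane() {}
    virtual std::string takeoff() = 0;
};

class Saab37B : public Plane {
public:
    std::string takeoff() { return "Saab 37B airborne"; }
};

class Saab900 { };   // a car: deliberately unrelated to Plane

// A minimal stand-in for the Stack template assumed in the text.
template<class T> class Stack {
public:
    void push(T x) { elems.push_back(x); }
    T pop()
    {
        T x = elems.back();   // precondition: the stack is not empty
        elems.pop_back();
        return x;
    }
    bool empty() const { return elems.empty(); }
private:
    std::vector<T> elems;
};

std::string fly_one()
{
    Stack<Plane*> s;          // can hold pointers to Planes only
    s.push(new Saab37B);
    // s.push(new Saab900);   // compile-time error: a Saab900* is not a Plane*
    Plane* p = s.pop();
    std::string result = p->takeoff();
    delete p;
    return result;
}
```

Removing the comment marker from the `Saab900` line makes the compiler reject the program, which is exactly the error that the untyped stack would have postponed until run time.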
A similar point is made in §16.2.2. The difference in run time between dynamic checking and
static checking can be significant. The overhead of dynamic checking is usually a factor in the
range of 3 to 10.
One should not go to the other extreme, though. It is not possible to catch all errors by static
checking. For example, even the most thoroughly statically checked program is vulnerable to hardware failures. See also §25.4.1 for an example where complete static checking would be infeasible.
However, the ideal is to have the vast majority of interfaces be statically typed with application-level types; see §24.4.2.
Another problem is that a design can be perfectly reasonable in the abstract but can cause serious trouble because it fails to take into account limitations of a basic tool, in this case C++. For
example, a function f() that needs to perform an operation turn_right() on an argument can do so
only provided all of its arguments are of a common type:
class Plane {
    // ...
    void turn_right();
};
class Car {
    // ...
    void turn_right();
};
void f(X* p) // what type should X be?
{
    p->turn_right();
    // ...
}
Some languages (such as Smalltalk and CLOS) allow two types to be used interchangeably if they
have the same operations by relating every type through a common base and postponing name resolution until run time. However, C++ (intentionally) supports this notion through templates and
compile-time resolution only. A non-template function can accept arguments of two types only if
the two types can be implicitly converted to a common type. Thus, in the previous example X must
be a common base of Plane and Car (e.g., a Vehicle class).
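Both alternatives can be sketched side by side. In this illustration the return type `std::string`, the class `Boat`, and the function template `g()` are assumptions added so that the behavior can be observed; they are not part of the example in the text.

```cpp
#include <string>

class Vehicle {
public:
    virtual ~Vehicle() {}
    virtual std::string turn_right() = 0;
};

class Plane : public Vehicle {
public:
    std::string turn_right() { return "plane banks right"; }
};

class Car : public Vehicle {
public:
    std::string turn_right() { return "car steers right"; }
};

// Non-template function: X is the common base Vehicle, and the call is
// resolved at run time through the virtual function.
std::string f(Vehicle* p) { return p->turn_right(); }

// An unrelated type that merely happens to offer the same operation.
class Boat {
public:
    std::string turn_right() { return "boat veers right"; }
};

// Template function: accepts a pointer to any type that has a
// turn_right() member; the call is resolved at compile time for each T.
template<class T> std::string g(T* p) { return p->turn_right(); }

// f(new Boat);   // error: a Boat* cannot be converted to a Vehicle*
```

A `Boat` can be passed to `g()` but not to `f()`: the template requires only that the operation exist, whereas the non-template function requires the declared relationship to `Vehicle`.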
Typically, examples inspired by notions alien to C++ can be mapped into C++ by expressing the
assumptions explicitly. For example, given Plane and Car (without a common base), we can still
create a class hierarchy that allows us to pass an object containing a Car or a Plane to f(X*)
(§25.4.1). However, doing this often requires an undesirable amount of mechanism and cleverness.
Templates are often a useful tool for such concept mappings. A mismatch between design notions
and C++ typically leads to ‘‘unnatural-looking’’ and inefficient code. Maintenance programmers
tend to dislike the non-idiomatic code that arises from such mismatches.
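One possible shape of such a mapping is sketched below; it is not necessarily the technique of §25.4.1. The interface class `Turnable` and the wrapper template `Turnable_for` are hypothetical names, and `turn_right()` returns a string only so that the effect is visible. The assumption ‘‘can turn right’’ is stated explicitly as an interface, and a template generates one wrapper class for each unrelated type.

```cpp
#include <string>

// Existing classes without a common base (assumed not to be modifiable).
class Plane {
public:
    std::string turn_right() { return "plane banks right"; }
};

class Car {
public:
    std::string turn_right() { return "car steers right"; }
};

// The assumption "can turn right" made explicit as an interface.
class Turnable {
public:
    virtual ~Turnable() {}
    virtual std::string turn_right() = 0;
};

// A template generates one wrapper class per wrapped type.
template<class T> class Turnable_for : public Turnable {
public:
    explicit Turnable_for(T& t) : obj(t) {}
    std::string turn_right() { return obj.turn_right(); }
private:
    T& obj;   // refers to, but does not own, the wrapped object
};

// f() can now be written against a single, explicitly declared type.
std::string f(Turnable* p) { return p->turn_right(); }
```

Even this small sketch needs an extra class, an extra template, and a lifetime rule for the wrapped object, which illustrates the amount of mechanism that such mappings tend to require.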
A mismatch between the design technique and the implementation language can be compared to
word-for-word translation between natural languages. For example, English with German grammar
is as awkward as German with English grammar, and both can be close to incomprehensible to
someone fluent in only one of those languages.
Classes in a program are the concrete representation of the concepts of the design. Consequently, obscuring the relationships between the classes obscures the fundamental concepts of the
design.
24.2.4 Avoiding Programming [lang.prog]
Programming is costly and unpredictable compared to many other activities, and the resulting code
is often less than 100% reliable. Programming is labor-intensive and – for a variety of reasons –
most serious project delays manifest themselves by code not being ready to ship. So, why not eliminate programming as an activity altogether?
To many managers, getting rid of the arrogant, undisciplined, over-paid, technology-obsessed,
improperly-dressed, etc. programmers† would appear to be a significant added benefit. To a programmer, this suggestion may sound absurd. However, important problem areas with realistic
alternatives to traditional programming do exist. For specific areas, it is possible to generate code
directly from a high-level specification. In other areas, code can be generated by manipulating
shapes on a screen. For example, useful user interfaces can be constructed by direct manipulation
in a tiny fraction of the time it would take to construct the same interface by writing traditional
code. Similarly, database layouts and the code for accessing data according to such layouts can be
generated from specifications that are far simpler than the code needed to express those operations
directly in C++ or in any other general-purpose programming language. State machines that are
smaller, faster, and more correct than most programmers could produce can be generated from
specifications or by a direct manipulation interface.
__________________
† Yes, I’m a programmer.
These techniques work well in specific areas where there is either a sound theoretical foundation (e.g., math, state machines, and relational databases) or where a general framework exists into
which small application fragments can be embedded (e.g., graphical user interfaces, network simulations, and database schema). The obvious usefulness of these techniques in limited – and typically crucial – areas can tempt people to think that the elimination of traditional programming by
these techniques is ‘‘just around the corner.’’ It is not. The reason is that expanding specification
techniques outside areas with sound theoretical frameworks implies that the complexity of a
general-purpose programming language would be needed in the specification language. This
defeats the purpose of a clean and well-founded specification language.
It is sometimes forgotten that the framework that allows elimination of traditional programming
in an area is a system or library that has been designed, programmed, and tested in the traditional
way. In fact, one popular use of C++ and the techniques described in this book is to design and
build such systems.
A compromise that provides a small fraction of the expressiveness of a general-purpose language is the worst of both worlds when applied outside a restricted application domain. Designers
who stick to a high-level modeling point of view are annoyed by the added complexity and produce
specifications from which horrendous code is produced. Programmers who apply ordinary programming techniques are frustrated by the lack of language support and generate better code only
by excessive effort and by abandoning high-level models.
I see no signs that programming as an activity can be successfully eliminated outside areas that
either have well-founded theoretical bases or in which the basic programming is provided by a
framework. In either case, there is a dramatic drop in the effectiveness of the techniques as one
leaves the original framework and attempts more general-purpose work. Pretending otherwise is
tempting and dangerous. Conversely, ignoring the high-level specification techniques and the
direct-manipulation techniques in domains in which they are well-founded and reasonably mature
would be a folly.
Designing tools, libraries, and frameworks is one of the highest forms of design and programming. Constructing a useful mathematically-based model of an application area is one of the highest forms of analysis. Thus, providing a tool, language, framework, etc., that makes the result of
such work available to thousands is a way for programmers and designers to escape the trap of
becoming craftsmen of one-of-a-kind artifacts.
It is most important that a specification system or a foundation library be able to interface effectively with a general-purpose programming language. Otherwise, the framework provided is inherently limiting. This implies that specification systems and direct-manipulation systems that generate code at a suitable high level into an accepted general-purpose programming language have a
great advantage. A proprietary language is a long-term advantage to its provider only. If the code
generated is so low-level that general code added must be written without the benefits of abstraction, then reliability, maintainability, and economy are lost. In essence, a generation system should
be designed to combine the strengths of higher-level specifications and higher-level programming
languages. To exclude one or the other is to sacrifice the interests of system builders to the interests of tool providers. Successful large systems are multilevel and modular and evolve over time.
Consequently, successful efforts to produce such systems involve a variety of languages, libraries,
tools, and techniques.
24.2.5 Using Class Hierarchies Exclusively [lang.pure]
When we find that something new actually works, we often go a bit overboard and apply it indiscriminately. In other words, a great solution to some problems often appears to be the solution to
almost all problems. Class hierarchies and operations that are polymorphic on their (one) object
provide a great solution to many problems. However, not every concept is best represented as a
part of a hierarchy and not every software component is best represented as a class hierarchy.
Why not? A class hierarchy expresses relationships between its classes and a class represents a
concept. Now what is the common relationship between a smile, the driver for my CD-ROM
reader, a recording of Richard Strauss’ Don Juan, a line of text, a satellite, my medical records, and
a real-time clock? Placing them all in a single hierarchy when their only shared property is that
they are programming artifacts (they are all ‘‘objects’’) is of little fundamental value and can cause
confusion (§15.4.5). Forcing everything into a single hierarchy can introduce artificial similarities
and obscure real ones. A hierarchy should be used only if analysis reveals conceptual commonality
or if design and programming discover useful commonality in the structures used to implement the
concepts. In the latter case, we have to be very careful to distinguish genuine commonality (to be
reflected as subtyping by public inheritance) and useful implementation simplifications (to be
reflected as private inheritance; §24.3.2.1).
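The distinction can be illustrated with a small sketch. All names here (`Shape`, `Square`, `Int_stack`, `total_area()`) are hypothetical. A `Square` genuinely is a `Shape`, so public inheritance expresses subtyping; an `Int_stack` merely reuses the implementation of `std::vector<int>`, so the inheritance is private and the stack cannot be used where a vector is expected.

```cpp
#include <vector>

// Genuine commonality: every Square is a Shape (subtyping).
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};

class Square : public Shape {
public:
    explicit Square(double s) : side(s) {}
    double area() const { return side * side; }
private:
    double side;
};

double total_area(const Shape& a, const Shape& b)
{
    return a.area() + b.area();
}

// Implementation simplification only: an Int_stack is not a vector and
// should not be usable as one, so the base class is private.
class Int_stack : private std::vector<int> {
public:
    void push(int x) { push_back(x); }
    int pop()
    {
        int x = back();   // precondition: the stack is not empty
        pop_back();
        return x;
    }
    using std::vector<int>::empty;   // selectively made public
    using std::vector<int>::size;
};

// std::vector<int>* p = new Int_stack;   // error: vector<int> is an inaccessible base
```

Making the vector a private member instead of a private base would serve equally well here; the point is only that the implementation relationship is kept out of the public interface.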
This line of thinking leads to a program that has several unrelated or weakly-related class hierarchies, each representing a set of closely related concepts. It also leads to the notion of a concrete
class (§25.2) that is not part of a hierarchy because placing such a class in a hierarchy would compromise its performance and its independence of the rest of the system.
To be effective, most critical operations on a class that is part of a class hierarchy must be virtual functions. Furthermore, much of that class’ data must be protected rather than private. This
makes it vulnerable to modification from further derived classes and can seriously complicate testing. Where stricter encapsulation makes sense from a design point of view, non-virtual functions
and private data should be used (§24.3.2.1).
Having one argument of an operation (the one designating ‘‘the object’’) special can lead to
contorted designs. When several arguments are best treated equally, an operation is best represented as a nonmember function. This does not imply that such functions should be global. In fact,
almost all such free-standing functions should be members of a namespace (§24.4).
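A minimal sketch of the point, using hypothetical names (`Geometry`, `Point`, `squared_distance()`): both arguments of the operations below play the same role, so neither is singled out as ‘‘the object.’’ The functions are nonmembers, but they live in the namespace of the class they operate on rather than in the global scope.

```cpp
namespace Geometry {
    class Point {
    public:
        Point(double xx, double yy) : x_(xx), y_(yy) {}
        double x() const { return x_; }
        double y() const { return y_; }
    private:
        double x_, y_;
    };

    // Symmetric in its arguments: a nonmember, but not a global function.
    inline double squared_distance(const Point& a, const Point& b)
    {
        const double dx = a.x() - b.x();
        const double dy = a.y() - b.y();
        return dx * dx + dy * dy;
    }

    inline bool operator==(const Point& a, const Point& b)
    {
        return a.x() == b.x() && a.y() == b.y();
    }
}
```

Because the functions are declared in the same namespace as `Point`, a call such as `squared_distance(p, q)` is found through its argument types without explicit qualification.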
24.3 Classes [lang.class]
The most fundamental notion of object-oriented design and programming is that the program is a
model of some aspects of reality. The classes in the program represent the fundamental concepts of
the application and, in particular, the fundamental concepts of the ‘‘reality’’ being modeled. Real-world objects and artifacts of the implementation are represented by objects of these classes.
The analysis of relationships between classes and within parts of a class is central to the design
of a system:
§24.3.2 Inheritance relationships
§24.3.3 Containment relationships
§24.3.5 Use relationships
§24.3.6 Programmed-in relationships
§24.3.7 Relationships within a class
Because a C++ class is a type, classes and the relationships between classes receive significant support from compilers and are generally amenable to static analysis.
To be relevant in a design, a class doesn’t just have to represent a useful concept; it must also
provide a suitable interface. Basically, the ideal class has a minimal and well-defined dependence
on the rest of the world and presents an interface that exposes the minimal amount of information
necessary to the rest of the world (§24.4.2).
24.3.1 What Do Classes Represent? [lang.what]
There are essentially two kinds of classes in a system:
[1] Classes that directly reflect the concepts in the application domain; that is, concepts that are
used by end-users to describe their problems and solutions.
[2] Classes that are artifacts of the implementation; that is, concepts that are used by the designers and programmers to describe their implementation techniques.
Some of the classes that are artifacts of the implementation may also represent real-world entities.
For example, the hardware and software resources of a system provide good candidates for classes
in an application. This reflects the fact that a system can be viewed from several viewpoints. This
implies that one person’s implementation detail is another person’s application. A well-designed
system will contain classes supporting logically separate views of the system. For example:
[1] Classes representing user-level concepts (e.g., cars and trucks)
[2] Classes representing generalizations of the user-level concepts (e.g., vehicles)
[3] Classes representing hardware resources (e.g., a memory management class)
[4] Classes representing system resources (e.g., output streams)
[5] Classes used to implement other classes (e.g., lists, queues, locks)
[6] Built-in data types and control structures.
In larger systems, keeping logically separate types of classes separate and maintaining separation
between several levels of abstraction becomes a challenge. A simple example can be considered to
have three levels of abstraction:
[1+2] Provide an application level view of the system
[3+4] Represent the machine on which the model runs
[5+6] Represent a low-level (programming language) view of the implementation.
The larger the system, the more levels of abstraction are typically needed for the description of the
system and the more difficult it becomes to define and maintain the levels. Note that such levels of
abstraction have direct counterparts in nature and in other types of human constructions. For example, a house can be considered as consisting of
[1] atoms;
[2] molecules;
[3] lumber and bricks;
[4] floors, walls, and ceilings; and
[5] rooms.
As long as these levels of abstraction are kept separate, you can maintain a coherent view of the
house. However, if you mix them, absurdities arise. For example, the statement, ‘‘My house consists of several thousand pounds of carbon, some complex polymers, about 5,000 bricks, two bathrooms, and 13 ceilings,’’ is silly. Given the abstract nature of software, the equivalent statement
about a complex system is not always recognized for what it is.
The translation of a concept in the application area into a class in a design is not a simple
mechanical operation. It often requires significant insights. Note that the concepts in an application area are themselves abstractions. For example, ‘‘taxpayers,’’ ‘‘monks,’’ and ‘‘employees’’
don’t really exist in nature; such concepts are themselves labels put on individuals to classify them
relative to some system. The real or even the imagined world (literature, especially science fiction)
is sometimes simply a source of ideas for concepts that mutate radically in the transition into
classes. For example, the screen of my PC doesn’t really resemble my desktop despite its being
designed to support the desktop metaphor†, and the windows on my screen bear only the slightest
relation to the contraptions that let drafts into my office. The point about modeling reality is not to
slavishly follow what we see but rather to use it as a starting point for design, a source of inspiration, and an anchor to hold on to when the intangible nature of software threatens to overcome our
ability to understand our programs.
A word of caution: beginners often find it hard to ‘‘find the classes,’’ but that problem is usually soon overcome without long-term ill effects. What often follows, however, is a phase in which
classes – and their inheritance relationships – seem to multiply uncontrollably. This can cause
long-term problems with the complexity, comprehensibility, and efficiency of the resulting program. Not every minute detail needs to be represented by a distinct class, and not every relationship between classes needs to be represented as an inheritance relationship. Try to remember that
the aim of a design is to model a system at an appropriate level of detail and at appropriate levels
of abstraction. Finding a balance between simplicity and generality is not easy.
24.3.2 Class Hierarchies [lang.hier]
Consider simulating the traffic flow of a city to determine the likely times needed for emergency
vehicles to reach their destinations. Clearly, we need to represent cars, trucks, ambulances, fire
engines of various sorts, police cars, busses, etc. Inheritance comes into play because a real-world
concept does not exist in isolation; it exists with numerous relationships to other concepts. Without
understanding these relationships, we cannot understand the concepts. Consequently, a model that
does not represent such relationships does not adequately represent our concepts. That is, in our
programs we need classes to represent concepts, but that is not enough. We also need ways of representing relationships between classes. Inheritance is one powerful way of representing hierarchical relationships directly. In our example, we would probably consider emergency vehicles special
and want also to distinguish between car-like and truck-like vehicles. This would yield a class hierarchy along these lines:
__________________
† I wouldn’t be able to tolerate such a mess on my screen, anyway.
    Vehicle
        Car
            Police_car          (also derived from Emergency)
            Ambulance           (also derived from Emergency)
        Truck
            Fire_engine         (also derived from Emergency)
                Hook_and_ladder
    Emergency
Here, Emergency represents the aspects of an emergency vehicle that are relevant to the simulation:
it can violate some traffic rules, has priority in intersections when on an emergency call, it is under
control of a dispatcher, etc.
Here is the C++ version:
    class Vehicle { /* ... */ };
    class Emergency { /* ... */ };
    class Car : public Vehicle { /* ... */ };
    class Truck : public Vehicle { /* ... */ };
    class Police_car : public Car , protected Emergency { /* ... */ };
    class Ambulance : public Car , protected Emergency { /* ... */ };
    class Fire_engine : public Truck , protected Emergency { /* ... */ };
    class Hook_and_ladder : public Fire_engine { /* ... */ };
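To make the role of the protected Emergency base more concrete, here is a minimal sketch of one branch of the hierarchy. The member names (on_call, dispatch(), stand_down(), must_stop_at_red_light()) and the helper waits_at_red() are assumptions made up for illustration; they are not part of the original example.

```cpp
// Hypothetical sketch: only Vehicle, Emergency, Car, and Police_car come from the text.
class Vehicle {
public:
    virtual ~Vehicle() {}
    // assumed simulation rule: ordinary vehicles always stop at a red light
    virtual bool must_stop_at_red_light() const { return true; }
};

class Emergency {
protected:
    Emergency() : on_call(false) {}
    bool on_call;               // assumed state: currently on an emergency call?
};

class Car : public Vehicle { };

class Police_car : public Car, protected Emergency {
public:
    void dispatch()   { on_call = true; }    // assumed dispatcher interface
    void stand_down() { on_call = false; }
    bool must_stop_at_red_light() const override { return !on_call; }
};

// Code written against Vehicle works for every vehicle in the hierarchy.
bool waits_at_red(const Vehicle& v) { return v.must_stop_at_red_light(); }
```

Because Emergency is a protected base, a Police_car converts implicitly to Car and Vehicle, but code outside the hierarchy cannot treat it as a bare Emergency.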
Inheritance is the highest level relationship that can be represented directly in C++ and the one that
figures largest in the early stages of a design. Often there is a choice between using inheritance to
represent a relationship and using membership. Consider an alternative notion of what it means to
be an emergency vehicle: a vehicle is an emergency vehicle if it displays a flashing light. This
would allow a simplification of the class hierarchy by replacing the Emergency class by a member
in class Vehicle:
    Vehicle { eptr }
        Car
            Police_car
            Ambulance
        Truck
            Fire_engine
                Hook_and_ladder
Class Emergency is now simply used as a member in classes that might need to act as emergency
vehicles:
    class Emergency { /* ... */ };
    class Vehicle { protected: Emergency* eptr; /* ... */ }; // better: provide proper interface to eptr
    class Car : public Vehicle { /* ... */ };
    class Truck : public Vehicle { /* ... */ };
    class Police_car : public Car { /* ... */ };
    class Ambulance : public Car { /* ... */ };
    class Fire_engine : public Truck { /* ... */ };
    class Hook_and_ladder : public Fire_engine { /* ... */ };
Here, a vehicle is an emergency vehicle if Vehicle::eptr is nonzero. The ‘‘plain’’ cars and trucks
are initialized with Vehicle::eptr zero; the others are initialized with Vehicle::eptr nonzero. For
example:
    Car::Car() // Car constructor
    {
        eptr = 0;
    }
    Police_car::Police_car() // Police_car constructor
    {
        eptr = new Emergency;
    }
Defining things this way enables a simple conversion of an emergency vehicle to an ordinary vehicle and vice versa:
    void f(Vehicle* p)
    {
        delete p->eptr;
        p->eptr = 0; // no longer an emergency vehicle
        // ...
        p->eptr = new Emergency; // an emergency vehicle again
    }
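The comment on Vehicle suggests providing a proper interface to eptr rather than exposing the pointer. One possible shape for such an interface is sketched below. The function names is_emergency(), make_emergency(), and make_ordinary() are hypothetical, and the use of std::unique_ptr is a choice made for this sketch (it postdates the original example), so that ownership of the Emergency object is handled automatically.

```cpp
#include <memory>

class Emergency { /* dispatcher link, siren state, ... */ };

class Vehicle {
public:
    virtual ~Vehicle() {}
    bool is_emergency() const { return eptr != nullptr; }
    void make_emergency() { if (!eptr) eptr.reset(new Emergency); }
    void make_ordinary()  { eptr.reset(); }   // deletes the Emergency object, if any
private:
    std::unique_ptr<Emergency> eptr;          // null for an ordinary vehicle
};

class Car : public Vehicle { };

class Police_car : public Car {
public:
    Police_car() { make_emergency(); }        // starts life as an emergency vehicle
};
```

With this interface, the conversions performed by f() become calls to make_ordinary() and make_emergency(), and no code outside Vehicle can leak or double-delete the Emergency object.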
So, which variant of the class hierarchy is best? The general answer is, ‘‘The program that most
directly models the aspects of the real world that we are interested in is the best.’’ That is, in
choosing between models we should aim for greater realism under the inevitable constraints of efficiency and simplicity. In this case, the easy conversion between ordinary vehicles and emergency
vehicles seems unrealistic to me. Fire engines and ambulances are purpose-built vehicles manned
by trained personnel and operated using dispatch procedures requiring specialized communication
equipment. This view indicates that being an emergency vehicle should be a fundamental concept
and represented directly in the program to improve type checking and other uses of tools. Had we
been modeling a place where the roles of vehicles were less firmly defined – say, an area where
private vehicles were routinely used to carry emergency personnel to accident sites and where communication was primarily based on portable radios – the other way of modeling the system might
have been more appropriate.
For people who consider traffic simulations esoteric, it might be worth pointing out that such
tradeoffs between inheritance and membership almost invariably occur in a design. The scrollbar
example in §24.3.3 is an equivalent example.
24.3.2.1 Dependencies within a Class Hierarchy [lang.internal]
Naturally, a derived class depends on its base classes. It is less often appreciated that the opposite
can also be true†. If a class has a virtual function, the class depends on derived classes to implement part of its functionality whenever a derived class overrides that function. If a member of a
base class itself calls one of the class’ virtual functions, then the base class depends on its derived
classes for its own implementation. Similarly, if a class uses a protected member, then it is again
dependent on its derived classes for its own implementation. Consider:
    class B {
        // ...
    protected:
        int a;
    public:
        virtual int f();
        int g() { int x = f(); return x-a; }
    };
What does g() do? The answer critically depends on the definition of f() in some derived class.
Here is a version that will ensure that g() returns 1:
    class D1 : public B {
        int f() { return a+1; }
    };
and a version that makes g() write ‘‘Hello, world!’’ and return 0:
    class D2 : public B {
        int f() { cout<<"Hello, world!\n"; return a; }
    };
This example illustrates one of the most important points about virtual functions. Why is it silly?
Why wouldn’t a programmer ever write something like that? The answer is that a virtual function
is part of an interface to a base class, and that class can supposedly be used without knowledge of
the classes derived from it. Consequently, it must be possible to describe the expected behavior of
an object of the base class in such a way that programs can be written without knowledge of the
derived classes. Every class that overrides the virtual function must implement a variant of that
behavior. For example, the virtual function rotate() of a Shape class rotates a shape. The
rotate() functions for derived classes such as Circle and Triangle must rotate objects of their
respective type; otherwise, a fundamental assumption about class Shape is violated. No such
assumption about behavior is made for class B or its derived classes D1 and D2; thus, the example
is nonsensical. Even the names B, D1, D2, f, and g were chosen to obscure any possible meanings.
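To observe the dependency of B::g() on derived classes directly, the B, D1, and D2 classes can be completed into something that compiles. The initial value of a, the body of B::f(), the virtual destructor, and the helper call_g() are assumptions added for this sketch only.

```cpp
#include <iostream>

class B {
public:
    B() : a(7) {}                         // assumed initial value; any int would do
    virtual ~B() {}
    virtual int f() { return a; }         // assumed default behavior
    int g() { int x = f(); return x - a; }
protected:
    int a;
};

class D1 : public B {
    int f() override { return a + 1; }    // makes g() return 1
};

class D2 : public B {
    int f() override { std::cout << "Hello, world!\n"; return a; }   // makes g() return 0
};

// The caller sees only a B; what g() does is decided by the derived class.
int call_g(B& b) { return b.g(); }
```

Nothing in the declaration of B tells a reader of call_g() which of these behaviors to expect.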
__________________
† This observation has been summarized as: ‘‘Insanity is hereditary. You get it from your children.’’
The specification of the expected behavior of virtual functions is a major focus of class design.
Choosing good names for classes and functions is important – and not always easy.
Is a dependency on unknown (possibly yet unwritten) derived classes good or bad? Naturally,
that depends on the intent of the programmer. If the intent is to isolate a class from all external
influences so that it can be proven to behave in a specific way, then protected members and virtual
functions are best avoided. If, however, the intent is to provide a framework into which a later programmer (such as the same programmer a few weeks later) can add code, then virtual functions are
often an elegant mechanism for achieving this; and protected member functions have proven convenient for supporting such use. This technique is used in the stream I/O library (§21.6) and was
illustrated by the final version of the Ival_box hierarchy (§12.4.2).
If a virtual function is meant to be used only indirectly by a derived class, it can be left private.
For example, consider a simple buffer template:
    template<class T> class Buffer {
    public:
        void put(T); // call overflow(T) if buffer is full
        T get(); // call underflow() if buffer is empty
        // ...
    private:
        virtual int overflow(T);
        virtual int underflow();
        // ...
    };
The put() and get() functions call virtual functions overflow() and underflow(), respectively.
A user can now implement a variety of buffer types to suit a variety of needs by overriding
overflow() and underflow():
    template<class T> class Circular_buffer : public Buffer<T> {
        int overflow(T); // wrap around if full
        int underflow();
        // ...
    };
    template<class T> class Expanding_buffer : public Buffer<T> {
        int overflow(T); // increase buffer size if full
        int underflow();
        // ...
    };
Only if a derived class needed to call overflow() and underflow() directly would these functions
need to be protected rather than private.
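A reduced, compilable sketch of this arrangement follows. It covers only put() and overflow(). The storage (a std::deque), the bool return value of put(), the convention that overflow() returns 0 on success, and the protected helpers drop_oldest() and grow() are all assumptions introduced here rather than details from the text.

```cpp
#include <cstddef>
#include <deque>

template<class T> class Buffer {
public:
    explicit Buffer(std::size_t cap) : cap_(cap) {}
    virtual ~Buffer() {}

    // Stores x; if the buffer is full, asks overflow() to make room first.
    bool put(T x)
    {
        if (data_.size() == cap_ && overflow(x) != 0) return false;
        data_.push_back(x);
        return true;
    }
    std::size_t size() const { return data_.size(); }
    std::size_t capacity() const { return cap_; }
    T front() const { return data_.front(); }   // oldest element; buffer must not be empty

protected:
    // Services offered to derived classes for implementing overflow().
    void drop_oldest() { data_.pop_front(); }
    void grow(std::size_t n) { cap_ += n; }

private:
    virtual int overflow(T) { return -1; }      // default: refuse the element
    std::deque<T> data_;
    std::size_t cap_;
};

template<class T> class Circular_buffer : public Buffer<T> {
public:
    explicit Circular_buffer(std::size_t cap) : Buffer<T>(cap) {}
private:
    int overflow(T) override                    // wrap around if full
    {
        if (this->size() == 0) return -1;       // zero capacity: nothing to overwrite
        this->drop_oldest();
        return 0;
    }
};

template<class T> class Expanding_buffer : public Buffer<T> {
public:
    explicit Expanding_buffer(std::size_t cap) : Buffer<T>(cap) {}
private:
    int overflow(T) override                    // increase buffer size if full
    {
        this->grow(this->capacity() + 1);
        return 0;
    }
};
```

Users call only put(); overflow() stays private in every class, yet each derived class decides what a full buffer means.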
24.3.3 Containment Relationships [lang.contain]
Where containment is used, there are two major alternatives for representing an object of a class X:
[1] Declare a member of type X.
[2] Declare a member of type X* or type X&.
If the value of the pointer is never changed, these alternatives are equivalent, except for efficiency
issues and the way you write constructors and destructors:
    class X {
    public:
        X(int);
        // ...
    };
    class C {
        X a;
        X* p;
        X& r;
    public:
        C(int i, int j, int k) : a(i), p(new X(j)), r(*new X(k)) { }
        ~C() { delete p; delete &r; }
    };
In such cases, membership of the object itself, as in the case of C::a, is usually preferable because
it is the most efficient in time, space, and keystrokes. It is also less error-prone because the connection between the contained object and the containing object is covered by the rules of construction
and destruction (§10.4.1, §12.2.2, §14.4.1). However, see also §24.4.2 and §25.7.
The pointer solution should be used when there is a need to change the pointer to the ‘‘contained’’ object during the life of the ‘‘containing’’ object. For example:
    class C2 {
        X* p;
    public:
        C2(int i) : p(new X(i)) { }
        ~C2() { delete p; }
        X* change(X* q)
        {
            X* t = p;
            p = q;
            return t;
        }
    };
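A short usage sketch may help clarify who owns what after a call of change(). The value member of X, the current() accessor, the deleted copy operations, and the helper swap_in() are assumptions added so that the example is self-contained.

```cpp
class X {
public:
    explicit X(int v) : value(v) {}
    int value;                        // assumed payload, for observation only
};

class C2 {
    X* p;
public:
    explicit C2(int i) : p(new X(i)) {}
    ~C2() { delete p; }
    C2(const C2&) = delete;           // copying would lead to a double delete
    C2& operator=(const C2&) = delete;

    // Installs q and hands the previously contained object back to the caller.
    X* change(X* q) { X* t = p; p = q; return t; }
    int current() const { return p->value; }
};

// Replaces the contained X and disposes of the old one; returns the old value.
int swap_in(C2& c, int v)
{
    X* old = c.change(new X(v));      // c now owns the new X
    int old_value = old->value;
    delete old;                       // the caller became responsible for the old X
    return old_value;
}
```

In this reading, C2 owns whatever it currently points to, and change() transfers ownership of the replaced object to its caller.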
Another reason for using a pointer member is to allow the ‘‘contained’’ member to be supplied as
an argument:
    class C3 {
        X* p;
    public:
        C3(X* q) : p(q) { }
        // ...
    };
By having objects contain pointers to other objects, we create what are often called object
hierarchies. This is an alternative and complementary technique to using class hierarchies. As
shown in the emergency vehicle example in §24.3.2, it is often a tricky design issue to choose
between representing a property of a class as a base class or representing it as a member. A need to
override is an indication that the former is the better choice. Conversely, a need to be able to allow
the property to be represented by a variety of types is an indication that the latter is the better
choice. For example:
    class XX : public X { /* ... */ };
    class XXX : public X { /* ... */ };
    void f()
    {
        C3* p1 = new C3(new X); // C3 ‘‘contains’’ an X
        C3* p2 = new C3(new XX); // C3 ‘‘contains’’ an XX
        C3* p3 = new C3(new XXX); // C3 ‘‘contains’’ an XXX
        // ...
    }
This could not be modeled by a derivation of C3 from X or by C3 having a member of type X,
because the exact type of a member needs to be used. This is important for classes with virtual
functions, such as a shape class (§2.6.2) or an abstract set class (§25.3).
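The difference between a pointer member and a member of type X shows up as soon as X has a virtual function. In the sketch below, name(), contained(), the class C_by_value, and the decision that C3 deletes the object it is given are all assumptions made for illustration.

```cpp
#include <string>

class X {
public:
    virtual ~X() {}
    virtual std::string name() const { return "X"; }
};
class XX : public X {
public:
    std::string name() const override { return "XX"; }
};
class XXX : public X {
public:
    std::string name() const override { return "XXX"; }
};

class C3 {
    X* p;
public:
    explicit C3(X* q) : p(q) {}
    ~C3() { delete p; }               // assumption: C3 owns the object it is given
    C3(const C3&) = delete;
    C3& operator=(const C3&) = delete;
    std::string contained() const { return p->name(); }   // virtual call
};

// For contrast: a member of type X is always exactly an X.
class C_by_value {
    X a;
public:
    explicit C_by_value(const X& x) : a(x) {}   // copies only the X part (slicing)
    std::string contained() const { return a.name(); }
};
```

A C3 constructed from a new XX reports "XX", whereas a C_by_value constructed from an XXX still reports "X".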
References can be used to simplify classes based on pointer membership when only one object
is referred to during the life of the containing object. For example:
    class C4 {
        X& r;
    public:
        C4(X& q) : r(q) { }
        // ...
    };
Pointer and reference members are also needed when an object needs to be shared:
    X* p = new XX;
    C4 obj1(*p);
    C4 obj2(*p); // obj1 and obj2 now share the new XX
Naturally, management of shared objects requires extra care – especially in concurrent systems.
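As a small single-threaded illustration of sharing, the sketch below lets two C4 objects refer to the same X. The count member, touch(), seen(), and share_demo() are hypothetical names, and the shared object is an automatic variable here so that its lifetime is trivially longer than that of both users.

```cpp
class X {
public:
    X() : count(0) {}
    int count;                        // assumed state, visible to every sharer
};

class C4 {
    X& r;
public:
    explicit C4(X& q) : r(q) {}
    void touch() { ++r.count; }       // modifies the shared object
    int seen() const { return r.count; }
};

int share_demo()
{
    X shared;                         // outlives obj1 and obj2
    C4 obj1(shared);
    C4 obj2(shared);
    obj1.touch();
    obj1.touch();
    obj2.touch();
    return obj2.seen();               // obj2 observes obj1's changes as well
}
```

Every change made through one C4 is visible through the other, which is exactly why shared objects need an agreed policy for lifetime and, in concurrent code, for synchronization.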
24.3.4 Containment and Inheritance [lang.cont.hier]
Given the importance of inheritance relationships, it is not surprising that they are frequently
overused and misunderstood. When a class D is publicly derived from another class B, it is often
said that a D is a B:
    class B { /* ... */ };
    class D : public B { /* ... */ }; // D is a kind of B
Alternatively, this is expressed by saying that inheritance is an is-a relationship or – somewhat
more precisely – that a D is a kind of B. In contrast, a class D that has a member of another class B
is often said to have a B or contain a B. For example:
    class D { // a D contains a B
    public:
        B b;
        // ...
    };
Alternatively, this is expressed by saying that membership is a has-a relationship.
For given classes B and D, how do we choose between inheritance and membership? Consider
an Airplane and an Engine. Novices often wonder if it might be a good idea to derive class Airplane
from Engine. This is a bad idea, though, because an Airplane is not an Engine; it has an
Engine. One way of seeing this is to consider if an Airplane might have two or more engines.
Because that seems feasible (even if we are considering a program in which all of our Airplanes
will be single-engine ones), we should use membership rather than inheritance. The question ‘‘can
it have two?’’ is useful in many cases when there is doubt. As usual, it is the intangible nature of
software that makes this discussion relevant. Had all classes been as easy to visualize as Airplane
and Engine, trivial mistakes like deriving an Airplane from an Engine would be easily avoided.
Such mistakes are, however, quite frequent – particularly among people who consider derivation as
simply another mechanism for combining programming-language-level constructs. Despite the
conveniences and shorthand notation that derivation provides, it should be used almost exclusively
to express relationships that are well defined in a design. Consider:
    class B {
    public:
        virtual void f();
        void g();
    };
    class D1 { // a D1 contains a B
    public:
        B b;
        void f(); // does not override b.f()
    };
    void h1(D1* pd)
    {
        B* pb = pd; // error: no D1* to B* conversion
        pb = &pd->b;
        pb->g(); // calls B::g()
        pd->g(); // error: D1 doesn’t have a member g()
        pd->b.g();
        pb->f(); // calls B::f (not overridden by D1::f())
        pd->f(); // calls D1::f()
    }
Note that there is no implicit conversion from a class to one of its members and that a class containing a member of another class does not override the virtual functions of that member. This contrasts with the public derivation case:
    class D2 : public B { // a D2 is a B
    public:
        void f(); // overrides B::f()
    };
    void h2(D2* pd)
    {
        B* pb = pd; // ok: implicit D2* to B* conversion
        pb->g(); // calls B::g()
        pd->g(); // calls B::g()
        pb->f(); // virtual call: invokes D2::f()
        pd->f(); // invokes D2::f()
    }
The notational convenience provided by the D2 example compared to the D1 example is a factor
that can lead to overuse. It should be remembered, though, that there is a cost of increased dependency
between B and D2 to be paid for that notational convenience (see §24.3.2.1). In particular, it
is easy to forget the implicit conversion from D2 to B. Unless such conversions are an acceptable
part of the semantics of your classes, public derivation is to be avoided. When a class is used to
represent a concept and derivation is used to represent an is-a relationship, such conversions are
most often exactly what is desired.
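The contrast between the D1 (membership) and D2 (public derivation) cases can be checked with a version in which f() and g() return strings instead of being merely declared. The returned strings and the helper through_base() are assumptions of this sketch; here g() also calls f(), to show where overriding takes effect.

```cpp
#include <string>

class B {
public:
    virtual ~B() {}
    virtual std::string f() { return "B::f"; }
    std::string g() { return "B::g then " + f(); }   // g() relies on the virtual f()
};

class D1 {                    // a D1 contains a B
public:
    B b;
    std::string f() { return "D1::f"; }              // does not override b.f()
};

class D2 : public B {         // a D2 is a B
public:
    std::string f() override { return "D2::f"; }     // overrides B::f()
};

std::string through_base(B& b) { return b.g(); }
```

Passing d1.b to through_base() yields "B::g then B::f", whereas passing a D2 yields "B::g then D2::f"; a D1 itself cannot be passed at all.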
There are cases in which you would like inheritance but cannot afford to have the conversion
happen. Consider writing a class Cfield (controlled field) that – in addition to whatever else it does
– provides run-time access control for another class Field. At first glance, defining Cfield by
deriving it from Field seems just right:
    class Cfield : public Field { /* ... */ };
This expresses the notion that a Cfield really is a kind of Field, allows notational convenience
when writing a Cfield function that uses a member of the Field part of the Cfield, and – most
importantly – allows a Cfield to override Field virtual functions. The snag is that the Cfield* to
Field* conversion implied in the declaration of Cfield defeats all attempts to control access to the
Field:
    void g(Cfield* p)
    {
        *p = "asdf"; // access to Field controlled by Cfield’s assignment operator:
                     // p->Cfield::operator=("asdf")
        Field* q = p; // implicit Cfield* to Field* conversion
        *q = "asdf"; // OOPS! no control
    }
A solution would be to define Cfield to have a Field as a member, but doing that precludes Cfield
from overriding Field virtual functions. A better solution would be to use private derivation:
    class Cfield : private Field { /* ... */ };
From a design perspective, private derivation is equivalent to containment, except for the (occasionally essential) issue of overriding. An important use of this is the technique of deriving a class
publicly from an abstract base class that defines an interface and using private or protected derivation from a concrete class to provide an implementation (§2.5.4, §12.3, §25.3). Because the inheritance implied in private and protected derivation is an implementation detail that is not reflected in
the type of the derived class, it is sometimes called implementation inheritance and contrasted to
public derivation, whereby the interface of the base class is inherited and the implicit conversion to
the base type is allowed. The latter is sometimes referred to as subtyping, or interface inheritance.
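A minimal sketch of a privately derived Cfield is given below. The text does not specify the members of Field, so everything inside both classes (the string value, the changed() hook, the writable flag, lock(), changes()) is an assumption chosen to show the two properties under discussion: overriding still works, and the conversion to Field* is not available to users.

```cpp
#include <string>

class Field {
public:
    virtual ~Field() {}
    Field& operator=(const std::string& s) { value_ = s; changed(); return *this; }
    const std::string& value() const { return value_; }
private:
    virtual void changed() {}         // assumed hook, called after every assignment
    std::string value_;
};

class Cfield : private Field {        // implementation inheritance
public:
    explicit Cfield(bool writable) : writable_(writable), changes_(0) {}

    // Run-time access control: assignments are ignored unless the field is writable.
    Cfield& operator=(const std::string& s)
    {
        if (writable_) Field::operator=(s);
        return *this;
    }
    using Field::value;               // re-export read access
    void lock() { writable_ = false; }
    int changes() const { return changes_; }
private:
    void changed() override { ++changes_; }   // overriding works despite private derivation
    bool writable_;
    int changes_;
};

// Outside Cfield, the following would not compile, because Field is an inaccessible base:
//     Cfield cf(true);
//     Field* q = &cf;
```

The uncontrolled assignment through a Field* from the earlier example is thus rejected at compile time, while Cfield keeps the ability to override Field's virtual functions.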
Another way of stating this is to point out that an object of a derived class should be usable
wherever an object of its public base class is. This is sometimes called ‘‘the Liskov Substitution
Principle’’ (§23.6[Liskov,1987]). The public/protected/private distinction supports this directly for
polymorphic types manipulated through pointers and references.
24.3.4.1 Member/Hierarchy Tradeoffs [lang.mem]
To further examine the design choices involving containment and inheritance, consider how to represent a scrollbar in an interactive graphics system and how to attach a scrollbar to a window. We
need two kinds of scrollbars: horizontal and vertical. We can represent this either by two types –
Horizontal_scrollbar and Vertical_scrollbar – or by a single Scrollbar type that takes an argument that says whether its layout is horizontal or vertical. The former choice implies the need for a
third type, the plain Scrollbar, as the base class of the two specific scrollbar types. The latter choice
implies the need for an extra argument to the scrollbar type and the need to choose values to represent the two kinds of scrollbars. For example:
    enum Orientation { horizontal, vertical };
Once a choice is made, it determines the kind of change needed to extend the system. In the scrollbar example, we might want to introduce a third type of scrollbar. We may originally have thought
that there could be only two kinds of scrollbars (‘‘after all, a window has only two dimensions’’).
However, in this case – as in most – there are possible extensions that surface as redesign issues.
For example, one might like to use a ‘‘navigation button’’ instead of two scrollbars. Such a button
would cause scrolling in different directions depending on where a user pressed it. Pressing the
middle of the top would cause ‘‘scrolling up,’’ pressing the middle left would cause ‘‘scrolling
left,’’ while pressing the top-left corner would cause ‘‘scrolling up and left.’’ Such buttons are not
uncommon. They can be seen as a refinement of the notion of a scrollbar that is particularly suited
to applications in which the information scrolled over isn’t plain text but rather more general sorts
of pictures.
Adding a navigation button to a program with a three-scrollbar class hierarchy involves adding
a new class, but it requires no changes to the old scrollbar code:
    Scrollbar
        Horizontal_scrollbar
        Vertical_scrollbar
        Navigation_button
This is the nice aspect of the ‘‘hierarchical’’ solution.
Passing the orientation of the scrollbar as an argument implies the presence of type fields in the
scrollbar objects and the use of switch statements in the code of the scrollbar member functions.
That is, we are facing a tradeoff between expressing this aspect of the structure of the system in
terms of declarations or in terms of code. The former increases the degree of static checking and
the amount of information on which tools have to work. The latter postpones decisions to run time
and allows changes to be made by modifying individual functions without affecting the overall
structure of the system as seen by the type checker and other tools. In most situations, I recommend using a class hierarchy to directly model hierarchical relationships of the concepts.
The single scrollbar type solution makes it easy to store and pass information specifying a kind
of scrollbar:
    void helper(Orientation oo)
    {
        // ...
        p = new Scrollbar(oo);
        // ...
    }
    void me()
    {
        helper(horizontal);
        // ...
    }
This representation would also make it easy to re-orient a scrollbar at run time. This is unlikely to
be of major importance in the case of scrollbars, but it can be important for equivalent examples.
The point here is that there are always tradeoffs, and the tradeoffs are often nontrivial.
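The two designs can be put side by side in a small sketch. The class name Flat_scrollbar (for the single-type variant), the layout() function, its returned strings, and describe() are hypothetical; only Orientation, Scrollbar, and the three derived class names come from the text.

```cpp
#include <string>

enum Orientation { horizontal, vertical };

// Variant 1: one type with a type field; behavior is selected by a switch.
class Flat_scrollbar {
    Orientation o;
public:
    explicit Flat_scrollbar(Orientation oo) : o(oo) {}
    void reorient(Orientation oo) { o = oo; }        // easy run-time change
    std::string layout() const
    {
        switch (o) {
        case horizontal: return "horizontal";
        case vertical:   return "vertical";
        }
        return "unknown";    // a new kind requires editing the enum and this switch
    }
};

// Variant 2: a class hierarchy; behavior is selected by a virtual call.
class Scrollbar {
public:
    virtual ~Scrollbar() {}
    virtual std::string layout() const = 0;
};
class Horizontal_scrollbar : public Scrollbar {
public:
    std::string layout() const override { return "horizontal"; }
};
class Vertical_scrollbar : public Scrollbar {
public:
    std::string layout() const override { return "vertical"; }
};
// Added later without touching any of the code above.
class Navigation_button : public Scrollbar {
public:
    std::string layout() const override { return "any direction"; }
};

std::string describe(const Scrollbar& s) { return s.layout(); }
```

Variant 1 makes re-orientation trivial, and variant 2 makes extension trivial; neither gets both for free.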
24.3.4.2 Containment/Hierarchy Tradeoffs [lang.tradeoff]
Now consider how to attach a scrollbar to a window. If we consider a Window_with_scrollbar as
something that is both a Window and a Scrollbar, we get something like:
    class Window_with_scrollbar : public Window, public Scrollbar {
        // ...
    };
This allows any Window_with_scrollbar to act like a Scrollbar and like a Window, but it constrains us to using the single scrollbar-type solution.
On the other hand, if we consider a Window_with_scrollbar as a Window that has a Scrollbar,
we get something like:
    class Window_with_scrollbar : public Window {
        // ...
        Scrollbar* sb;
    public:
        Window_with_scrollbar(Scrollbar* p, /* ... */) : Window(/* ...*/), sb(p) { /* ... */ }
        // ...
    };
This allows us to use the scrollbar-hierarchy solution. Passing the scrollbar as an argument allows
the window to be oblivious to the exact type of its scrollbar. We could even pass a Scrollbar
around the way we passed an Orientation (§24.3.4.1). If we need to have Window_with_scrollbar
act as a scrollbar, we can add a conversion operator:
    Window_with_scrollbar::operator Scrollbar&()
    {
        return *sb;
    }
My preference is to have a window contain a scrollbar. I find it easier to think of a window having
a scrollbar than of a window being a scrollbar in addition to being a window. In fact, my favorite
design strategy involves a scrollbar being a special kind of window, which is then contained in a
window that needs scrollbar services. This strategy forces the decision in favor of the containment
solution. An alternative argument for the containment solution comes from the ‘‘can it have two?’’
rule of thumb (§24.3.4). Because there is no logical reason why a window shouldn’t have two
scrollbars (in fact, many windows do have both a horizontal and a vertical scrollbar),
Window_with_scrollbar ought not be derived from Scrollbar.
Note that it is not possible to derive from an unknown class. The exact type of a base class
must be known at compile time (§12.2). On the other hand, if an attribute of a class is passed as an
argument to its constructor, then somewhere in the class there must be a member that represents it.
However, if that member is a pointer or a reference we can pass an object of a class derived from
the class specified for the member. For example, the Scrollbar* member sb in the previous example can point to a Scrollbar of a type, such as Navigation_button, that is unknown to users of the
Scrollbar*.
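The containment solution, including the conversion operator, can be exercised as follows. The kind() function, scrollbar_kind(), kind_of(), the empty Window class, and the decision that the window does not own its scrollbar are assumptions made to keep the sketch short.

```cpp
#include <string>

class Scrollbar {
public:
    virtual ~Scrollbar() {}
    virtual std::string kind() const { return "plain scrollbar"; }
};
class Navigation_button : public Scrollbar {
public:
    std::string kind() const override { return "navigation button"; }
};

class Window {
public:
    virtual ~Window() {}
};

class Window_with_scrollbar : public Window {
    Scrollbar* sb;                    // not owned in this sketch
public:
    explicit Window_with_scrollbar(Scrollbar* p) : sb(p) {}
    operator Scrollbar&() { return *sb; }            // lets the window act as its scrollbar
    std::string scrollbar_kind() const { return sb->kind(); }
};

// Written in terms of Scrollbar only; knows nothing about Navigation_button.
std::string kind_of(Scrollbar& s) { return s.kind(); }
```

The window never names Navigation_button, yet a call such as kind_of(w) on a window built from one reaches Navigation_button::kind() through the conversion operator and the virtual call.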
24.3.5 Use Relationships [lang.use]
Knowledge of what other classes are used by a class and in which ways is often critical in order to
express and understand a design. Such dependencies are supported only implicitly by C++. A class
can use only names that have been declared (somewhere), but a list of names used is not provided
in the C++ source. Tools (or in the absence of suitable tools, careful reading) are necessary for
extracting such information. The ways a class X can use another class Y can be classified in several
ways. Here is one way:
– X uses the name Y.
– X uses Y.
    – X calls a Y member function.
    – X reads a member of Y.
    – X writes a member of Y.
– X creates a Y.
    – X allocates an auto or static variable of Y.
    – X creates a Y using new.
– X takes the size of a Y.
Taking the size of an object is classified separately because doing so requires knowledge of the
class declaration, but doesn’t depend on the constructors. Naming Y is also classified as a separate
dependency because just doing that – for example, in declaring a Y* or mentioning Y in the declaration of an external function – doesn’t require access to the declaration of Y at all (§5.7):
    class Y; // Y is the name of a class
    Y* p;
    extern Y f(const Y&);
It is often important to distinguish between the dependencies of a class’ interface (the class declaration) and the dependencies of the class implementation (the class member definitions). In a well-designed system, the latter typically have many more dependencies, and those are far less interesting to a user than are the dependencies of the class declaration (§24.4.2). Typically, a design aims
at minimizing the dependencies of an interface because they become dependencies of the class’
users (§8.2.4.1, §9.3.2, §12.4.1.1, §24.4).
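The difference between depending on the name Y and depending on the declaration of Y can be shown in a single file laid out the way a header and an implementation file would be. The class Y_user, its query() member, the function read_value(), and the value member of Y are hypothetical names used only for this sketch.

```cpp
// --- what a header would contain: only the name Y is needed ---
class Y;                              // Y is the name of a class

class Y_user {
    Y* p;                             // a pointer to an incomplete type is fine
public:
    explicit Y_user(Y* q) : p(q) {}
    int query() const;                // declared here, defined where Y is complete
};

int read_value(const Y&);             // mentioning Y in a declaration is fine, too

// --- what the implementation file would contain: the full declaration of Y ---
class Y {
public:
    explicit Y(int v) : value(v) {}
    int value;
};

int Y_user::query() const { return p->value; }   // reads a member of Y
int read_value(const Y& y) { return y.value; }
```

Users who see only the first part depend on nothing but the name Y; the dependencies on Y's members are confined to the implementation.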
C++ doesn’t require the implementer of a class to specify in detail what other classes are used
and how. One reason for this is that most significant classes depend on so many other classes, that
an abbreviation of the list of those classes, such as an #include directive, would be necessary for
readability. Another is that the classification and granularity of such dependencies don’t appear
to be a programming language issue. Rather, exactly how use dependencies are viewed depends
on the purpose of the designer, programmer, or tool. Finally, which dependencies are interesting
may also depend on details of the language implementation.
24.3.6 Programmed-In Relationships [lang.prog]
A programming language cannot – and should not – directly support every concept from every
design method. Similarly, a design language should not support every feature of every programming language. A design language should be richer and less concerned with details than a language
suitable for systems programming must be. Conversely, a programming language must be able to
support a variety of design philosophies, or it will fail for lack of adaptability.
When a programming language does not provide facilities for representing a concept from the
design directly, a conventional mapping between the design construct and the programming language constructs should be used. For example, a design method may have a notion of delegation.
That is, the design can specify that every operation not defined for a class A should be serviced by
an object of a class B pointed to by a pointer p. C++ cannot express this directly. However, the
expression of that idea in C++ is so stylized that one could easily imagine a program generating the
code. Consider:
    class B {
        // ...
        void f();
        void g();
        void h();
    };
    class A {
        B* p;
        // ...
        void f();
        void ff();
    };
A specification that A delegated to B through A::p would result in code like this:
    class A {
        B* p;                    // delegation through p
        // ...
        void f();
        void ff();
        void g() { p->g(); }     // delegate g()
        void h() { p->h(); }     // delegate h()
    };
It is fairly obvious to a programmer what is going on here, but simulating a design concept in code
is clearly inferior to a one-to-one correspondence. Such ‘‘programmed-in’’ relationships are not as
well ‘‘understood’’ by the programming language and are therefore less amenable to manipulation
by tools. For example, standard tools would not recognize the ‘‘delegation’’ from A to B through
A::p as different from any other use of a B*.
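To see the stylized forwarding at work, here is a minimal runnable sketch of the delegation above. The member bodies, the std::string return values, and the constructor taking a B* are assumptions made only so that the forwarding can be observed; they are not part of the original example.

```cpp
#include <cassert>
#include <string>

// Hypothetical B: each operation reports its own name so forwarding is visible.
class B {
public:
    std::string f() { return "B::f"; }
    std::string g() { return "B::g"; }
    std::string h() { return "B::h"; }
};

class A {
    B* p;                                    // delegation through p
public:
    explicit A(B* pp) : p(pp) { }            // assumed: A is handed its delegate
    std::string f()  { return "A::f"; }      // defined by A itself
    std::string ff() { return "A::ff"; }     // defined by A itself
    std::string g()  { return p->g(); }      // delegate g()
    std::string h()  { return p->h(); }      // delegate h()
};

// Usage: g() and h() are serviced by the B object; f() is not.
inline void delegation_demo()
{
    B b;
    A a(&b);
    assert(a.f() == "A::f");
    assert(a.g() == "B::g");
    assert(a.h() == "B::h");
}
```

Nothing in the declaration of A tells a tool that g() and h() are "delegated"; that knowledge lives only in the comments and in the shape of the code.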
A one-to-one mapping between the design concepts and the programming language concepts
should be used wherever possible. A one-to-one mapping ensures simplicity and guarantees that
the design really is reflected in the program so that programmers and tools can take advantage of it.
Conversion operators provide a language mechanism for expressing a class of programmed-in
relationships. That is, a conversion operator X::operator Y() specifies that wherever a Y is
acceptable, an X can be used (§11.4.1). A constructor Y::Y(X) expresses the same relationship.
Note that a conversion operator (and a constructor) produces a new object rather than changing the
type of an existing object. Declaring a conversion function to Y is simply a way of requesting
implicit application of a function that returns a Y. Because the implicit application of conversions
defined by constructors and conversion operators can be treacherous, it is sometimes useful to analyze them separately in a design.
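A small sketch may make the point about "a new object" concrete. The types X and Y below, their int members, and the function twice() are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

struct Y {
    int value;
    explicit Y(int v) : value(v) { }
};

struct X {
    int n;
    explicit X(int nn) : n(nn) { }
    operator Y() const { return Y(n); }    // wherever a Y is acceptable, an X can be used
};

inline int twice(Y y) { return 2 * y.value; }   // accepts a Y by value

inline void conversion_demo()
{
    X x(21);
    assert(twice(x) == 42);   // implicit application of X::operator Y()
    Y y = x;                  // y is a new object produced by the conversion
    y.value = 0;              // changing y ...
    assert(x.n == 21);        // ... leaves x untouched
}
```

The call twice(x) compiles without any mention of Y at the call site, which is exactly why such conversions deserve separate scrutiny in a design.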
It is important to ensure that the conversion graphs for a program do not contain cycles. If they
do, the resulting ambiguity errors will render the types involved in the cycles unusable in combination. For example:
    class Rational;
    class Big_int {
    public:
        friend Big_int operator+(Big_int,Big_int);
        operator Rational();
        // ...
    };
    class Rational {
    public:
        friend Rational operator+(Rational,Rational);
        operator Big_int();
        // ...
    };
The Rational and Big_int types will not interact as smoothly as one might have hoped:
    void f(Rational r, Big_int i)
    {
        g(r+i);             // error, ambiguous: operator+(r,Rational(i)) or operator+(Big_int(r),i) ?
        g(r+Rational(i));   // one explicit resolution
        g(Big_int(r)+i);    // another explicit resolution
    }
One can avoid such ‘‘mutual’’ conversions by making at least some of them explicit. For example,
the Big_int to Rational conversion might have been defined as make_Rational() instead of as a
conversion operator, and the addition would have been resolved to g(Big_int(r)+i). Where
‘‘mutual’’ conversion operators cannot be avoided, one must resolve the resulting clashes either by
explicit conversions as shown or by defining many separate versions of binary operators, such as +.
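The following sketch shows one way the cycle might be broken with a named conversion. The representations (a long for Big_int, a numerator/denominator pair for Rational), the accessor functions, and the truncating Rational-to-Big_int conversion are all assumptions made for illustration.

```cpp
#include <cassert>

class Rational;

class Big_int {
    long v;                                  // assumed representation
public:
    explicit Big_int(long vv) : v(vv) { }
    long value() const { return v; }
    friend Big_int operator+(Big_int a, Big_int b) { return Big_int(a.v + b.v); }
    Rational make_Rational() const;          // named conversion: must be requested explicitly
};

class Rational {
    long num, den;                           // assumed representation; den != 0
public:
    Rational(long n, long d) : num(n), den(d) { }
    long numerator() const { return num; }
    long denominator() const { return den; }
    friend Rational operator+(Rational a, Rational b)
    {
        return Rational(a.num * b.den + b.num * a.den, a.den * b.den);
    }
    operator Big_int() const { return Big_int(num / den); }   // implicit; truncates
};

inline Rational Big_int::make_Rational() const { return Rational(v, 1); }

inline void mixed_add_demo()
{
    Rational r(7, 2);
    Big_int i(4);
    Big_int s = r + i;                       // unambiguous: operator+(Big_int(r),i)
    assert(s.value() == 7);                  // 7/2 truncates to 3; 3+4 == 7
    Rational t = r + i.make_Rational();      // explicit request for Rational arithmetic
    assert(t.numerator() == 15 && t.denominator() == 2);
}
```

With only one implicit conversion left, r+i has a single viable interpretation; the price is that the mixed addition silently truncates, which is one more reason to analyze implicit conversions separately.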
24.3.7 Relationships within a Class [lang.within]
A class can conceal just about any implementation detail and just about any amount of dirt – and
sometimes it has to. However, the objects of most classes do themselves have a regular structure
and are manipulated in ways that are fairly easy to describe. An object of a class is a collection of
other sub-objects (often called members), and many of these are pointers and references to other
objects. Thus, an object can be seen as the root of a tree of objects and the objects involved can be
seen as constituting an ‘‘object hierarchy’’ that is complementary to the class hierarchy, as
described in §24.3.2.1. For example, consider a very simple String:
    class String {
        int sz;
        char* p;
    public:
        String(const char* q);
        ~String();
        // ...
    };
A String object can be represented graphically like this:
    int sz;
    char* p;  ------->  [ ... elements ... \0 ]
24.3.7.1 Invariants [lang.invariant]
The values of the members and the objects referred to by members are collectively called the state
of the object (or simply, its value). A major concern of a class design is to get an object into a
well-defined state (initialization/construction), to maintain a well-defined state as operations are
performed, and finally to destroy the object gracefully. The property that makes the state of an
object well-defined is called its invariant.
Thus, the purpose of initialization is to put an object into a state for which the invariant holds.
Typically, this is done by a constructor. Each operation on a class can assume it will find the
invariant true on entry and must leave the invariant true on exit. The destructor finally invalidates
the invariant by destroying the object. For example, the constructor String::String(const char*)
ensures that p points to an array of at least sz+1 elements, where sz has a reasonable value and
p[sz]==0. Every string operation must leave that assertion true.
Much of the skill in class design involves making a class simple enough to make it possible to
implement it so that it has a useful invariant that can be expressed simply. It is easy enough to state
that every class needs an invariant. The hard part is to come up with a useful invariant that is easy
to comprehend and that doesn’t impose unacceptable constraints on the implementer or on the efficiency of the operations. Note that ‘‘invariant’’ here is used to denote a piece of code that can
potentially be run to check the state of an object. A stricter and more mathematical notion is clearly
possible and, in some contexts, more appropriate. An invariant, as discussed here, is a practical –
and therefore typically economical and logically incomplete – check on an object’s state.
The notion of invariants has its origins in the work of Floyd, Naur, and Hoare on preconditions
and postconditions and is present in essentially all work on abstract data types and program verification done over the last 30 years or so. It is also a staple of C debugging.
Typically, the invariant is not maintained during the execution of a member function. Functions
that may be called while the invariant is invalid should not be part of the public interface. Private
and protected functions can serve that purpose.
How can we express the notion of an invariant in a C++ program? A simple way is to define an
invariant-checking function and insert calls to it in the public operations. For example:
    class String {
        int sz;
        char* p;
    public:
        class Range {};                 // exception classes
        class Invariant {};
        enum { TOO_LARGE = 16000 };     // length limit
        void check();                   // invariant check
        String(const char* q);
        String(const String&);
        ~String();
        char& operator[](int i);
        int size() { return sz; }
        // ...
    };
    void String::check()
    {
        if (p==0 || sz<0 || TOO_LARGE<=sz || p[sz]) throw Invariant();
    }
    char& String::operator[](int i)
    {
        check();                            // check on entry
        if (i<0 || sz<=i) throw Range();    // do work
        check();                            // check on exit
        return p[i];
    }
This will work nicely and is hardly any work for the programmer. However, for a simple class like
String the invariant checking will dominate the run time and maybe even the code size. Therefore,
programmers often execute the invariant checks only during debugging:
    inline void String::check()
    {
    #ifndef NDEBUG
        if (p==0 || sz<0 || TOO_LARGE<=sz || p[sz]) throw Invariant();
    #endif
    }
Here, the NDEBUG macro is used in a way similar to the way it is used in the standard C assert()
macro. NDEBUG is conventionally set to indicate that debugging is not being done.
The simple act of defining invariants and using them during debugging is an invaluable help in
getting the code right and – more importantly – in getting the concepts represented by the classes
well defined and regular. The point is that when you are designing invariants, a class will be considered from an alternative viewpoint and the code will contain redundancy. Both increase the likelihood of spotting mistakes, inconsistencies, and oversights.
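As a rough, self-contained illustration of the technique, the sketch below fills in just enough of String to compile and run. The constructor body, the deleted copy operations, the const qualifiers, and the helper throws_range() are assumptions added for the sketch; only the invariant and the check-on-entry/check-on-exit pattern come from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class String {
    int sz;
    char* p;
public:
    class Range { };                        // exception classes
    class Invariant { };
    enum { TOO_LARGE = 16000 };             // length limit

    explicit String(const char* q) : sz(0), p(0)
    {
        const std::size_t n = std::strlen(q);
        if (static_cast<std::size_t>(TOO_LARGE) <= n) throw Invariant();
        p = new char[n + 1];
        std::strcpy(p, q);
        sz = static_cast<int>(n);
        check();                            // the constructor establishes the invariant
    }
    String(const String&) = delete;         // copying is left out of this sketch
    String& operator=(const String&) = delete;
    ~String() { delete[] p; }

    void check() const                      // invariant check
    {
        if (p == 0 || sz < 0 || TOO_LARGE <= sz || p[sz]) throw Invariant();
    }
    char& operator[](int i)
    {
        check();                            // check on entry
        if (i < 0 || sz <= i) throw Range();
        check();                            // check on exit
        return p[i];
    }
    int size() const { return sz; }
};

// Hypothetical helper: reports whether s[i] is rejected as out of range.
inline bool throws_range(String& s, int i)
{
    try { (void)s[i]; return false; }
    catch (const String::Range&) { return true; }
}

inline void string_demo()
{
    String s("wonder");
    assert(s.size() == 6);
    assert(s[0] == 'w');
    assert(throws_range(s, 6));             // one past the last character
}
```

Note that the length test is done before the allocation, so a rejected argument never leaves memory behind.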
24.3.7.2 Assertions [lang.assert]
An invariant is a special form of an assertion. An assertion is simply a statement that a given logical criterion must hold. The question is what to do when it doesn’t.
The C standard library – and by implication the C++ standard library – provides the assert()
macro in <cassert> or <assert.h>. An assert() evaluates its argument and calls abort() if the
result is zero. For example:
    void f(int* p)
    {
        assert(p!=0);   // assert that p!=0; abort() if p is zero
        // ...
    }
Before aborting, assert() outputs the name of its source file and the number of the line on which it
appears. This makes assert() a useful debugging aid. NDEBUG is usually set by compiler
options on a per-compilation-unit basis. This implies that assert() shouldn’t be used in inline
functions and template functions that are included in several translation units unless great care is
taken that NDEBUG is set consistently (§9.2.3). Like all macro magic, this use of NDEBUG is too
low-level, messy, and error-prone. Also, it is typically a good idea to leave at least some checks
active in even the best-checked program, and NDEBUG isn’t well suited for that. Furthermore,
calling abort() is rarely acceptable in production code.
The alternative is to use an Assert() template that throws an exception rather than aborting so
that assertions can be left in production code when that is desirable. Unfortunately, the standard
library doesn’t provide an Assert(). However, it is trivially defined:
    template<class X, class A> inline void Assert(A assertion)
    {
        if (!assertion) throw X();
    }
Assert() throws the exception X() if the assertion is false. For example:
    class Bad_arg { };
    void f(int* p)
    {
        Assert<Bad_arg>(p!=0);   // assert p!=0; throw Bad_arg unless p!=0
        // ...
    }
This style of assertion has the condition explicit, so if we want to check only while debugging we
must say so. For example:
    void f2(int* p)
    {
        Assert<Bad_arg>(NDEBUG || p!=0);   // either I’m not debugging or p!=0
        // ...
    }
The use of || rather than && in the assertion may appear surprising. However, Assert<E>(a||b)
tests !(a||b) which is !a&&!b.
Using NDEBUG in this way requires that we define NDEBUG with a suitable value whether or
not we are debugging. A C++ implementation does not do this for us by default, so it is better to
use a value. For example:
    #ifdef NDEBUG
    const bool ARG_CHECK = false;   // we are not debugging: disable checks
    #else
    const bool ARG_CHECK = true;    // we are debugging
    #endif
    void f3(int* p)
    {
        Assert<Bad_arg>(!ARG_CHECK || p!=0);   // either I’m not debugging or p!=0
        // ...
    }
If the exception associated with an assertion is not caught, a failed Assert() terminate()s the program much like an equivalent assert() would abort(). However, an exception handler may be
able to take some less drastic action.
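The difference from abort() can be seen in a short sketch. Assert() and Bad_arg are as defined in the text; the functions deref() and value_or() are hypothetical, introduced only to show a handler recovering from a failed assertion. For simplicity the check here is unconditional.

```cpp
#include <cassert>

template<class X, class A> inline void Assert(A assertion)
{
    if (!assertion) throw X();
}

class Bad_arg { };

// Hypothetical function with a checked argument.
inline int deref(int* p)
{
    Assert<Bad_arg>(p != nullptr);      // throw Bad_arg unless p is non-null
    return *p;
}

// A handler can take some less drastic action than terminating the program.
inline int value_or(int* p, int fallback)
{
    try {
        return deref(p);
    }
    catch (const Bad_arg&) {
        return fallback;
    }
}

inline void assert_demo()
{
    int x = 7;
    assert(value_or(&x, -1) == 7);
    assert(value_or(nullptr, -1) == -1);
}
```

Had deref() used the standard assert() instead, the second call would have ended the program rather than yielding the fallback value.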
In any realistically-sized program, I find myself turning assertions on and off in groups to suit
the need for testing. Using NDEBUG is simply the crudest form of that technique. Early on in
development, most assertions are enabled, whereas only key sanity checks are left enabled in
shipped code. This style of usage is most easily managed if the actual assertion is in two parts,
with the first being an enabling condition (such as ARG_CHECK) and the second being the assertion proper.
If the enabling condition is a constant expression, the whole assertion will be compiled away
when not enabled. However, the enabling condition can also be a variable so that it can be turned
on and off at run time as debugging needs dictate. For example:
    bool string_check = true;
    inline void String::check()
    {
        Assert<Invariant>(!string_check || (p && 0<=sz && sz<TOO_LARGE && p[sz]==0));
    }
    void f()
    {
        String s = "wonder";
        // strings are checked here
        string_check = false;
        // no checking of strings here
    }
Naturally, code will be generated in such cases, so we must keep an eye out for code bloat if we use
such assertions extensively.
Saying
    Assert<E>(a);
is simply another way of saying
    if (!a) throw E();
Then why bother with Assert(), rather than writing out the statement directly? Using Assert()
makes the designer’s intent explicit. It says that this is an assertion of something that is supposed
to be always true. It is not an ordinary part of the program logic. This is valuable information to a
reader of the program. A more practical advantage is that it is easy to search for assert() or
Assert() whereas searching for conditional statements that throw exceptions is nontrivial.
Assert() can be generalized to throw exceptions taking arguments and variable exceptions:
    template<class A, class E> inline void Assert(A assertion, E except)
    {
        if (!assertion) throw except;
    }
    struct Bad_g_arg {
        int* p;
        Bad_g_arg(int* pp) :p(pp) { }
    };
    bool g_check = true;
    int g_max = 100;
    void g(int* p, exception e)
    {
        Assert(!g_check || p!=0, e);                            // pointer is valid
        Assert(!g_check || (0<*p&&*p<=g_max),Bad_g_arg(p));     // value is plausible
        // ...
    }
In many programs, it is crucial that no code is generated for an Assert() where the assertion can be
evaluated at compile time. Unfortunately, some compilers are unable to achieve this for the generalized Assert(). Consequently, the two-argument Assert() should be used only when the exception is not of the form E() and it is also acceptable for some code to be generated independently of
the value of the assertion.
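A self-contained sketch of the two-argument form follows. The Assert() template, Bad_g_arg, g_check, and g_max are taken from the text; the simplified signature of g() (it returns *p and takes no exception argument) and the helper offending() are assumptions made so that the example can be run.

```cpp
#include <cassert>

template<class A, class E> inline void Assert(A assertion, E except)
{
    if (!assertion) throw except;
}

struct Bad_g_arg {
    int* p;                                 // the argument that failed the check
    explicit Bad_g_arg(int* pp) : p(pp) { }
};

bool g_check = true;                        // run-time enabling condition
int g_max = 100;

// Simplified g(): requires p != 0; this is verified only while g_check is set.
inline int g(int* p)
{
    Assert(!g_check || p != nullptr, Bad_g_arg(p));                   // pointer is valid
    Assert(!g_check || (0 < *p && *p <= g_max), Bad_g_arg(p));        // value is plausible
    return *p;
}

// Hypothetical helper: returns the pointer carried by the exception, or 0 if g() succeeds.
inline int* offending(int* p)
{
    try { g(p); return nullptr; }
    catch (const Bad_g_arg& e) { return e.p; }
}

inline void g_demo()
{
    int ok = 50;
    int bad = 500;
    assert(offending(&ok) == nullptr);
    assert(offending(&bad) == &bad);        // the exception identifies the bad argument
}
```

Because the exception object is constructed before Assert() is called, some code is generated even when the assertion is known to hold, which is the cost the text warns about.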
In §23.4.3.5, it was mentioned that the two most common forms of class hierarchy reorganizations were to split a class into two and to factor out the common part of two classes into a base
class. In both cases, well-designed invariants can give a clue to the potential for reorganization.
Comparing the invariant with the code of operations will show most of the invariant checking to be
redundant in a class that is ripe for splitting. In such cases, subsets of the operations will access
only subsets of the object state. Conversely, classes that are ripe for merging will have similar
invariants even if their detailed implementations differ.
24.3.7.3 Preconditions and Postconditions [lang.pre]
One popular use of assertions is to express preconditions and postconditions of a function. That is,
checking that basic assumptions about input hold and verifying that the function leaves the world in
the expected state upon exit. Unfortunately, the assertions we would like to make are often at a
higher level than the programming language allows us to express conveniently and efficiently. For
example:
    template<class Ran> void sort(Ran first, Ran last)
    {
        Assert<Bad_sequence>("[first,last) is a valid sequence");      // pseudo code
        // ... sorting algorithm ...
        Assert<Failed_sort>("[first,last) is in increasing order");    // pseudo code
    }
This problem is fundamental. What we want to say about a program is best expressed in a
mathematically-based higher language, rather than in the algorithmic programming language in
which we write the program.
As for invariants, a certain amount of cleverness is needed to translate the ideal of what we
would like to assert into something that is algorithmically feasible to check. For example:
    template<class Ran> void sort(Ran first, Ran last)
    {
        // [first,last) is a valid sequence: check plausibility:
        Assert<Bad_sequence>(NDEBUG || first<=last);
        // ... sorting algorithm ...
        // [first,last) is in increasing order: check a sample:
        Assert<Failed_sort>(NDEBUG ||
            (last-first<2 || (*first<=last[-1]
            && *first<=first[(last-first)/2] && first[(last-first)/2]<=last[-1])));
    }
I often find writing ordinary code-checking arguments and results simpler than composing assertions. However, it is important to try to express the real (ideal) preconditions and postconditions –
and at least document them as comments – before reducing them to something less abstract that
can be effectively expressed in a programming language.
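Where a full check is affordable, the ideal postcondition can sometimes be tested directly rather than sampled. The sketch below is one such variant and not the approach shown above: it assumes a wrapper named checked_sort() around std::sort() and uses std::is_sorted(), a library function standardized after this text was written (C++11), to verify the whole sequence at linear cost.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

class Bad_sequence { };
class Failed_sort { };

template<class X, class A> inline void Assert(A assertion)
{
    if (!assertion) throw X();
}

// Hypothetical wrapper: sorts [first,last) and checks what can feasibly be checked.
template<class Ran> void checked_sort(Ran first, Ran last)
{
    // precondition: [first,last) is a valid sequence; only plausibility is checkable
    Assert<Bad_sequence>(first <= last);

    std::sort(first, last);

    // postcondition: [first,last) is in increasing order; checked in full, O(n)
    Assert<Failed_sort>(std::is_sorted(first, last));
}

inline void checked_sort_demo()
{
    std::vector<int> v;
    v.push_back(3); v.push_back(1); v.push_back(2);
    checked_sort(v.begin(), v.end());
    assert(v[0] == 1 && v[1] == 2 && v[2] == 3);
}
```

Even here the precondition remains only a plausibility test: nothing in the language lets the function verify that first and last really delimit one sequence.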
Precondition checking can easily degenerate into simple checking of argument values. As an
argument is often passed through several functions, this checking can be repetitive and expensive.
However, simply asserting that every pointer argument is nonzero in every function is not particularly helpful and can give a false sense of security – especially if the tests are done during debugging only to prevent overhead. This is a major reason why I recommend a focus on invariants.
24.3.7.4 Encapsulation [lang.encapsulate]
Note that in C++, the class – not the individual object – is the unit of encapsulation. For example:
    class List {
        List* next;
    public:
        bool on(List*);
        // ...
    };
    bool List::on(List* p)
    {
        if (p == 0) return false;
        for(List* q = this; q; q=q->next) if (p == q) return true;
        return false;
    }
The chasing of the private List::next pointer is accepted because List::on() has access to every
object of class List it can somehow reference. Where that is inconvenient, matters can be simplified by not taking advantage of the ability to access the representation of other objects from a member function. For example:
    bool List::on(List* p)
    {
        if (p == 0) return false;
        if (p == this) return true;
        if (next==0) return false;
        return next->on(p);
    }
However, this turns iteration into recursion, and doing that can cause a major performance hit when
a compiler isn’t able to optimize the recursion back into an iteration.
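The two styles can be put side by side in a runnable sketch. The constructor, the const qualifiers, and the name on_recursive() are assumptions added so that both versions can coexist in one class and be exercised.

```cpp
#include <cassert>

class List {
    List* next;
public:
    explicit List(List* n = 0) : next(n) { }     // assumed: link to the rest of the list
    bool on(const List* p) const;                // iterative: reads other objects' next
    bool on_recursive(const List* p) const;      // touches only this object's next
};

inline bool List::on(const List* p) const
{
    if (p == 0) return false;
    for (const List* q = this; q; q = q->next)   // q->next is private, yet accessible here
        if (p == q) return true;
    return false;
}

inline bool List::on_recursive(const List* p) const
{
    if (p == 0) return false;
    if (p == this) return true;
    if (next == 0) return false;
    return next->on_recursive(p);
}

inline void list_demo()
{
    List c;             // last element
    List b(&c);
    List a(&b);         // a -> b -> c
    List other;
    assert(a.on(&c) && a.on_recursive(&c));
    assert(!a.on(&other) && !a.on_recursive(&other));
    assert(!c.on(&a));  // a precedes c, so it is not reachable from c
}
```

Both versions give the same answers; they differ only in whose representation each call is allowed to inspect.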
24.4 Components [lang.component]
The unit of design is a collection of classes, functions, etc., rather than an individual class. Such a
collection, often called a library or a framework (§25.8), is also the unit of reuse (§23.5.1), maintenance, etc. C++ provides three mechanisms for expressing the notion of a set of facilities united by
a logical criterion:
[1] A class – containing a collection of data, function, template, and type members
[2] A class hierarchy – containing a collection of classes
[3] A namespace – containing a collection of data, function, template, and type members
A class provides many facilities to make it convenient to create objects of the type it defines. However, many significant components are not best described by a mechanism for creating objects of a
single type. A class hierarchy expresses the notion of a set of related types. However, the individual members of a component are not always best expressed as classes and not all classes possess the
basic similarity required to fit into a meaningful class hierarchy (§24.2.5). Therefore, a namespace
is the most direct and the most general embodiment of the notion of a component in C++. A component is sometimes referred to as a ‘‘class category.’’ However, not every element of a component is or should be a class.
Ideally, a component is described by the set of interfaces it uses for its implementation plus the
set of interfaces it provides for its users. Everything else is ‘‘implementation detail’’ and hidden
from the rest of the system. This may indeed be the designer’s description of a component. To
make it real, the programmer needs to map it into declarations. Classes and class hierarchies provide the interfaces, and namespaces allow the programmer to group the interfaces and to separate
interfaces used from interfaces provided. Consider:
[Figure: four boxes labeled ‘‘Used by X interface’’, ‘‘Used by X implementation’’, ‘‘X interface’’, and ‘‘X implementation’’.]
Using the techniques described in §8.2.4.1, this becomes:
    namespace A {   // some facilities used by X’s interface
        // ...
    }
    namespace X {   // interface of component X
        using namespace A;   // dependent on declarations from A
        // ...
        void f();
    }
    namespace X_impl {   // facilities needed by X’s implementation
        using namespace X;
        // ...
    }
    void X::f()
    {
        using namespace X_impl;   // dependent on declarations from X_impl
        // ...
    }
The general interface X should not depend on the implementation interface X_impl.
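To show that this layering actually compiles and behaves as intended, here is a small sketch with the elided parts filled in. The typedef Name, the helper decorate(), and the string returned by f() are hypothetical stand-ins for the facilities a real component would provide.

```cpp
#include <cassert>
#include <string>

namespace A {                       // some facilities used by X's interface
    typedef std::string Name;       // hypothetical facility
}

namespace X {                       // interface of component X
    using namespace A;              // dependent on declarations from A
    Name f();
}

namespace X_impl {                  // facilities needed by X's implementation
    using namespace X;
    inline Name decorate(const Name& n) { return "[" + n + "]"; }   // hypothetical helper
}

X::Name X::f()
{
    using namespace X_impl;         // dependent on declarations from X_impl
    return decorate("component X");
}

inline void component_demo()
{
    assert(X::f() == "[component X]");   // users see only namespace X
}
```

A user of X never names X_impl or decorate(); the using-directive that pulls them in is confined to the body of X::f().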
A component can have many classes that are not intended for general use. Such classes should
be ‘‘hidden’’ within implementation classes or namespaces:
    namespace X_impl {   // component X implementation details
        class Widget {
            // ...
        };
        // ...
    }
This ensures that Widget isn’t used from other parts of the program. However, classes that represent coherent concepts are often candidates for reuse and should therefore be considered for inclusion into the interface of the component. Consider:
    class Car {
        class Wheel {
            // ...
        };
        Wheel flw, frw, rlw, rrw;
        // ...
    public:
        // ...
    };
In most contexts, we need to have the actual wheels hidden to maintain the abstraction of a car
(when you use a car you cannot operate the wheels independently). However, the Wheel class itself
seems a good candidate for wider use, so moving it outside class Car might be better:
    class Wheel {
        // ...
    };
    class Car {
        Wheel flw, frw, rlw, rrw;
        // ...
    public:
        // ...
    };
The decision to nest or not depends on the aims of the design and the generality of the concepts
involved. Both nesting and ‘‘non-nesting’’ are widely applicable techniques for expressing a
design. The default should be to make a class as local as possible until a need to make it more generally available is demonstrated.
There is a nasty tendency for ‘‘interesting’’ functions and data to ‘‘bubble up’’ to the global
namespace, to widely-used namespaces, or to ultimate base classes in a hierarchy. This can easily
lead to unintentional exposure of implementation details and to the problems associated with global
data and global functions. This is most likely to happen in a single-rooted hierarchy, and in a program where only very few namespaces are used. Virtual base classes (§15.2.4) can be used to combat this phenomenon in the context of class hierarchies. Small ‘‘implementation’’ namespaces are
the main tool for avoiding the problem in the context of namespaces.
Note that header files provide a powerful mechanism for supplying different views of a component to different users and for excluding classes that are considered part of the implementation from
the user’s view (§9.3.2).
24.4.1 Templates [lang.temp]
From a design perspective, templates serve two, weakly-related needs:
– Generic programming
– Policy parameterization
Early in a design effort, operations are just operations. Later, when it is time to specify the type of
operands, templates become essential when using a statically-typed programming language, such as
C++. Without templates, function definitions would have to be replicated or checking would have
to be unnecessarily postponed to run time (§24.2.3). An operation that implements an algorithm for
a variety of operand types is a candidate to be implemented as a template. If all operands fit into a
single class hierarchy, and especially if there is a need to add new operand types at run time, the
operand type is best represented as a class – often as an abstract class. If the operand types do not
fit into a single hierarchy and especially if run-time performance is critical, the operation is best
implemented as a template. The standard containers and their supporting algorithms are an example of when the need to take operands of a variety of unrelated types combined with a need for
run-time performance leads to the use of templates (§16.2).
To make the template/hierarchy tradeoff more concrete, consider how to generalize a simple
iteration:
    void print_all(Iter_for_T x)
    {
        for (T* p = x.first(); p; p = x.next()) cout << *p;
    }
Here, the assumption is that Iter_for_T provides operations that yield T*s.
We can make the iterator Iter_for_T a template parameter:
    template<class Iter_for_T> void print_all(Iter_for_T x)
    {
        for (T* p = x.first(); p; p = x.next()) cout << *p;
    }
This allows us to use a variety of unrelated iterators as long as they all provide first() and next()
with the right meanings and as long as we know the type of iterator for each call of print_all() at
compile time. The standard library containers and algorithms are based on this idea.
Alternatively, we can use the observation that first() and next() constitute an interface to iterators. We can then define a class to represent that interface:
    class Iter {
    public:
        virtual T* first() const = 0;
        virtual T* next() = 0;
    };
    void print_all2(Iter& x)
    {
        for (T* p = x.first(); p; p = x.next()) cout << *p;
    }
We can now use every iterator derived from Iter. The actual code doesn’t differ depending on
whether we use templates or a class hierarchy to represent the parameterization – only the run-time,
recompilation, etc., tradeoffs differ. In particular, class Iter is a candidate for use as an argument
for the template:
    void f(Iter& i)
    {
        print_all(i);    // use the template
        print_all2(i);
    }
Consequently, the two approaches can be seen as complementary.
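A runnable sketch of the two approaches working together is given below. Several details are assumptions made for the sketch: T is fixed to int, the concrete iterator Vector_iter is invented, first() is non-const because it resets the position, the output stream is passed explicitly so the result can be inspected, and the template takes its iterator by reference so that it can be instantiated for the abstract class Iter.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

typedef int T;                              // hypothetical element type

class Iter {                                // the interface: first() and next()
public:
    virtual T* first() = 0;
    virtual T* next() = 0;
    virtual ~Iter() { }
};

class Vector_iter : public Iter {           // hypothetical iterator over a std::vector<T>
    std::vector<T>& v;
    std::size_t i;
public:
    explicit Vector_iter(std::vector<T>& vv) : v(vv), i(0) { }
    T* first() { i = 0; return i < v.size() ? &v[i] : 0; }
    T* next()  { ++i;   return i < v.size() ? &v[i] : 0; }
};

template<class Iter_for_T> void print_all(Iter_for_T& x, std::ostream& os)
{
    for (T* p = x.first(); p; p = x.next()) os << *p << ' ';
}

void print_all2(Iter& x, std::ostream& os)
{
    for (T* p = x.first(); p; p = x.next()) os << *p << ' ';
}

inline void print_all_demo()
{
    std::vector<T> v;
    v.push_back(1); v.push_back(2); v.push_back(3);
    Vector_iter it(v);
    Iter& i = it;
    std::ostringstream os;
    print_all(i, os);       // template instantiated with Iter_for_T = Iter
    print_all2(i, os);      // ordinary function on the abstract interface
    assert(os.str() == "1 2 3 1 2 3 ");
}
```

The loop bodies are textually identical; what differs is when the iterator type is bound: at compile time for print_all(), at run time through the virtual calls in print_all2().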
Often, a template needs to use functions and classes as part of its implementation. Many of
those must themselves be templates so as to maintain generality and efficiency. In that way, algorithms become generic over a range of types. This style of template use is called generic
programming (§2.7). When we call std::sort() on a vector, the elements of the vector are the
operands of the sort(); thus, sort() is generic for the element types. In addition, the standard sort
is generic for the container types because it is invoked on iterators for arbitrary, standard-conforming containers (§16.3.1).
The sort() algorithm is also parameterized on the comparison criteria (§18.7.1). From a
design perspective, this is different from taking an operation and making it generic on its operand
type. Deciding to parameterize an algorithm on an object (or operation) in a way that controls the
way the algorithm operates is a much higher-level design decision. It is a decision to give the
designer/programmer control over some part of the policy governing the operation of the algorithm.
From a programming language point of view, however, there is no difference.
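The distinction can be illustrated with a small sketch. The comparison class No_case and the two helper functions are names assumed here for illustration; only std::sort() and its optional comparison argument are standard:

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical policy object: compare strings without regard to case.
struct No_case {
    bool operator()(const std::string& a, const std::string& b) const
    {
        std::size_t n = std::min(a.size(), b.size());
        for (std::size_t i = 0; i < n; ++i) {
            int x = std::toupper(static_cast<unsigned char>(a[i]));
            int y = std::toupper(static_cast<unsigned char>(b[i]));
            if (x != y) return x < y;
        }
        return a.size() < b.size();   // a proper prefix sorts first
    }
};

// Generic on the element type only: elements are compared using <.
std::vector<std::string> sorted_default(std::vector<std::string> v)
{
    std::sort(v.begin(), v.end());
    return v;
}

// Same algorithm, but the caller supplies the ordering policy.
std::vector<std::string> sorted_no_case(std::vector<std::string> v)
{
    std::sort(v.begin(), v.end(), No_case());
    return v;
}
```

Given "banana", "Cherry", and "apple", the default ordering in an ASCII-based character set places "Cherry" first because uppercase letters precede lowercase ones, whereas the No_case policy yields alphabetical order. In both cases the language mechanism is the same: a template argument.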
24.4.2 Interfaces and Implementations [lang.interface]
The ideal interface
– presents a complete and coherent set of concepts to a user,
– is consistent over all parts of a component,
– does not reveal implementation details to a user,
– can be implemented in several ways,
– is statically typed,
– is expressed using application-level types, and
– depends in limited and well-defined ways on other interfaces.
Having noted the need for consistency across the classes that present the component’s interface to
the rest of the world (§24.4), we can simplify the discussion by looking at only a single class. Consider:
class Y { /* ... */ };   // needed by X
class Z { /* ... */ };   // needed by X
class X {   // example of poor interface style
    Y a;
    Z b;
public:
    void f(const char * ...);
    void g(int[],int);
    void set_a(Y&);
    Y& get_a();
};
This interface has several potential problems:
– The interface uses the types Y and Z in a way that requires the declarations of Y and Z to be
known to compile it.
– The function X::f() takes an arbitrary number of arguments of unknown types (probably
somehow controlled by a ‘‘format string’’ supplied as the first argument; §21.8).
– The function X::g() takes an int[] argument. This may be acceptable, but typically it is a
sign that the level of abstraction is too low. An array of integers is not self-describing, so it
is not obvious how many elements it is supposed to have.
– The set_a() and get_a() functions most likely expose the representation of objects of
class X by allowing direct access to X::a.
These member functions provide an interface at a very low level of abstraction. Basically, classes
with interfaces at this level belong among the implementation details of a larger component – if
they belong anywhere at all. Ideally, an argument of an interface function carries enough information to make it self-describing. A rule of thumb is that it should be possible to transmit the request
for service over a thin wire for service at a remote server.
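To make the point about self-describing arguments concrete, here is a minimal sketch contrasting the two styles. The function names sum_raw() and sum_vec() are invented for this illustration:

```cpp
#include <cstddef>
#include <vector>

// Low-level style: the array does not know its own size, so the
// caller must pass a count separately and get it right.
int sum_raw(const int a[], int n)
{
    int s = 0;
    for (int i = 0; i < n; ++i) s += a[i];
    return s;
}

// Self-describing style: the argument carries its own element count.
int sum_vec(const std::vector<int>& v)
{
    int s = 0;
    for (std::size_t i = 0; i < v.size(); ++i) s += v[i];
    return s;
}
```

A request such as sum_vec(v) could be serialized and sent elsewhere without extra conventions; for sum_raw(a,n), the link between a and n exists only in the caller's head.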
C++ allows the programmer to expose the representation of a class as part of the interface. This
representation may be hidden (using private or protected), but it is available to the compiler to
allow allocation of automatic variables, to allow inline substitution of functions, etc. The negative
effect of this is that use of class types in the representation of a class may introduce undesirable
dependencies. Whether the use of members of types Y and Z is a problem depends on what kind of
types Y and Z actually are. If they are simple types, such as list, complex, and string, their use is
most often quite appropriate. Such types can be considered stable, and the need to include their
class declarations is an acceptable burden on the compiler. However, if Y and Z themselves had
been interface classes of significant components, such as a graphics system or a bank account
management system, it might be wise not to depend too directly on them. In such cases, using a
pointer or a reference member is often a better choice:
class Y;
class Z;
class X {   // X accesses Y and Z through pointers and references only
    Y* a;
    Z& b;
    // ...
};
This decouples the definition of X from the definitions of Y and Z; that is, the definition of X
depends on the names Y and Z only. The implementation of X will, of course, still depend on the
definitions of Y and Z, but this will not adversely affect the users of X.
This illustrates an important point: an interface that hides significant amounts of information –
as a useful interface ought to – will have far fewer dependencies than the implementation it hides.
For example, the definition of class X can be compiled without access to the definitions of Y and Z.
However, the definitions of X’s member functions that manipulate the Y and Z objects will need
access to the definitions of Y and Z. When dependencies are analyzed, the dependencies of the
interface and the implementation must be considered separately. In both cases, the ideal is for the
dependency graphs of a system to be directed acyclic graphs to ease understanding and testing of
the system. However, this ideal is far more critical and far more often achievable for interfaces
than for implementations.
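The separation can be sketched in a single file, with comments marking what would conventionally go into a header and what into an implementation file. The members of Y and Z, the constructor, and the functions total() and demo_total() are assumptions made for this illustration:

```cpp
// --- what a header for X might contain: only the names Y and Z are needed ---
class Y;
class Z;

class X {
    Y* a;   // a pointer to an incomplete type is fine in a class definition
    Z& b;   // so is a reference
public:
    X(Y* aa, Z& bb) : a(aa), b(bb) { }
    int total() const;   // defined where Y and Z are complete
};

// --- what the implementation file might contain: full definitions required ---
class Y {
public:
    int value;
    Y(int v) : value(v) { }
};

class Z {
public:
    int value;
    Z(int v) : value(v) { }
};

// Manipulates Y and Z objects, so it needs their definitions.
int X::total() const { return a->value + b.value; }

int demo_total()
{
    Y y(2);
    Z z(40);
    X x(&y, z);
    return x.total();
}
```

A user who only passes X objects around needs nothing beyond the first part; a change to the representation of Y or Z then affects only the code in the second part.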
Note that a class can define three interfaces:
class X {
private:
    // accessible to members and friends only
protected:
    // accessible to members and friends and
    // to members and friends of derived classes only
public:
    // accessible to the general public
};
In addition, a friend is part of the public interface (§11.5).
A member should be part of the most restrictive interface possible. That is, a member should be
private unless there is a reason for it to be more accessible. If it needs to be more accessible, it
should be protected unless there is a reason for it to be public. It is almost always a bad idea to
make a data member public or protected. The functions and classes that constitute the public interface should present a view of the class that fits with its role as representing a concept.
Note that abstract classes can be used to provide a further level of representation hiding (§2.5.4,
§12.3, §25.3).
24.4.3 Fat Interfaces [lang.fat]
Ideally, an interface should offer only operations that make sense and that can be implemented well
by every derived class implementing that interface. However, this is not always easy. Consider
lists, arrays, associative arrays, trees, etc. As shown in §16.2.2, it is tempting and sometimes useful
to provide a generalization of all of these types – usually called a container – that can be used as
the interface to every one of these. This (apparently) relieves the user of having to deal with the
details of all of these containers. However, defining the interface of a general container class is
nontrivial. Assume that we want to define Container as an abstract type. What operations do we
want Container to provide? We could provide only the operations that every container can support
– the intersection of the sets of operations – but that is a ridiculously narrow interface. In fact, in
many interesting cases that intersection is empty. Alternatively, we could provide the union of all
the sets of operations and give a run-time error if a ‘‘non-existent’’ operation is applied to an object
through this interface. An interface that is such a union of interfaces to a set of concepts is called a
fat interface. Consider a ‘‘general container’’ of objects of type T:
class Container {
public:
    struct Bad_oper {   // exception class
        const char* p;
        Bad_oper(const char* pp) : p(pp) { }
    };
    virtual void put(const T*) { throw Bad_oper("Container::put"); }
    virtual T* get() { throw Bad_oper("Container::get"); }
    virtual T*& operator[](int) { throw Bad_oper("Container::[](int)"); }
    virtual T*& operator[](const char*) { throw Bad_oper("Container::[](char*)"); }
    // ...
};
Containers could then be declared like this:
class List_container : public Container, private list {
public:
    void put(const T*);
    T* get();
    // ... no operator[] ...
};
class Vector_container : public Container, private vector {
public:
    T*& operator[](int);
    T*& operator[](const char*);
    // ... no put() or get() ...
};
As long as one is careful, all is well:
void f()
{
    List_container sc;
    Vector_container vc;
    // ...
    user(sc,vc);
}
void user(Container& c1, Container& c2)
{
    T* p1 = c1.get();
    T* p2 = c2[3];
    // don’t use c2.get() or c1[3]
    // ...
}
However, few data structures support both the subscripting and the list-style operations well. Consequently, it is probably not a good idea to specify an interface that requires both. Doing so leads
to the use of run-time type-inquiry (§15.4) or exception handling (Chapter 14) to avoid run-time
errors. For example:
void user2(Container& c1, Container& c2)   // detection is easy, but recovery can be hard
{
    try {
        T* p1 = c1.get();
        T* p2 = c2[3];
        // ...
    }
    catch(Container::Bad_oper& bad) {
        // Oops!
        // Now what?
    }
}
or
void user3(Container& c1, Container& c2)   // early detection is tedious; recovery can still be hard
{
    if (dynamic_cast<List_container*>(&c1) && dynamic_cast<Vector_container*>(&c2)) {
        T* p1 = c1.get();
        T* p2 = c2[3];
        // ...
    }
    else {
        // Oops!
        // Now what?
    }
}
In both cases, run-time performance can suffer and the generated code can be surprisingly large.
As a result, people are tempted to ignore the potential errors and hope that they don’t actually occur
when the program is in the hands of users. The problem with this approach is that exhaustive testing is also hard and expensive.
Consequently, fat interfaces are best avoided where run-time performance is at a premium,
where strong guarantees about the correctness of code are required, and in general wherever there is
a good alternative. The use of fat interfaces weakens the correspondence between concepts and
classes and thus opens the floodgates for the use of derivation as a mere implementation convenience.
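A compilable reduction of the fat interface shows how the problem surfaces only at run time. This sketch assumes T is int, stores plain T* rather than const T*, omits the operator[](const char*) overload, and adds a constructor and a helper try_get(); all of these are simplifications made here, not part of the design discussed above:

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <vector>

typedef int T;   // assumption: the element type is left unspecified in the text

class Container {
public:
    struct Bad_oper {   // exception class
        const char* p;
        Bad_oper(const char* pp) : p(pp) { }
    };
    virtual void put(T*) { throw Bad_oper("Container::put"); }
    virtual T* get() { throw Bad_oper("Container::get"); }
    virtual T*& operator[](int) { throw Bad_oper("Container::[](int)"); }
    virtual ~Container() { }
};

class List_container : public Container, private std::list<T*> {
public:
    void put(T* p) { push_back(p); }
    T* get()
    {
        if (empty()) return 0;
        T* p = front();
        pop_front();
        return p;
    }
    // no operator[]: the throwing default is inherited
};

class Vector_container : public Container, private std::vector<T*> {
public:
    explicit Vector_container(std::size_t n) : std::vector<T*>(n) { }
    T*& operator[](int i) { return std::vector<T*>::operator[](i); }
    // no put() or get(): the throwing defaults are inherited
};

// Returns "ok", or the name of the operation that turned out not to exist.
std::string try_get(Container& c)
{
    try {
        c.get();
        return "ok";
    }
    catch (Container::Bad_oper& bad) {
        return bad.p;   // the mismatch is detected only when the call executes
    }
}
```

Both calls to try_get() compile equally well; nothing in the type Container& tells the caller which of them will fail.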
24.5 Advice [lang.advice]
[1] Evolve use towards data abstraction and object-oriented programming; §24.2.
[2] Use C++ features and techniques as needed (only); §24.2.
[3] Match design and programming styles; §24.2.1.
[4] Use classes/concepts as a primary focus for design rather than functions/processing; §24.2.1.
[5] Use classes to represent concepts; §24.2.1, §24.3.
[6] Use inheritance to represent hierarchical relationships between concepts (only); §24.2.2, §24.2.5, §24.3.2.
[7] Express strong guarantees about interfaces in terms of application-level static types; §24.2.3.
[8] Use program generators and direct-manipulation tools to ease well-defined tasks; §24.2.4.
[9] Avoid program generators and direct-manipulation tools that do not interface cleanly with a general-purpose programming language; §24.2.4.
[10] Keep distinct levels of abstraction distinct; §24.3.1.
[11] Focus on component design; §24.4.
[12] Make sure that a virtual function has a well-defined meaning and that every overriding function implements a version of that desired behavior; §24.3.4, §24.3.2.1.
[13] Use public inheritance to represent is-a relationships; §24.3.4.
[14] Use membership to represent has-a relationships; §24.3.4.
[15] Prefer direct membership over a pointer to a separately-allocated object for expressing simple containment; §24.3.3, §24.3.4.
[16] Make sure that the uses dependencies are understood, non-cyclic wherever possible, and minimal; §24.3.5.
[17] Define invariants for all classes; §24.3.7.1.
[18] Explicitly express preconditions, postconditions, and other assertions as assertions (possibly using Assert()); §24.3.7.2.
[19] Define interfaces to reveal the minimal amount of information needed; §24.4.
[20] Minimize an interface’s dependencies on other interfaces; §24.4.2.
[21] Keep interfaces strongly typed; §24.4.2.
[22] Express interfaces in terms of application-level types; §24.4.2.
[23] Express an interface so that a request could be transmitted to a remote server; §24.4.2.
[24] Avoid fat interfaces; §24.4.3.
[25] Use private data and member functions wherever possible; §24.4.2.
[26] Use the public/protected distinction to distinguish between the needs of designers of derived classes and general users; §24.4.2.
[27] Use templates for generic programming; §24.4.1.
[28] Use templates to parameterize an algorithm by a policy; §24.4.1.
[29] Use templates where compile-time type resolution is needed; §24.4.1.
[30] Use class hierarchies where run-time type resolution is needed; §24.4.1.
25
Roles of Classes
Some things better change ...
but fundamental themes
should revel in persistence.
– Stephen J. Gould
Kinds of classes — concrete types — abstract types — nodes — changing interfaces —
object I/O — actions — interface classes — handles — use counts — application frameworks — advice — exercises.
25.1 Kinds of Classes [role.intro]
The C++ class is a programming language construct that serves a variety of design needs. In fact, I
find that the solution to most knotty design problems involves the introduction of a new class to
represent some notion that had been left implicit in the previous draft design (and maybe the elimination of other classes). The great variety of roles that a class can play leads to a variety of kinds of
classes that are specialized to serve a particular need well. In this chapter, a few archetypical kinds
of classes are described, together with their inherent strengths and weaknesses:
§25.2 Concrete types
§25.3 Abstract types
§25.4 Nodes
§25.5 Operations
§25.6 Interfaces
§25.7 Handles
§25.8 Application frameworks
These ‘‘kinds of classes’’ are design notions and not language constructs. The unattained, and
probably unattainable, ideal is to have a minimal set of simple and orthogonal kinds of classes from
which all well-behaved and useful classes could be constructed. It is important to note that each of
these kinds of classes has a place in design and none is inherently better than the others for all uses.
Much confusion in discussions of design and programming comes from people trying to use only
one or two kinds of classes exclusively. This is usually done in the name of simplicity, yet it leads
to contorted and unnatural uses of the favored kinds of classes.
The description here emphasizes the pure forms of these kinds of classes. Naturally, hybrid
forms can also be used. However, a hybrid ought to appear as the result of a design decision based
on an evaluation of the engineering tradeoffs and not a result of some misguided attempt to avoid
making decisions. ‘‘Delaying decisions’’ is too often a euphemism for ‘‘avoiding thinking.’’ Novice designers will usually do best by staying away from hybrids and also by following the style of
an existing component with properties that resemble the desired properties for the new component.
Only experienced programmers should attempt to write a general-purpose component or library,
and every library designer should be ‘‘condemned’’ to use, document, and support his or her creation for some years. Also, please note §23.5.1.
25.2 Concrete Types [role.concrete]
Classes such as vector (§16.3), list (§17.2.2), Date (§10.3), and complex (§11.3, §22.5) are
concrete in the sense that each is the representation of a relatively simple concept with all the operations essential for the support of that concept. Also, each has a one-to-one correspondence
between its interface and an implementation and none are intended as a base for derivation. Typically, concrete types are not fitted into a hierarchy of related classes. Each concrete type can be
understood in isolation with minimal reference to other classes. If a concrete type is implemented
well, programs using it are comparable in size and speed to programs a user would write using a
hand-crafted and specialized version of the concept. Similarly, if the implementation changes significantly the interface is usually modified to reflect the change. In all of this, a concrete type
resembles a built-in type. Naturally, the built-in types are all concrete. User-defined concrete
types, such as complex numbers, matrices, error messages, and symbolic references, often provide
fundamental types for some application domain.
The exact nature of a class’ interface determines what implementation changes are significant in
this context; more abstract interfaces leave more scope for implementation changes but can compromise run-time efficiency. Furthermore, a good implementation does not depend on other classes
more than absolutely necessary so that the class can be used without compile-time or run-time overheads caused by the accommodation of other ‘‘similar’’ classes in a program.
To sum up, a class providing a concrete type aims:
[1] to be a close match to a particular concept and implementation strategy;
[2] to provide run-time and space efficiency comparable to ‘‘hand-crafted’’ code through the
use of inlining and of operations taking full advantage of the properties of the concept and
its implementation;
[3] to have only minimal dependency on other classes; and
[4] to be comprehensible and usable in isolation.
The result is a tight binding between user code and implementation code. If the implementation
changes in any way, user code will have to be recompiled because user code almost always contains calls of inline functions or local variables of a concrete type.
The name ‘‘concrete type’’ was chosen to contrast with the common term ‘‘abstract type.’’ The
relationship between concrete and abstract types is discussed in §25.3.
Concrete types cannot directly express commonality. For example, list and vector provide similar sets of operations and can be used interchangeably by some template functions. However, there
is no relationship between the types list<int> and vector<int> or between list<Shape*> and
list<Circle*> (§13.6.3), even though we can discern their similarities.
For naively designed concrete types, this implies that code using them in similar ways will look
dissimilar. For example, iterating through a List using a next() operation differs dramatically
from iterating through a Vector using subscripting:
void my(List& sl)
{
    for (T* p = sl.first(); p; p = sl.next()) {   // ‘‘natural’’ list iteration
        // my stuff
    }
    // ...
}
void your(Vector& v)
{
    for (int i = 0; i<v.size(); i++) {   // ‘‘natural’’ vector iteration
        // your stuff
    }
    // ...
}
The difference in iteration style is natural in the sense that a get-next-element operation is essential
to the notion of a list (but not that common for a vector) and subscripting is essential to the notion
of a vector (but not for a list). The availability of operations that are ‘‘natural’’ relative to a chosen
implementation strategy is often crucial for efficiency and important for ease of writing the code.
The obvious snag is that the code for fundamentally similar operations, such as the previous two
loops, can look dissimilar, and code that uses different concrete types for similar operations cannot
be used interchangeably. In realistic examples, it takes significant thought to find similarities and
significant redesign to provide ways of exploiting such similarities once found. The standard containers and algorithms are an example of a thorough rethinking that makes it possible to exploit
similarities between concrete types without losing their efficiency and elegance benefits (§16.2).
To take a concrete type as an argument, a function must specify that exact concrete type as an
argument type. There will be no inheritance relationships that can be used to make the argument
declaration less specific. Consequently, an attempt to exploit similarities between concrete types
will involve templates and generic programming as described in §3.8. When the standard library is
used, iteration becomes:
template<class C> void ours(const C& c)
{
    for (typename C::const_iterator p = c.begin(); p!=c.end(); ++p) {   // standard library iteration
        // ...
    }
}
The fundamental similarity between containers is exploited, and this in turn opens the possibility
for further exploitation as done by the standard algorithms (Chapter 18).
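As a small runnable instance of this style, the following sketch sums the elements of any standard container of int. The function name sum() is chosen here for illustration; typename is needed because C::const_iterator depends on the template parameter:

```cpp
#include <list>
#include <vector>

// Works for any container of int that provides const_iterator, begin(), and end().
template<class C> int sum(const C& c)
{
    int s = 0;
    for (typename C::const_iterator p = c.begin(); p != c.end(); ++p) s += *p;
    return s;
}
```

The same source text serves std::vector<int> and std::list<int>, even though the two types are unrelated by inheritance; each use is compiled into code specific to the container.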
To use a concrete type well, the user must understand its particular details. There are (typically)
no general properties that hold for all concrete types in a library that can be relied on to save the
user the bother of knowing the individual classes. This is the price of run-time compactness and
efficiency. Sometimes that is a price well worth paying; sometimes it is not. It can also be the case
that an individual concrete class is easier to understand and use than is a more general (abstract)
class. This is often the case for classes that represent well-known data types such as arrays and
lists.
Note, however, that the ideal is still to hide as much of the implementation as is feasible without
seriously hurting performance. Inline functions can be a great win in this context. Exposing member variables by making them public or by providing set and get functions that allow the user to
manipulate them directly is almost never a good idea (§24.4.2). Concrete types should still be
types and not just bags of bits with a few functions added for convenience.
25.2.1 Reuse of Concrete Types [role.reuse]
Concrete types are rarely useful as bases for further derivation. Each concrete type aims at providing a clean and efficient representation of a single concept. A class that does that well is rarely a
good candidate for the creation of different but related classes through public derivation. Such
classes are more often useful as members or private base classes. There, they can be used effectively without having their interfaces and implementations mixed up with and compromised by
those of the new classes. Consider deriving a new class from Date:
class My_date : public Date {
    // ...
};
Is it ever valid for My_date to be used as a plain Date? Well, that depends on what My_date is,
but in my experience it is rare to find a concrete type that makes a good base class without modification.
A concrete type is ‘‘reused’’ unmodified in the same way as built-in types such as int are
(§10.3.4). For example:
class Date_and_time {
private:
    Date d;
    Time t;
public:
    // ...
};
This form of use (reuse?) is usually simple, effective, and efficient.
Maybe it was a mistake not to design Date to be easy to modify through derivation? It is sometimes asserted that every class should be open to modification by overriding and by access from
derived class member functions. This view leads to a variant of Date along these lines:
class Date2 {
public:
    // public interface, consisting primarily of virtual functions
protected:
    // other implementation details (possibly including some representation)
private:
    // representation and other implementation details
};
To make writing overriding functions easy and efficient, the representation is declared protected.
This achieves the objective of making Date2 arbitrarily malleable by derivation, yet keeping its
user interface unchanged. However, there are costs:
[1] Less efficient basic operations. A C++ virtual function call is a fraction slower than an ordinary function call, virtual functions cannot be inlined as often as non-virtual functions, and a
class with virtual functions typically incurs a one-word space overhead.
[2] The need to use free store. The aim of Date2 is to allow objects of different classes derived
from Date2 to be used interchangeably. Because the sizes of these derived classes differ,
the obvious thing to do is to allocate them on the free store and access them through pointers
or references. Thus, the use of genuine local variables dramatically decreases.
[3] Inconvenience to users. To benefit from the polymorphism provided by the virtual functions, accesses to Date2s must be through pointers or references.
[4] Weaker encapsulation. The virtual operations can be overridden and protected data can be
manipulated from derived classes (§12.4.1.1).
Naturally, these costs are not always significant, and the behavior of a class defined in this way is
often exactly what we want (§25.3, §25.4). However, for a simple concrete type, such as Date2,
the costs are unnecessary and can be significant.
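The space cost mentioned in [1] can be observed directly, although the exact figure is implementation-dependent. In this sketch, Plain_date and Virtual_date are hypothetical classes with identical data members, and virtual_overhead() simply reports the difference in object size:

```cpp
#include <cstddef>

// Hypothetical concrete date: three ints and non-virtual functions.
class Plain_date {
    int d, m, y;
public:
    Plain_date(int dd, int mm, int yy) : d(dd), m(mm), y(yy) { }
    int day() const { return d; }
    int month() const { return m; }
    int year() const { return y; }
};

// Same data, but with a virtual interface in the style of Date2.
class Virtual_date {
    int d, m, y;
public:
    Virtual_date(int dd, int mm, int yy) : d(dd), m(mm), y(yy) { }
    virtual int day() const { return d; }
    virtual int month() const { return m; }
    virtual int year() const { return y; }
    virtual ~Virtual_date() { }
};

// Extra bytes per object attributable to the virtual interface.
// Implementation-dependent; commonly one pointer, possibly plus padding.
std::size_t virtual_overhead()
{
    return sizeof(Virtual_date) - sizeof(Plain_date);
}
```

On common implementations the result is the size of one pointer, sometimes rounded up for alignment. For a type as small as a date, that can be a large fraction of the object.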
Finally, a well-designed concrete type is often the ideal representation for a more malleable
type. For example:
class Date3 {
public:
    // public interface, consisting primarily of virtual functions
private:
    Date d;
};
This is the way to fit concrete types (including built-in types) into a class hierarchy when that is
needed. See also §25.10[1].
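A minimal sketch of this arrangement follows. The three-member Date here is only a stand-in for the Date of §10.3, and Two_digit_year_date and year_of() are hypothetical names introduced to show an override at work:

```cpp
// Hypothetical minimal concrete type standing in for Date (§10.3).
class Date {
    int d, m, y;
public:
    Date(int dd, int mm, int yy) : d(dd), m(mm), y(yy) { }
    int day() const { return d; }
    int month() const { return m; }
    int year() const { return y; }
};

// Malleable type in the style of Date3: virtual interface, concrete representation.
class Date3 {
public:
    Date3(int dd, int mm, int yy) : d(dd, mm, yy) { }
    virtual int year() const { return d.year(); }
    virtual ~Date3() { }
private:
    Date d;   // the representation stays private and concrete
};

// Hypothetical derived class: changes behavior without touching the representation.
class Two_digit_year_date : public Date3 {
public:
    Two_digit_year_date(int dd, int mm, int yy) : Date3(dd, mm, yy) { }
    int year() const { return Date3::year() % 100; }
};

// Written against the polymorphic interface only.
int year_of(const Date3& x) { return x.year(); }
```

Code that wants efficiency and value semantics keeps using Date directly; only code that needs run-time variation pays for Date3.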
25.3 Abstract Types [role.abstract]
The simplest way of loosening the coupling between users of a class and its implementers and also
between code that creates objects and code that uses such objects is to introduce an abstract class
that represents the interface to a set of implementations of a common concept. Consider a naive
Set:
template<class T> class Set {
public:
    virtual void insert(T*) = 0;
    virtual void remove(T*) = 0;
    virtual int is_member(T*) = 0;
    virtual T* first() = 0;
    virtual T* next() = 0;
    virtual ~Set() { }
};
This defines an interface to a set with a built-in notion of iteration over its elements. The absence
of a constructor and the presence of a virtual destructor is typical (§12.4.2). Several implementations are possible (§16.2.1). For example:
template<class T> class List_set : public Set<T>, private list<T> {
    // ...
};
template<class T> class Vector_set : public Set<T>, private vector<T> {
    // ...
};
The abstract class provides the common interface to the implementations. This means we can use a
Set without knowing which kind of implementation is used. For example:
void f(Set<Plane*>& s)
{
    for (Plane** p = s.first(); p; p = s.next()) {
        // my stuff
    }
    // ...
}
List_set<Plane*> sl;
Vector_set<Plane*> v(100);
void g()
{
    f(sl);
    f(v);
}
For concrete types, we required a redesign of the implementation classes to express commonality
and used a template to exploit it. Here, we must design a common interface (in this case Set), but
no commonality beyond the ability to implement the interface is required of the classes used for
implementation.
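The following is one possible way to fill in the elided class bodies, offered as a sketch rather than as the intended implementation. It makes several simplifying assumptions: remove() is dropped, is_member() returns bool, the private bases hold T* rather than T, and the function sum() is a hypothetical user written against Set<int> only:

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

template<class T> class Set {
public:
    virtual void insert(T*) = 0;
    virtual bool is_member(T*) = 0;
    virtual T* first() = 0;
    virtual T* next() = 0;
    virtual ~Set() { }
};

// List-based implementation; cur is the built-in iteration position.
template<class T> class List_set : public Set<T>, private std::list<T*> {
    typename std::list<T*>::iterator cur;
public:
    List_set() : cur(this->end()) { }
    bool is_member(T* p) { return std::find(this->begin(), this->end(), p) != this->end(); }
    void insert(T* p) { if (!is_member(p)) this->push_back(p); }
    T* first() { cur = this->begin(); return cur == this->end() ? 0 : *cur; }
    T* next() { if (cur != this->end()) ++cur; return cur == this->end() ? 0 : *cur; }
};

// Vector-based implementation; pos is the built-in iteration position.
template<class T> class Vector_set : public Set<T>, private std::vector<T*> {
    std::size_t pos;
public:
    Vector_set() : pos(0) { }
    bool is_member(T* p) { return std::find(this->begin(), this->end(), p) != this->end(); }
    void insert(T* p) { if (!is_member(p)) this->push_back(p); }
    T* first() { pos = 0; return pos < this->size() ? this->at(pos) : 0; }
    T* next() { if (pos < this->size()) ++pos; return pos < this->size() ? this->at(pos) : 0; }
};

// Written against Set only; knows nothing about List_set or Vector_set.
int sum(Set<int>& s)
{
    int total = 0;
    for (int* p = s.first(); p; p = s.next()) total += *p;
    return total;
}
```

The two implementations share nothing except the ability to implement Set, yet sum() accepts either.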
Furthermore, users of Set need not know the declarations of List_set and Vector_set, so users
need not depend on these declarations and need not be recompiled or in any way changed if
List_set or Vector_set changes or even if a new implementation of Set – say Tree_set – is
introduced. All dependencies are contained in functions that explicitly use a class derived from Set.
In particular, assuming the conventional use of header files the programmer writing f(Set&) needs
only include Set.h and not List_set.h or Vector_set.h. An ‘‘implementation header’’ is needed
only where a List_set or a Vector_set, respectively, is created. An implementation can be further
insulated from the actual classes by introducing an abstract class that handles requests to create
objects (‘‘a factory;’’ §12.4.4).
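The factory idea can be sketched as follows. To keep the example short it uses a reduced, non-template stand-in for Set; every name in it (Int_set, Set_maker, List_int_set, Vector_int_set, the two maker classes, and count_distinct()) is hypothetical:

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Reduced stand-in for the Set interface (int elements, two operations only).
class Int_set {
public:
    virtual void insert(int) = 0;
    virtual std::size_t size() const = 0;
    virtual ~Int_set() { }
};

// Abstract factory: users ask for "some Int_set" without naming its class.
class Set_maker {
public:
    virtual Int_set* make() const = 0;   // the caller owns the result
    virtual ~Set_maker() { }
};

// --- implementation side: only this part names the concrete classes ---
class List_int_set : public Int_set {
    std::list<int> rep;
public:
    void insert(int x) { if (std::find(rep.begin(), rep.end(), x) == rep.end()) rep.push_back(x); }
    std::size_t size() const { return rep.size(); }
};

class Vector_int_set : public Int_set {
    std::vector<int> rep;
public:
    void insert(int x) { if (std::find(rep.begin(), rep.end(), x) == rep.end()) rep.push_back(x); }
    std::size_t size() const { return rep.size(); }
};

class List_set_maker : public Set_maker {
public:
    Int_set* make() const { return new List_int_set; }
};

class Vector_set_maker : public Set_maker {
public:
    Int_set* make() const { return new Vector_int_set; }
};

// --- user side: depends on Int_set and Set_maker only ---
std::size_t count_distinct(const Set_maker& m, const int* a, std::size_t n)
{
    Int_set* s = m.make();
    for (std::size_t i = 0; i < n; ++i) s->insert(a[i]);   // a throw here would leak s; ignored in this sketch
    std::size_t result = s->size();
    delete s;
    return result;
}
```

Here count_distinct() both creates and uses a set, yet it would not need to be changed or recompiled if a third implementation with its own maker were added.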
This separation of the interface from the implementations implies the absence of access to operations that are ‘‘natural’’ to a particular implementation but not general enough to be part of the
interface. For example, because a Set doesn’t have a notion of ordering we cannot support a subscripting operator in the Set interface even if we happen to be implementing a particular Set using
an array. This implies a run-time cost due to missed hand optimizations. Furthermore, inlining
typically becomes infeasible (except in a local context, when the compiler knows the real type), and
all interesting operations on the interface become virtual function calls. As with concrete types,
sometimes the cost of an abstract type is worth it; sometimes it is not. To sum up, an abstract type
aims to:
[1] define a single concept in a way that allows several implementations of it to coexist in a program;
[2] provide reasonable run-time and space efficiency through the use of virtual functions;
[3] let each implementation have only minimal dependency on other classes; and
[4] be comprehensible in isolation.
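The four aims above can be illustrated with a minimal sketch. The names Int_set, List_int_set, Vector_int_set, fill(), and set_demo() are hypothetical stand-ins chosen for this illustration; they are not the Set, List_set, and Vector_set templates of §25.3, and the standard containers are used only to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Abstract type: the only declaration a user of the concept needs to see.
class Int_set {
public:
    virtual void insert(int x) = 0;
    virtual bool contains(int x) const = 0;
    virtual std::size_t size() const = 0;
    virtual ~Int_set() {}
};

// One implementation, backed by an unsorted list.
class List_int_set : public Int_set {
    std::list<int> rep;
public:
    void insert(int x) { if (!contains(x)) rep.push_back(x); }
    bool contains(int x) const { return std::find(rep.begin(), rep.end(), x) != rep.end(); }
    std::size_t size() const { return rep.size(); }
};

// Another implementation, backed by a sorted vector.
class Vector_int_set : public Int_set {
    std::vector<int> rep;
public:
    void insert(int x)
    {
        std::vector<int>::iterator p = std::lower_bound(rep.begin(), rep.end(), x);
        if (p == rep.end() || *p != x) rep.insert(p, x);
    }
    bool contains(int x) const { return std::binary_search(rep.begin(), rep.end(), x); }
    std::size_t size() const { return rep.size(); }
};

// Depends on Int_set only, so it works for every implementation.
std::size_t fill(Int_set& s)
{
    for (int i = 0; i < 5; ++i) s.insert(i % 3);   // inserts 0, 1, 2, 0, 1
    return s.size();
}

// Both implementations coexist in one program and behave alike through the interface.
void set_demo()
{
    List_int_set ls;
    Vector_int_set vs;
    assert(fill(ls) == 3);
    assert(fill(vs) == 3);
}
```

Only set_demo() names the implementation classes; fill() would not need recompilation if either implementation changed, at the cost of a virtual call per operation.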
Abstract types are not better than concrete types, just different. There are difficult and important
tradeoffs for the user to make. The library provider can dodge the issue by providing both, thus
leaving the choice to the user. The important thing is to be clear about to which world a class
belongs. Limiting the generality of an abstract type in an attempt to compete in speed with a concrete type usually fails. It compromises the ability to use interchangeable implementations without
significant recompilation after changes. Similarly, attempting to provide ‘‘generality’’ in concrete
types to compete with the abstract type notion also usually fails. It compromises the efficiency and
appropriateness of a simple class. The two notions can coexist – indeed, they must coexist because
concrete classes provide the implementations for the abstract types – but they must not be muddled
together.
Abstract types are often not intended to be bases for further derivation beyond their immediate
implementation. Derivation is most often used just to supply implementation. However, a new
interface can be constructed from an abstract class by deriving a more extensive abstract class from
it. This new abstract class must then in turn be implemented through further derivation by a nonabstract class (§15.2.5).
Why didn’t we derive List and Vector classes from Set in the first place to save the introduction
of List_set and Vector_set classes? In other words, why have concrete types when we can have
abstract types?
[1] Efficiency. We want to have concrete types such as vector and list without the overheads
implied by decoupling the implementations from the interfaces (as implied by the abstract
type style).
[2] Reuse. We need a mechanism to fit types designed ‘‘elsewhere’’ (such as vector and list)
into a new library or application by giving them a new interface (rather than rewriting
them).
[3] Multiple interfaces. Using a single common base for a variety of classes leads to fat interfaces (§24.4.3). Often, it is better to provide a new interface to a class used for new purposes (such as a Set interface for a vector) rather than modify its interface to serve multiple
purposes.
Naturally, these points are related. They are discussed in some detail for the Ival_box example
(§12.4.2, §15.2.5) and in the context of container design (§16.2). Using the Set base class would
have resulted in a based-container solution relying on node classes (§25.4).
Section §25.7 describes a more flexible iterator in that the binding of the iterator to the implementation yielding the objects can be specified at the point of initialization and changed at run
time.
25.4 Node Classes [role.node]
A class hierarchy is built with a view of derivation different from the interface/implementer view
used for abstract types. Here, a class is viewed as a foundation on which to build. Even if it is an
abstract class, it usually has some representation and provides some services for its derived classes.
Examples of node classes are Polygon (§12.3), the initial Ival_slider (§12.4.1), and Satellite
(§15.2).
Typically, a class in a hierarchy represents a general concept of which its derived classes can be
seen as specializations. The typical class designed as an integral part of a hierarchy, a node class,
relies on services from base classes to provide its own services. That is, it calls base class member
functions. A typical node class provides not just an implementation of the interface specified by its
base class (the way an implementation class does for an abstract type). It also adds new functions
itself, thus providing a wider interface. Consider Car from the traffic-simulation example in
§24.3.2:
    class Car : public Vehicle {
    public:
        Car(int passengers, Size_category size, int weight, int fc)
            : Vehicle(passengers,size,weight), fuel_capacity(fc) { /* ... */ }

        // override relevant virtual functions from Vehicle:
        void turn(Direction);
        // ...

        // add Car-specific functions:
        virtual void add_fuel(int amount); // a car needs fuel to run
        // ...
    };
The important functions are the constructor through which the programmer specifies the basic properties that are relevant to the simulation and the (virtual) functions that allow the simulation routines to manipulate a Car without knowing its exact type. A Car might be created and used like
this:
    void user()
    {
        // ...
        Car* p = new Car(3,economy,1500,60);
        drive(p,bs_home,MH); // enter into simulated traffic pattern
        // ...
    }
A node class usually needs constructors and often a nontrivial constructor. In this, node classes differ from abstract types, which rarely have constructors.
The operations on Car will typically use operations from the base class Vehicle in their implementations. In addition, the user of a Car relies on services from its base classes. For example,
Vehicle provides the basic functions dealing with weight and size so that Car doesn’t have to:
    bool Bridge::can_cross(const Vehicle& r)
    {
        if (max_weight<r.weight()) return false;
        // ...
    }
This allows programmers to create new classes such as Car and Truck from a node class Vehicle
by specifying and implementing only what needs to differ from Vehicle. This is often referred to as
‘‘programming by difference’’ or ‘‘programming by extension.’’
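Programming by difference can be made concrete with a small self-contained sketch. The classes here are simplified, hypothetical versions of Vehicle, Car, Truck, and Bridge: the constructor arguments, kind(), and fuel_capacity() are assumptions made for the illustration and are not the declarations from §24.3.2.

```cpp
#include <cassert>
#include <string>

class Vehicle {
    int wt;                                               // weight in kg
public:
    explicit Vehicle(int w) : wt(w) {}
    int weight() const { return wt; }                     // service shared by all vehicles
    virtual std::string kind() const { return "vehicle"; }
    virtual ~Vehicle() {}
};

class Car : public Vehicle {
    int fuel;
public:
    Car(int w, int fc) : Vehicle(w), fuel(fc) {}
    std::string kind() const { return "car"; }            // override what differs
    virtual int fuel_capacity() const { return fuel; }    // widen the interface
};

class Truck : public Vehicle {
public:
    explicit Truck(int w) : Vehicle(w) {}
    std::string kind() const { return "truck"; }          // only the difference is written
};

class Bridge {
    int max_weight;
public:
    explicit Bridge(int m) : max_weight(m) {}
    // Uses only the Vehicle interface; Car and Truck supply nothing extra for it.
    bool can_cross(const Vehicle& r) const { return r.weight() <= max_weight; }
};

void vehicle_demo()
{
    Bridge b(3000);
    Car c(1500, 60);
    Truck t(12000);
    assert(b.can_cross(c));
    assert(!b.can_cross(t));
}
```

Neither Car nor Truck mentions weight handling; both inherit it, which is the sense in which a node class relies on its base for services to its own users.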
Like many node classes, a Car is itself a good candidate for further derivation. For example, an
Ambulance needs additional data and operations to deal with emergencies:
    class Ambulance : public Car, public Emergency {
    public:
        Ambulance();

        // override relevant Car virtual functions:
        void turn(Direction);
        // ...

        // override relevant Emergency virtual functions:
        virtual void dispatch_to(const Location&);
        // ...

        // add Ambulance-specific functions:
        virtual int patient_capacity(); // number of stretchers
        // ...
    };
To sum up, a node class
[1] relies on its base classes both for its implementation and for supplying services to its users;
[2] provides a wider interface (that is, an interface with more public member functions) to its
users than do its base classes;
[3] relies primarily (but not necessarily exclusively) on virtual functions in its public interface;
[4] depends on all of its (direct and indirect) base classes;
[5] can be understood only in the context of its base classes;
[6] can be used as a base for further derivation; and
[7] can be used to create objects.
Not every node class will conform to all of points 1, 2, 6, and 7, but most do. A class that does not
conform to point 6 resembles a concrete type and could be called a concrete node class. For example, a concrete node class can be used to implement an abstract class (§12.4.2) and variables of such
a class can be allocated statically and on the stack. Such a class is sometimes called a leaf class.
However, any code operating on a pointer or reference to a class with virtual functions must take
into account the possibility of a further derived class (or assume without language support that further derivation hasn’t happened). A class that does not conform to point 7 resembles an abstract
type and could be called an abstract node class. Because of unfortunate traditions, many node
classes have at least some protected members to provide a less restricted interface for derived
classes (§12.4.1.1).
Point 4 implies that to compile a node class, a programmer must include the declarations of all
of its direct and indirect base classes and all of the declarations that they, in turn, depend on. In
this, a node class again provides a contrast to an abstract type. A user of an abstract type does not
depend on the classes used to implement it and need not include them to compile.
25.4.1 Changing Interfaces [role.io]
By definition, a node class is part of a class hierarchy. Not every class in a hierarchy needs to offer
the same interface. In particular, a derived class can provide more member functions, and a sibling
class can provide a completely different set of functions. From a design perspective, dynamic_cast
(§15.4) can be seen as a mechanism for asking an object if it provides a given interface.
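A minimal sketch of dynamic_cast used as an interface query follows. The classes are hypothetical, cut-down stand-ins (Io_triangle, Io_text, and sides() are invented for this illustration); the only assumption that matters is that the class being queried is polymorphic.

```cpp
#include <cassert>

class Io_obj {
public:
    virtual ~Io_obj() {}               // a virtual function makes Io_obj polymorphic
};

class Shape {
public:
    virtual int sides() const = 0;
    virtual ~Shape() {}
};

class Io_triangle : public Shape, public Io_obj {
public:
    int sides() const { return 3; }
};

class Io_text : public Io_obj { };     // an Io_obj that offers no Shape interface

// Asks the object whether it provides the Shape interface.
// Returns the number of sides if it does, -1 if it does not.
int sides_or_minus_one(Io_obj* p)
{
    if (Shape* sp = dynamic_cast<Shape*>(p))   // cross-cast from one base to a sibling base
        return sp->sides();
    return -1;
}

void cast_demo()
{
    Io_triangle t;
    Io_text x;
    assert(sides_or_minus_one(&t) == 3);
    assert(sides_or_minus_one(&x) == -1);
}
```

The caller never names Io_triangle; it learns at run time whether the interface is available and falls back otherwise.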
As an example, consider a simple object I/O system. Users want to read objects from a stream,
determine that they are of the expected types, and then use them. For example:
    void user()
    {
        // ... open file assumed to hold shapes, and attach ss as an istream for that file ...

        Io_obj* p = get_obj(ss); // read object from stream

        if (Shape* sp = dynamic_cast<Shape*>(p)) {
            sp->draw(); // use the Shape
            // ...
        }
        else {
            // oops: non-shape in Shape file
        }
    }
The function user() deals with shapes exclusively through the abstract class Shape and can therefore use every kind of shape. The use of dynamic_cast is essential because the object I/O system
can deal with many other kinds of objects and the user may accidentally have opened a file containing perfectly good objects of classes that the user has never heard of.
This object I/O system assumes that every object read or written is of a class derived from
Io_obj. Class Io_obj must be a polymorphic type to allow us to use dynamic_cast. For example:
    class Io_obj {
    public:
        virtual Io_obj* clone() const =0; // polymorphic
        virtual ~Io_obj() {}
    };
The critical function in the object I/O system is get_obj(), which reads data from an istream and
creates class objects based on that data. Assume that the data representing an object on an input
stream is prefixed by a string identifying the object’s class. The job of get_obj() is to read that
string prefix and call a function capable of creating and initializing an object of the right class. For
example:
    typedef Io_obj* (*PF)(istream&); // pointer to function returning an Io_obj*

    map<string,PF> io_map; // maps strings to creation functions

    bool get_word(istream& is, string& s); // read a word from is into s

    Io_obj* get_obj(istream& s)
    {
        string str;
        bool b = get_word(s,str); // read initial word into str
        if (b == false) throw No_class(); // io format problem

        PF f = io_map[str]; // lookup ‘str’ to get function
        if (f == 0) throw Unknown_class(); // no match for ‘str’
        return f(s); // construct object from stream
    }
The map called io_map holds pairs of name strings and functions that can construct objects of the
class with that name.
We could define class Shape in the usual way, except for deriving it from Io_obj as required by
user():
    class Shape : public Io_obj {
        // ...
    };
However, it would be more interesting (and in many cases more realistic) to use a defined Shape
(§2.6.2) unchanged:
    class Io_circle : public Circle, public Io_obj {
    public:
        Io_circle* clone() const { return new Io_circle(*this); } // using copy constructor
        Io_circle(istream&); // initialize from input stream
        static Io_obj* new_circle(istream& s) { return new Io_circle(s); }
        // ...
    };
This is an example of how a class can be fitted into a hierarchy using an abstract class with less
foresight than would have been required to build it as a node class in the first place (§12.4.2,
§25.3).
The Io_circle(istream&) constructor initializes an object with data from its istream argument.
The new_circle() function is the one put into the io_map to make the class known to the object
I/O system. For example:
    io_map["Io_circle"]=&Io_circle::new_circle;
Other shapes are constructed in the same way:
    class Io_triangle : public Triangle, public Io_obj {
        // ...
    };
If the provision of the object I/O scaffolding becomes tedious, a template might help:
    template<class T> class Io : public T, public Io_obj {
    public:
        Io* clone() const { return new Io(*this); } // override Io_obj::clone()

        Io(istream&); // initialize from input stream

        static Io* new_io(istream& s) { return new Io(s); }
        // ...
    };
Given this, we can define Io_circle:
    typedef Io<Circle> Io_circle;
We still need to define Io<Circle>::Io(istream&) explicitly, though, because it needs to know
about the details of Circle.
The Io template is an example of a way to fit concrete types into a class hierarchy by providing
a handle that is a node in that hierarchy. It derives from its template parameter to allow casting
from Io_obj. Unfortunately, this precludes using Io for a built-in type:
    typedef Io<Date> Io_date; // wrap concrete type
    typedef Io<int> Io_int; // error: cannot derive from built-in type
This problem can be handled by providing a separate template for built-in types or by using a class
representing a built-in type (§25.10[1]).
This simple object I/O system may not do everything anyone ever wanted, but it almost fits on a
single page and the key mechanisms have many uses. In general, these techniques can be used to
invoke a function based on a string supplied by a user and to manipulate objects of unknown type
through interfaces discovered through run-time type identification.
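The pieces of the object I/O system can be assembled into one small compilable sketch. Several details are assumptions made to keep it self-contained: Circle is a hypothetical one-member class, the stream format is a class name followed by a radius, io_map is wrapped in a function, clone() is omitted, and the lookup uses find() rather than operator[] so that a failed lookup does not insert an entry. read_radius() and io_demo() exist only for the demonstration.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

class Io_obj {
public:
    virtual ~Io_obj() {}                 // polymorphic, so dynamic_cast can be used
};

class Circle {                           // an existing concrete type, used unchanged
public:
    int radius;
    Circle() : radius(0) {}
};

class Io_circle : public Circle, public Io_obj {
public:
    explicit Io_circle(std::istream& s) { s >> radius; }   // initialize from input stream
    static Io_obj* new_circle(std::istream& s) { return new Io_circle(s); }
};

struct No_class {};                      // thrown when no class name can be read
struct Unknown_class {};                 // thrown when the class name is not registered

typedef Io_obj* (*PF)(std::istream&);    // pointer to creation function

std::map<std::string, PF>& io_map()      // maps class names to creation functions
{
    static std::map<std::string, PF> m;
    return m;
}

Io_obj* get_obj(std::istream& s)
{
    std::string name;
    if (!(s >> name)) throw No_class();                      // io format problem
    std::map<std::string, PF>::const_iterator p = io_map().find(name);
    if (p == io_map().end()) throw Unknown_class();          // no match for name
    return p->second(s);                                     // construct object from stream
}

// Reads one object from text; returns its radius if it is a Circle, -1 otherwise.
int read_radius(const std::string& text)
{
    io_map()["Io_circle"] = &Io_circle::new_circle;          // make the class known
    std::istringstream in(text);
    Io_obj* p = get_obj(in);
    Circle* c = dynamic_cast<Circle*>(p);                    // does it offer the Circle interface?
    int r = c ? c->radius : -1;
    delete p;
    return r;
}

void io_demo()
{
    assert(read_radius("Io_circle 7") == 7);
}
```

The same shape of code serves any case where a string supplied at run time selects the function to call.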
25.5 Actions [role.action]
The simplest and most obvious way to specify an action in C++ is to write a function. However, if
an action has to be delayed, has to be transmitted ‘‘elsewhere’’ before being performed, requires its
own data, has to be combined with other actions (§25.10[18,19]), etc., then it often becomes attractive to provide the action in the form of a class that can execute the desired action and provide other
services as well. A function object used with the standard algorithms is an obvious example
(§18.4), and so are the manipulators used with iostreams (§21.4.6). In the former case, the actual
action is performed by the application operator, and in the latter case, by the << or >> operators. In
the case of Form (§21.4.6.3) and Matrix (§22.4.7), compositor classes were used to delay execution until sufficient information had been gathered for efficient execution.
A common form of action class is a simple class containing just one virtual function (typically
called something like ‘‘do_it’’):
    class Action {
    public:
        virtual int do_it(int) = 0;
        virtual ~Action() { }
    };
Given this, we can write code – say a menu – that can store actions for later execution without
using pointers to functions, without knowing anything about the objects invoked, and without even
knowing the name of the operation it invokes. For example:
    class Write_file : public Action {
        File& f;
    public:
        int do_it(int) { return f.write().succeed(); }
    };

    class Error_response : public Action {
        string message;
    public:
        int do_it(int);
    };

    int Error_response::do_it(int)
    {
        Response_box db(message.c_str(), "continue","cancel","retry");

        switch (db.get_response()) {
        case 0:
            return 0;
        case 1:
            abort();
        case 2:
            current_operation.redo();
            return 1;
        }
    }

    Action* actions[] = {
        new Write_file(f),
        new Error_response("you blew it again"),
        // ...
    };
A user of Action can be completely insulated from any knowledge of derived classes such as
Write_file and Error_response.
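Because File and Response_box are not defined here, the following sketch swaps in two hypothetical arithmetic actions, Add and Scale, to show the same insulation in compilable form. run_all() plays the part of the menu: it stores and invokes actions knowing only the Action interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Action {
public:
    virtual int do_it(int) = 0;
    virtual ~Action() {}
};

class Add : public Action {          // an action that carries its own data
    int n;
public:
    explicit Add(int nn) : n(nn) {}
    int do_it(int x) { return x + n; }
};

class Scale : public Action {
    int k;
public:
    explicit Scale(int kk) : k(kk) {}
    int do_it(int x) { return x * k; }
};

// Executes the stored actions in order, feeding each result to the next.
// Nothing here depends on Add or Scale.
int run_all(const std::vector<Action*>& actions, int x)
{
    for (std::size_t i = 0; i < actions.size(); ++i) {
        assert(actions[i] != 0);
        x = actions[i]->do_it(x);
    }
    return x;
}
```

For instance, storing an Add(2) followed by a Scale(10) and calling run_all() with 1 yields (1+2)*10; the actions were created earlier and executed later, which is the delayed execution the text describes.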
This is a powerful technique that should be treated with some care by people with a background
in functional decomposition. If too many classes start looking like Action, the overall design of the
system may have deteriorated into something unduly functional.
Finally, a class can encode an operation for execution on a remote machine or for storage for
future use (§25.10[18]).
25.6 Interface Classes [role.interface]
One of the most important kinds of classes is the humble and mostly overlooked interface class.
An interface class doesn’t do much – if it did, it wouldn’t be an interface class. It simply adjusts
the appearance of some service to local needs. Because it is impossible in principle to serve all
needs equally well all the time, interface classes are essential to allow sharing without forcing all
users into a common straitjacket.
The purest form of an interface doesn’t even cause any code to be generated. Consider the
Vector specialization from §13.5:
    template<class T> class Vector<T*> : private Vector<void*> {
    public:
        typedef Vector<void*> Base;

        Vector() : Base() {}
        Vector(int i) : Base(i) {}

        T*& operator[](int i) { return reinterpret_cast<T*&>(Base::operator[](i)); }

        // ...
    };
This (partial) specialization turns the unsafe Vector<void*> into a much more useful family of
type-safe vector classes. Inline functions are often essential for making interface classes affordable.
In cases such as this, when an inline forwarding function does only type adjustment, there is no
added overhead in time or space.
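The same idea can be sketched without the Vector template of §13.5. Void_vector and Ptr_vector are hypothetical names, and this variant deliberately uses get() and set() instead of a subscript operator returning T*&, so that only static_cast between void* and T* is needed; it illustrates the technique and is not the specialization shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Untyped implementation, compiled once and shared by all pointer vectors.
class Void_vector {
    std::vector<void*> v;
public:
    explicit Void_vector(std::size_t n) : v(n, static_cast<void*>(0)) {}
    void*& elem(std::size_t i) { assert(i < v.size()); return v[i]; }
    std::size_t size() const { return v.size(); }
};

// Type-safe interface: inline forwarding functions that only adjust types.
template<class T> class Ptr_vector : private Void_vector {
public:
    explicit Ptr_vector(std::size_t n) : Void_vector(n) {}
    T* get(std::size_t i) { return static_cast<T*>(elem(i)); }
    void set(std::size_t i, T* p) { elem(i) = p; }
    using Void_vector::size;         // re-export the unchanged part of the interface
};
```

A Ptr_vector<int> accepts only int* in set() and hands back int* from get(), so a mismatched pointer type is rejected at compile time while every instantiation reuses the one Void_vector implementation.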
Naturally, an abstract base class representing an abstract type implemented by concrete types
(§25.2) is a form of interface class, as are the handles from §25.7. However, here we will focus on
classes that have no more specific function than adjusting an interface.
Consider the problem of merging two hierarchies using multiple inheritance. What can be done
if there is a name clash, that is, two classes have used the same name for virtual functions performing completely different operations? For example, consider a Wild-West videogame in which user
interactions are handled by a general window class:
    class Window {
        // ...
        virtual void draw(); // display image
    };

    class Cowboy {
        // ...
        virtual void draw(); // pull gun from holster
    };
    class Cowboy_window : public Cowboy, public Window {
        // ...
    };
A Cowboy_window represents the animation of a cowboy in the game and handles the
user/player’s interactions with the cowboy character. We would prefer to use multiple inheritance,
rather than declaring either the Window or the Cowboy as members, because there will be many
service functions defined for both Windows and Cowboys. We would like to pass a
Cowboy_window to such functions without special actions required by the programmer. However,
this leads to a problem defining Cowboy_window versions of Cowboy::draw() and
Window::draw().
There can be only one function defined in Cowboy_window called draw(). Yet because service functions manipulate Windows and Cowboys without knowledge of Cowboy_windows,
Cowboy_window must override both Cowboy’s draw() and Window’s draw(). Overriding both
functions by a single draw() function would be wrong because, despite the common name, the
draw() functions are unrelated and cannot be redefined by a common function.
Finally, we would also like Cowboy_window to have distinct, unambiguous names for the
inherited functions Cowboy::draw() and Window::draw().
To solve this problem, we need to introduce an extra class for Cowboy and an extra class for
Window. These classes introduce the two new names for the draw() functions and ensure that a
call of the draw() functions in Cowboy and Window calls the functions with the new names:
    class CCowboy : public Cowboy { // interface to Cowboy renaming draw()
    public:
        virtual void cow_draw() = 0;
        void draw() { cow_draw(); } // override Cowboy::draw()
    };

    class WWindow : public Window { // interface to Window renaming draw()
    public:
        virtual void win_draw() = 0;
        void draw() { win_draw(); } // override Window::draw()
    };
We can now compose a Cowboy_window from the interface classes CCowboy and WWindow and
override cow_draw() and win_draw() with the desired effect:
    class Cowboy_window : public CCowboy, public WWindow {
        // ...
        void cow_draw();
        void win_draw();
    };
Note that this problem was serious only because the two draw() functions have the same argument type. If they have different argument types, the usual overloading resolution rules will ensure
that no problem manifests itself despite the unrelated functions having the same name.
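The renaming technique can be checked with a small compilable sketch. To keep it self-contained, Window::draw() and Cowboy::draw() are made pure virtual, Cowboy_window records what it did in a hypothetical public log string, and shoot() and repaint() stand in for the service functions; none of these details come from the game being described.

```cpp
#include <cassert>
#include <string>

class Window {
public:
    virtual void draw() = 0;          // display image
    virtual ~Window() {}
};

class Cowboy {
public:
    virtual void draw() = 0;          // pull gun from holster
    virtual ~Cowboy() {}
};

class CCowboy : public Cowboy {       // interface to Cowboy renaming draw()
public:
    virtual void cow_draw() = 0;
    void draw() { cow_draw(); }       // override Cowboy::draw()
};

class WWindow : public Window {       // interface to Window renaming draw()
public:
    virtual void win_draw() = 0;
    void draw() { win_draw(); }       // override Window::draw()
};

class Cowboy_window : public CCowboy, public WWindow {
public:
    std::string log;                  // records which operation ran
    void cow_draw() { log += "gun;"; }
    void win_draw() { log += "image;"; }
};

// Service functions that know nothing about Cowboy_window.
void shoot(Cowboy& c) { c.draw(); }
void repaint(Window& w) { w.draw(); }

void cowboy_demo()
{
    Cowboy_window cw;
    shoot(cw);                        // reaches cow_draw() through CCowboy::draw()
    repaint(cw);                      // reaches win_draw() through WWindow::draw()
    assert(cw.log == "gun;image;");
}
```

Each call of draw() through a base reference is routed to the right renamed function, so the two unrelated operations stay separate even though they share a name in the bases.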
For each use of an interface class, one could imagine a special-purpose language extension that
could perform the desired adjustment a little bit more efficiently or a little more elegantly. However, each use of an interface class is infrequent and supporting them all with specialized language
constructs would impose a prohibitive burden of complexity. In particular, name clashes arising
from the merging of class hierarchies are not common (compared with how often a programmer
will write a class) and tend to arise from the merging of hierarchies generated from dissimilar cultures – such as games and window systems. Merging such dissimilar hierarchies is not easy, and
resolving name clashes will more often than not be the least of the programmer’s problems. Other
problems include dissimilar error handling, dissimilar initialization, and dissimilar memory-management strategies. The resolution of name clashes is discussed here because the technique of
introducing an interface class with a forwarding function has many other applications. It can be
used not only to change names, but also to change argument and return types, to introduce run-time
checking, etc.
Because the forwarding functions CCowboy::draw() and WWindow::draw() are virtual
functions, they cannot be optimized away by simple inlining. It is, however, possible for a compiler to recognize them as simple forwarding functions and then optimize them out of the call
chains that go through them.
25.6.1 Adjusting Interfaces [role.range]
A major use of interface functions is to adjust an interface to match users’ expectations better, thus
moving code that would have been scattered throughout a user’s code into an interface. For example, the standard vector is zero-based. Users who want ranges other than 0 to size-1 must adjust
their usage. For example:
    void f()
    {
        vector<int> v(10); // range [0:9]

        // pretend v is in the range [1:10]:

        for (int i = 1; i<=10; i++) {
            v[i-1] = 7; // remember to adjust index
            // ...
        }
    }
A better way is to provide a vector with arbitrary bounds:
    class Vector : public vector<int> {
        int lb;
    public:
        Vector(int low, int high) : vector<int>(high-low+1) { lb=low; }

        int& operator[](int i) { return vector<int>::operator[](i-lb); }

        int low() { return lb; }
        int high() { return lb+size()-1; }
    };
A Vector can be used like this:
    void g()
    {
        Vector v(1,10); // range [1:10]

        for (int i = 1; i<=10; i++) {
            v[i] = 7;
            // ...
        }
    }
This imposes no overhead compared to the previous example. Clearly, the Vector version is easier
to read and write and is less error-prone.
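As a self-contained variant of the bounds-adjusting Vector, the sketch below adds the std:: qualification and const accessors, and exercises a range with a negative lower bound. sum_filled() and bounds_demo() are hypothetical helpers added for the demonstration only.

```cpp
#include <cassert>
#include <vector>

class Vector : public std::vector<int> {
    int lb;                                           // lower bound of the index range
public:
    Vector(int low, int high) : std::vector<int>(high - low + 1), lb(low) {}

    int& operator[](int i) { return std::vector<int>::operator[](i - lb); }

    int low() const { return lb; }
    int high() const { return lb + static_cast<int>(size()) - 1; }
};

// Fills v[low..high] with the index values and returns their sum.
int sum_filled(int low, int high)
{
    Vector v(low, high);
    for (int i = v.low(); i <= v.high(); ++i) v[i] = i;
    int s = 0;
    for (int i = v.low(); i <= v.high(); ++i) s += v[i];
    return s;
}

void bounds_demo()
{
    assert(sum_filled(1, 10) == 55);                  // range [1:10]
    assert(sum_filled(-2, 2) == 0);                   // range [-2:2]
}
```

The index adjustment is written once, in operator[], instead of at every use.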
Interface classes are usually rather small and (by definition) do rather little. However, they crop
up wherever software written according to different traditions needs to cooperate because then there
is a need to mediate between different conventions. For example, interface classes are often used to
provide C++ interfaces to non-C++ code and to insulate application code from the details of
libraries (to leave open the possibility of replacing the library with another).
Another important use of interface classes is to provide checked or restricted interfaces. For
example, it is not uncommon to have integer variables that are supposed to have values in a given
range only. This can be enforced (at run time) by a simple template:
    template<int low, int high> class Range {
        int val;
    public:
        class Error { }; // exception class
        Range(int i) { Assert<Error>(low<=i&&i<high); val = i; } // see §24.3.7.2
        Range operator=(int i) { return *this=Range(i); }
        operator int() { return val; }
        // ...
    };

    void f(Range<2,17>);
    void g(Range<-10,10>);

    void h(int x)
    {
        Range<0,2001> i = x; // might throw Range::Error
        int i1 = i;

        f(3);
        f(17); // throws Range::Error
        g(-7);
        g(100); // throws Range::Error
    }
The Range template is easily extended to handle ranges of arbitrary scalar types (§25.10[7]).
An interface class that controls access to another class or adjusts its interface is sometimes
called a wrapper.
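The Range template depends on Assert from §24.3.7.2, which is not reproduced here. The following sketch replaces that call with an explicit test and throw so that it compiles on its own; rejected() and range_demo() are hypothetical helpers for the demonstration. As in the original, the range is half-open: low is accepted and high is not.

```cpp
#include <cassert>

template<int low, int high> class Range {
    int val;
public:
    class Error { };                                   // exception class
    Range(int i) : val(i) { if (!(low <= i && i < high)) throw Error(); }
    Range& operator=(int i) { return *this = Range(i); }   // checks, then copy-assigns
    operator int() const { return val; }
};

// Returns true if i is rejected by Range<low,high>, false if it is accepted.
template<int low, int high> bool rejected(int i)
{
    try {
        Range<low, high> r = i;
        (void)r;
        return false;
    }
    catch (typename Range<low, high>::Error&) {
        return true;
    }
}

void range_demo()
{
    assert(!(rejected<2, 17>(3)));
    assert((rejected<2, 17>(17)));      // high itself is outside the range
    assert((rejected<-10, 10>(100)));
    Range<0, 2001> r = 42;
    int i1 = r;                         // converts back to int
    assert(i1 == 42);
}
```

The check runs at every construction and assignment, so an out-of-range value is caught where it is introduced rather than where it is later used.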
25.7 Handle Classes [role.handle]
An abstract type provides an effective separation between an interface and its implementations.
However, as used in §25.3 the connection between an interface provided by an abstract type and its
implementation provided by a concrete type is permanent. For example, it is not possible to rebind
an abstract iterator from one source – say, a set – to another – say, a stream – once the original
source becomes exhausted.
Furthermore, unless one manipulates an object implementing an abstract class through pointers
or references, the benefits of virtual functions are lost. User code may become dependent on details
of the implementation classes because an abstract type cannot be allocated statically or on the stack
(including being accepted as a by-value argument) without its size being known. Using pointers
and references implies that the burden of memory management falls on the user.
Another limitation of the abstract class approach is that a class object is of fixed size. Classes,
however, are used to represent concepts that require varying amounts of storage to implement them.
A popular technique for dealing with these issues is to separate what is used as a single object
into two parts: a handle providing the user interface and a representation holding all or most of the
object’s state. The connection between the handle and the representation is typically a pointer in
the handle. Often, handles have a bit more data than the simple representation pointer, but not
much more. This implies that the layout of a handle is typically stable even when the representation changes and also that handles are small enough to move around relatively freely so that pointers and references need not be used by the user.
[Figure: a Handle containing a pointer to its Representation.]
The String from §11.12 is a simple example of a handle. The handle provides an interface to,
access control for, and memory management for the representation. In this case, both the handle
and the representation are concrete types, but the representation class is often an abstract class.
Consider the abstract type Set from §25.3. How could one provide a handle for it, and what
benefits and cost would that involve? Given a set class, one might simply define a handle by overloading the -> operator:
    template<class T> class Set_handle {
        Set<T>* rep;
    public:
        Set<T>* operator->() { return rep; }

        Set_handle(Set<T>* pp) : rep(pp) { }
    };
This doesn’t significantly affect the way Sets are used; one simply passes Set_handles around
instead of Set&s or Set*s. For example:
    void f(Set_handle<int> s)
    {
        for (int* p = s->first(); p; p = s->next())
        {
            // ...
        }
    }

    void user()
    {
        Set_handle<int> sl(new List_set<int>);
        Set_handle<int> v(new Vector_set<int>(100));

        f(sl);
        f(v);
    }
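The following is a minimal, self-contained sketch of the same idea. The Set base class here is a cut-down stand-in for the one in §25.3, and Array_set, sum(), and demo_sum() are hypothetical names introduced only for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Cut-down stand-in for the abstract Set of §25.3 (assumption: only first()/next()).
template<class T> class Set {
public:
    virtual T* first() = 0;
    virtual T* next() = 0;
    virtual ~Set() { }
};

// Hypothetical fixed-capacity implementation, used only by this sketch.
template<class T> class Array_set : public Set<T> {
    T elems[8];
    std::size_t size, pos;
public:
    Array_set() : size(0), pos(0) { }
    void insert(const T& t) { if (size < 8) elems[size++] = t; }
    T* first() { pos = 0; return next(); }
    T* next() { return pos < size ? &elems[pos++] : 0; }
};

template<class T> class Set_handle {
    Set<T>* rep;
public:
    Set<T>* operator->() { return rep; }
    Set_handle(Set<T>* pp) : rep(pp) { }
};

// The handle is passed by value; only the representation pointer is copied.
int sum(Set_handle<int> s)
{
    int total = 0;
    for (int* p = s->first(); p; p = s->next()) total += *p;
    return total;
}

int demo_sum()
{
    Array_set<int> a;
    a.insert(1); a.insert(2); a.insert(3);
    return sum(Set_handle<int>(&a));
}
```

Note that this simple handle does not own its representation, so the set here can safely live on the stack.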
Often, we want a handle to do more than just provide access. For example, if the Set class and the
Set_handle class are designed together it is easy to do reference counting by including a use count
in each Set. In general, we do not want to design a handle together with what it is a handle to, so
we will have to store any information that needs to be shared by a handle in a separate object. In
other words, we would like to have non-intrusive handles in addition to the intrusive ones. For
example, here is a handle that removes an object when its last handle goes away:
    template<class X> class Handle {
        X* rep;
        int* pcount;
    public:
        X* operator->() { return rep; }

        Handle(X* pp) : rep(pp), pcount(new int(1)) { }
        Handle(const Handle& r) : rep(r.rep), pcount(r.pcount) { (*pcount)++; }

        Handle& operator=(const Handle& r)
        {
            if (rep == r.rep) return *this;
            if (--(*pcount) == 0) {
                delete rep;
                delete pcount;
            }
            rep = r.rep;
            pcount = r.pcount;
            (*pcount)++;
            return *this;
        }

        ~Handle() { if (--(*pcount) == 0) { delete rep; delete pcount; } }

        // ...
    };
Such a handle can be passed around freely. For example:
    void f1(Handle<Set>);

    Handle<Set> f2()
    {
        Handle<Set> h(new List_set<int>);
        // ...
        return h;
    }

    void g()
    {
        Handle<Set> hh = f2();
        f1(hh);
        // ...
    }
Here, the set created in f2() will be deleted upon exit from g() – unless f1() held on to a copy;
the programmer does not need to know.
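To make the lifetime rule concrete, here is a small sketch under stated assumptions: Tracked is a hypothetical representation class that counts live objects, and use_count() is an accessor added only so the count can be observed. Neither is part of the Handle shown in the text.

```cpp
#include <cassert>

template<class X> class Handle {
    X* rep;
    int* pcount;
public:
    X* operator->() { return rep; }
    Handle(X* pp) : rep(pp), pcount(new int(1)) { }
    Handle(const Handle& r) : rep(r.rep), pcount(r.pcount) { ++*pcount; }
    Handle& operator=(const Handle& r)
    {
        if (rep == r.rep) return *this;
        if (--*pcount == 0) { delete rep; delete pcount; }
        rep = r.rep;
        pcount = r.pcount;
        ++*pcount;
        return *this;
    }
    ~Handle() { if (--*pcount == 0) { delete rep; delete pcount; } }
    int use_count() const { return *pcount; }   // assumption: added for observation only
};

// Hypothetical representation that records how many objects are alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

Handle<Tracked> make() { Handle<Tracked> h(new Tracked); return h; }

int seen_count = 0;
void inspect(Handle<Tracked> h) { seen_count = h.use_count(); }   // the by-value copy raises the count

int live_after_scope()
{
    {
        Handle<Tracked> hh = make();
        inspect(hh);            // two handles exist during the call
    }                           // last handle goes away: the Tracked object is deleted
    return Tracked::live;
}
```

After live_after_scope() returns, no Tracked object remains, and inspect() observed a count of two while its parameter was alive.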
Naturally, this convenience comes at a cost, but for many applications the cost of storing and
maintaining the use count is acceptable.
Sometimes, it is useful to extract the representation pointer from a handle and use it directly.
For example, this would be needed to pass an object to a function that does not know about handles. This works nicely provided the called function does not destroy the object passed to it or
store a pointer to it for use after returning to its caller. An operation for rebinding a handle to a new
representation can also be useful:
    template<class X> class Handle {
        // ...

        X* get_rep() { return rep; }

        void bind(X* pp)
        {
            if (pp != rep) {
                if (--*pcount == 0) {
                    delete rep;
                    *pcount = 1;            // recycle pcount
                }
                else
                    pcount = new int(1);    // new pcount
                rep = pp;
            }
        }
    };
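A brief sketch of how get_rep() and bind() might be exercised follows. The functions twice() and demo_bind() are hypothetical, and assignment is deliberately left out of this cut-down Handle to keep the example short:

```cpp
#include <cassert>

template<class X> class Handle {
    X* rep;
    int* pcount;
    Handle& operator=(const Handle&);   // not needed for this sketch, so not provided
public:
    Handle(X* pp) : rep(pp), pcount(new int(1)) { }
    Handle(const Handle& r) : rep(r.rep), pcount(r.pcount) { ++*pcount; }
    ~Handle() { if (--*pcount == 0) { delete rep; delete pcount; } }

    X* get_rep() { return rep; }

    void bind(X* pp)
    {
        if (pp != rep) {
            if (--*pcount == 0) { delete rep; *pcount = 1; }   // sole owner: recycle pcount
            else pcount = new int(1);                          // shared: detach with a new pcount
            rep = pp;
        }
    }
};

// A function that knows nothing about handles; it neither deletes nor stores p.
int twice(const int* p) { return 2 * *p; }

int demo_bind()
{
    Handle<int> a(new int(1));
    Handle<int> b(a);          // a and b share the int holding 1
    b.bind(new int(20));       // b detaches; a still refers to the int holding 1
    a.bind(new int(3));        // a was the sole owner, so the old int is deleted
    return twice(a.get_rep()) + twice(b.get_rep());   // 2*3 + 2*20
}
```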
Note that derivation of new classes from Handle isn’t particularly useful. It is a concrete type
without virtual functions. The idea is to have one handle class for a family of classes defined by a
base class. Derivation from this base class can be a powerful technique. It applies to node classes
as well as to abstract types.
As written, Handle doesn’t deal with inheritance. To get a class that acts like a genuine use-counted pointer, Handle needs to be combined with Ptr from §13.6.3.1 (see §25.10[2]).
A handle that provides an interface that is close to identical to the class for which it is a handle
is often called a proxy. This is particularly common for handles that refer to an object on a remote
machine.
25.7.1 Operations in Handles [role.handle.op]
Overloading -> enables a handle to gain control and do some work on each access to an object.
For example, one could collect statistics about the number of uses of the object accessed through a
handle:
    template <class T> class Xhandle {
        T* rep;
        int no_of_accesses;
    public:
        T* operator->() { no_of_accesses++; return rep; }

        // ...
    };
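As a sketch of how such a counting handle might be used: the constructor and the accesses() accessor below are assumptions, since the text elides the rest of the class.

```cpp
#include <cassert>
#include <string>

template<class T> class Xhandle {
    T* rep;
    int no_of_accesses;
public:
    Xhandle(T* pp) : rep(pp), no_of_accesses(0) { }    // assumed constructor
    T* operator->() { no_of_accesses++; return rep; }
    int accesses() const { return no_of_accesses; }     // hypothetical accessor
};

int demo_accesses()
{
    std::string s("handle");
    Xhandle<std::string> h(&s);
    // Each use of -> goes through the handle, so each is counted.
    if (!h->empty() && h->size() == 6) h->append("!");
    return h.accesses();
}
```

Here the three member calls made through -> yield a count of three.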
Handles for which work needs to be done both before and after access require more elaborate programming. For example, one might want a set with locking while an insertion or a removal is
being done. Essentially, the representation class’ interface needs to be replicated in the handle
class:
    template<class T> class Set_controller {
        Set<T>* rep;
        Lock lock;
        // ...
    public:
        void insert(T* p) { Lock_ptr x(lock); rep->insert(p); }    // see §14.4.1
        void remove(T* p) { Lock_ptr x(lock); rep->remove(p); }

        int is_member(T* p) { return rep->is_member(p); }

        T get_first() { T* p = rep->first(); return p ? *p : T(); }
        T get_next() { T* p = rep->next(); return p ? *p : T(); }

        T first() { Lock_ptr x(lock); T tmp = *rep->first(); return tmp; }
        T next() { Lock_ptr x(lock); T tmp = *rep->next(); return tmp; }

        // ...
    };
Providing these forwarding functions is tedious (and therefore somewhat error-prone), although it is
neither difficult nor costly in run time.
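The forwarding style can be sketched with standard-library facilities that postdate this text. In the sketch below, std::mutex and std::lock_guard (C++11) stand in for Lock and Lock_ptr, std::set stands in for the Set representation, and Locked_set and demo_locked() are hypothetical names. Which operations need the lock depends on the guarantees of the representation; because std::set does not allow a read concurrent with a write, the read operations lock as well in this version.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>

template<class T> class Locked_set {
    std::set<T> rep;            // stand-in for the representation
    mutable std::mutex mtx;     // stand-in for Lock
public:
    // Each forwarding function acquires the lock before the call and
    // releases it afterwards, when the guard is destroyed.
    void insert(const T& v) { std::lock_guard<std::mutex> x(mtx); rep.insert(v); }
    void remove(const T& v) { std::lock_guard<std::mutex> x(mtx); rep.erase(v); }
    bool is_member(const T& v) const { std::lock_guard<std::mutex> x(mtx); return rep.count(v) != 0; }
    std::size_t size() const { std::lock_guard<std::mutex> x(mtx); return rep.size(); }
};

int demo_locked()
{
    Locked_set<int> s;
    s.insert(1); s.insert(2); s.insert(2); s.remove(1);
    return (s.is_member(2) && !s.is_member(1)) ? static_cast<int>(s.size()) : -1;
}
```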
Note that only some of the set functions required locking. In my experience, it is typical that a
class needing pre- and post-actions requires them for only some member functions. In the case of
locking, locking on all operations – as is done for monitors in some systems – leads to excess locking and sometimes causes a noticeable decrease in concurrency.
An advantage of the elaborate definition of all operations on the handle over the overloading of
-> style of handles is that it is possible to derive from class Set_controller. Unfortunately, some
of the benefits of being a handle are compromised if data members are added in the derived class.
In particular, the amount of code shared (in the handled class) decreases compared to the amount of
code written in each handle.
25.8 Application Frameworks [role.framework]
Components built out of the kinds of classes described in §25.2–§25.7 support design and reuse of
code by supplying building blocks and ways of combining them; the application builder designs a
framework into which these common building blocks are fitted. An alternative, and sometimes
more ambitious, approach to the support of design and reuse is to provide code that establishes a
common framework into which the application builder fits application-specific code as building
blocks. Such an approach is often called an application framework. The classes establishing such
a framework often have such fat interfaces that they are hardly types in the traditional sense. They
approximate the ideal of being complete applications, except that they don’t do anything. The specific actions are supplied by the application programmer.
As an example, consider a filter, that is, a program that reads an input stream, (maybe) performs
some actions based on that input, (maybe) produces an output stream, and (maybe) produces a final
result. A naive framework for such programs would provide a set of operations that an application
programmer might supply:
    class Filter {
    public:
        class Retry {
        public:
            virtual const char* message() { return 0; }
        };

        virtual void start() { }
        virtual int read() = 0;
        virtual void write() { }
        virtual void compute() { }
        virtual int result() = 0;

        virtual int retry(Retry& m) { cerr << m.message() << '\n'; return 2; }

        virtual ~Filter() { }
    };
Functions that a derived class must supply are declared pure virtual; other functions are simply
defined to do nothing.
The framework also provides a main loop and a rudimentary error-handling mechanism:
    int main_loop(Filter* p)
    {
        for(;;) {
            try {
                p->start();
                while (p->read()) {
                    p->compute();
                    p->write();
                }
                return p->result();
            }
            catch (Filter::Retry& m) {
                if (int i = p->retry(m)) return i;
            }
            catch (...) {
                cerr << "Fatal filter error\n";
                return 1;
            }
        }
    }
Finally, I could write my program like this:
    class My_filter : public Filter {
        istream& is;
        ostream& os;
        int nchar;
    public:
        int read() { char c; is.get(c); return is.good(); }
        void compute() { nchar++; }
        int result() { os << nchar << " characters read\n"; return 0; }

        My_filter(istream& ii, ostream& oo) : is(ii), os(oo), nchar(0) { }
    };
and activate it like this:
    int main()
    {
        My_filter f(cin,cout);
        return main_loop(&f);
    }
Naturally, for a framework to be of significant use, it must provide more structure and many more
services than this simple example does. In particular, a framework is typically a hierarchy of node
classes. Having the application programmer supply leaf classes in a deeply nested hierarchy allows
commonality between applications and reuse of services provided by such a hierarchy. A framework will also be supported by a library that provides classes that are useful for the application programmer when specifying the action classes.
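As a further illustration, here is a self-contained sketch in which a second application class is fitted into the same framework and run on in-memory streams. Upcase_filter and upcase() are hypothetical names, and result() is used here to report the character count rather than a status code:

```cpp
#include <cassert>
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>

class Filter {
public:
    class Retry {
    public:
        virtual const char* message() { return 0; }
    };
    virtual void start() { }
    virtual int read() = 0;
    virtual void write() { }
    virtual void compute() { }
    virtual int result() = 0;
    virtual int retry(Retry& m) { std::cerr << m.message() << '\n'; return 2; }
    virtual ~Filter() { }
};

int main_loop(Filter* p)
{
    for (;;) {
        try {
            p->start();
            while (p->read()) {
                p->compute();
                p->write();
            }
            return p->result();
        }
        catch (Filter::Retry& m) {
            if (int i = p->retry(m)) return i;
        }
        catch (...) {
            std::cerr << "Fatal filter error\n";
            return 1;
        }
    }
}

// Hypothetical application class: copies input to output in upper case.
class Upcase_filter : public Filter {
    std::istream& is;
    std::ostream& os;
    char c;
    int nchar;
public:
    Upcase_filter(std::istream& ii, std::ostream& oo) : is(ii), os(oo), c(0), nchar(0) { }
    int read() { is.get(c); return is.good(); }
    void compute() { c = static_cast<char>(std::toupper(static_cast<unsigned char>(c))); nchar++; }
    void write() { os.put(c); }
    int result() { return nchar; }      // assumption: report the count as the final result
};

std::string upcase(const std::string& text, int& count)
{
    std::istringstream in(text);
    std::ostringstream out;
    Upcase_filter f(in, out);
    count = main_loop(&f);
    return out.str();
}
```

Only the three overridden steps differ from My_filter; the control structure is supplied entirely by main_loop().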
25.9 Advice [role.advice]
[1] Make conscious decisions about how a class is to be used (both as a designer and as a user);
§25.1.
[2] Be aware of the tradeoffs involved among the different kinds of classes; §25.1.
[3] Use concrete types to represent simple independent concepts; §25.2.
[4] Use concrete types to represent concepts where close-to-optimal efficiency is essential; §25.2.
[5] Don’t derive from a concrete class; §25.2.
[6] Use abstract classes to represent interfaces where the representation of objects might change;
§25.3.
[7] Use abstract classes to represent interfaces where different representations of objects need to
coexist; §25.3.
[8] Use abstract classes to represent new interfaces to existing types; §25.3.
[9] Use node classes where similar concepts share significant implementation details; §25.4.
[10] Use node classes to incrementally augment an implementation; §25.4.
[11] Use Run-time Type Identification to obtain interfaces from an object; §25.4.1.
[12] Use classes to represent actions with associated state; §25.5.
[13] Use classes to represent actions that need to be stored, transmitted, or delayed; §25.5.
[14] Use interface classes to adapt a class for a new kind of use (without modifying the class);
§25.6.
[15] Use interface classes to add checking; §25.6.1.
[16] Use handles to avoid direct use of pointers and references; §25.7.
[17] Use handles to manage shared representations; §25.7.
[18] Use an application framework where an application domain allows for the control structure to
be predefined; §25.8.
25.10 Exercises [role.exercises]
1. (∗1) The Io template from §25.4.1 does not work for built-in types. Modify it so that it does.
2. (∗1.5) The Handle template from §25.7 does not reflect inheritance relationships of the classes
for which it is a handle. Modify it so that it does. That is, you should be able to assign a
Handle<Circle*> to a Handle<Shape*> but not the other way around.
3. (∗2.5) Given a String class, define another string class using it as the representation and providing its operations as virtual functions. Compare the performance of the two classes. Try to find
a meaningful class that is best implemented by publicly deriving from the string with virtual
functions.
4. (∗4) Study two widely used libraries. Classify the library classes in terms of concrete types,
abstract types, node classes, handle classes, and interface classes. Are abstract node classes and
concrete node classes used? Is there a more appropriate classification for the classes in these
libraries? Are fat interfaces used? What facilities – if any – are provided for run-time type
information? What is the memory-management strategy?
5. (∗2) Use the Filter framework (§25.8) to implement a program that removes adjacent duplicate
words from an input stream but otherwise copies the input to output.
6. (∗2) Use the Filter framework to implement a program that counts the frequency of words on
an input stream and produces a list of (word,count) pairs in frequency order as output.
7. (∗1.5) Write a Range template that takes both the range and the element type as template
parameters.
8. (∗1) Write a Range template that takes the range as constructor arguments.
9. (∗2) Write a simple string class that performs no error checking. Write another class that
checks access to the first. Discuss the pros and cons of separating basic function and checking
for errors.
10. (∗2.5) Implement the object I/O system from §25.4.1 for a few types, including at least integers,
strings, and a class hierarchy of your choice.
11. (∗2.5) Define a class Storable as an abstract base class with virtual functions write_out() and
read_in(). For simplicity, assume that a character string is sufficient to specify a permanent
storage location. Use class Storable to provide a facility for writing objects of classes derived
from Storable to disk, and for reading such objects from disk. Test it with a couple of classes
of your own choice.
12. (∗4) Define a base class Persistent with operations save() and no_save() that control
whether an object is written to permanent storage by a destructor. In addition to save() and
no_save(), what operations could Persistent usefully provide? Test class Persistent with a
couple of classes of your own choice. Is Persistent a node class, a concrete type, or an abstract
type? Why?
13. (∗3) Write a class Stack for which it is possible to change implementation at run time. Hint:
‘‘Every problem is solved by yet another indirection.’’
14. (∗3.5) Define a class Oper that holds an identifier of type Id (maybe a string or a C-style string)
and an operation (a pointer to function or some function object). Define a class Cat_object that
holds a list of Opers and a void*. Provide Cat_object with operations add_oper(Oper),
which adds an Oper to the list; remove_oper(Id), which removes an Oper identified by Id
from the list; and an operator()(Id,arg), which invokes the Oper identified by Id. Implement a stack of Cats by a Cat_object. Write a small program to exercise these classes.
15. (∗3) Define a template Object based on class Cat_object. Use Object to implement a stack of
Strings. Write a small program to exercise this template.
16. (∗2.5) Define a variant of class Object called Class that ensures that objects with identical operations share a list of operations. Write a small program to exercise this template.
17. (∗2) Define a Stack template that provides a conventional and type-safe interface to a stack
implemented by the Object template. Compare this stack to the stack classes found in the previous exercises. Write a small program to exercise this template.
18. (∗3) Write a class for representing operations to be shipped to another computer to execute
there. Test it either by actually sending commands to another machine or by writing commands
to a file and then executing the commands read from the file.
19. (∗2) Write a class for composing operations represented as function objects. Given two function objects f and g, Compose(f,g) should make an object that can be invoked with an argument x suitable for g and return f(g(x)), provided the return value of g() is an acceptable
argument type for f().
Appendices and Index
These Appendices provide the C++ grammar, a discussion of compatibility issues that
arise between C++ and C and between Standard C++ and prestandard versions of C++,
and a variety of language-technical details. The index is extensive and considered an
integral part of the book.
Chapters
    A   Grammar
    B   Compatibility
    C   Technicalities
    I   Index
Appendix A
Grammar
There is no worse danger for a teacher
than to teach words instead of things.
– Marc Bloch
Introduction — keywords — lexical conventions — programs — expressions — statements — declarations — declarators — classes — derived classes — special member
functions — overloading — templates — exception handling — preprocessing directives.
A.1 Introduction
This summary of C++ syntax is intended to be an aid to comprehension. It is not an exact statement
of the language. In particular, the grammar described here accepts a superset of valid C++ constructs. Disambiguation rules (§A.5, §A.7) must be applied to distinguish expressions from declarations. Moreover, access control, ambiguity, and type rules must be used to weed out syntactically
valid but meaningless constructs.
The C and C++ standard grammars express very minor distinctions syntactically rather than
through constraints. That gives precision, but it doesn’t always improve readability.
A.2 Keywords
New context-dependent keywords are introduced into a program by typedef (§4.9.7), namespace
(§8.2), class (Chapter 10), enumeration (§4.8), and template (Chapter 13) declarations.
typedef-name:
identifier
namespace-name:
original-namespace-name
namespace-alias
original-namespace-name:
identifier
namespace-alias:
identifier
class-name:
identifier
template-id
enum-name:
identifier
template-name:
identifier
Note that a typedef-name naming a class is also a class-name.
Unless an identifier is explicitly declared to name a type, it is assumed to name something that
is not a type (see §C.13.5).
The C++ keywords are:
C++ Keywords

    and         and_eq      asm           auto        bitand      bitor
    bool        break       case          catch       char        class
    compl       const       const_cast    continue    default     delete
    do          double      dynamic_cast  else        enum        explicit
    export      extern      false         float       for         friend
    goto        if          inline        int         long        mutable
    namespace   new         not           not_eq      operator    or
    or_eq       private     protected     public      register    reinterpret_cast
    return      short       signed        sizeof      static      static_cast
    struct      switch      template      this        throw       true
    try         typedef     typeid        typename    union       unsigned
    using       virtual     void          volatile    wchar_t     while
    xor         xor_eq
A.3 Lexical Conventions
The standard C and C++ grammars present lexical conventions as grammar productions. This adds
precision but also makes for large grammars and doesn’t always increase readability:
hex-quad:
hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
universal-character-name:
\u hex-quad
\U hex-quad hex-quad
preprocessing-token:
header-name
identifier
pp-number
character-literal
string-literal
preprocessing-op-or-punc
each non-white-space character that cannot be one of the above
token:
identifier
keyword
literal
operator
punctuator
header-name:
<h-char-sequence>
"q-char-sequence"
h-char-sequence:
h-char
h-char-sequence h-char
h-char:
any member of the source character set except new-line and >
q-char-sequence:
q-char
q-char-sequence q-char
q-char:
any member of the source character set except new-line and "
pp-number:
digit
. digit
pp-number digit
pp-number nondigit
pp-number e sign
pp-number E sign
pp-number .
identifier:
nondigit
identifier nondigit
identifier digit
nondigit: one of
universal-character-name
_ a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
digit: one of
0 1 2 3 4 5 6 7 8 9
preprocessing-op-or-punc: one of
{ } [ ] # ## ( )
<: :> <% %> %: %:%: ; : ...
new delete ? :: . .*
+ - * / % ^ & | ~
! = < > += -= *= /= %=
^= &= |= << >> >>= <<= == !=
<= >= && || ++ -- , ->* ->
and and_eq bitand bitor compl not not_eq
or or_eq xor xor_eq
literal:
integer-literal
character-literal
floating-literal
string-literal
boolean-literal
integer-literal:
decimal-literal integer-suffixopt
octal-literal integer-suffixopt
hexadecimal-literal integer-suffixopt
decimal-literal:
nonzero-digit
decimal-literal digit
octal-literal:
0
octal-literal octal-digit
hexadecimal-literal:
0x hexadecimal-digit
0X hexadecimal-digit
hexadecimal-literal hexadecimal-digit
nonzero-digit: one of
1 2 3 4 5 6 7 8 9
octal-digit: one of
0 1 2 3 4 5 6 7
hexadecimal-digit: one of
0 1 2 3 4 5 6 7 8 9
a b c d e f
A B C D E F
integer-suffix:
unsigned-suffix long-suffixopt
long-suffix unsigned-suffixopt
unsigned-suffix: one of
u U
long-suffix: one of
l L
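A few integer literals that exercise these productions may help. This is only an illustrative sketch; the constant names and the function integer_forms_agree() are hypothetical.

```cpp
#include <cassert>

const int zero = 0;                 // by the grammar, 0 on its own is an octal-literal
const int dec  = 255;               // decimal-literal: begins with a nonzero-digit
const int oct  = 0377;              // octal-literal: begins with 0
const int hex  = 0xFF;              // hexadecimal-literal: begins with 0x or 0X
const unsigned long ul = 255UL;     // unsigned-suffix followed by long-suffix
const unsigned long lu = 255lu;     // long-suffix followed by unsigned-suffix

bool integer_forms_agree()
{
    return zero == 0 && dec == oct && oct == hex && ul == lu
        && sizeof(255UL) == sizeof(unsigned long);   // the suffix determines the type
}
```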
character-literal:
'c-char-sequence'
L'c-char-sequence'
c-char-sequence:
c-char
c-char-sequence c-char
c-char:
any member of the source character set except the single-quote, backslash, or new-line character
escape-sequence
universal-character-name
escape-sequence:
simple-escape-sequence
octal-escape-sequence
hexadecimal-escape-sequence
simple-escape-sequence: one of
\' \" \? \\ \a \b \f \n \r \t \v
octal-escape-sequence:
\ octal-digit
\ octal-digit octal-digit
\ octal-digit octal-digit octal-digit
hexadecimal-escape-sequence:
\x hexadecimal-digit
hexadecimal-escape-sequence hexadecimal-digit
floating-literal:
fractional-constant exponent-partopt floating-suffixopt
digit-sequence exponent-part floating-suffixopt
fractional-constant:
digit-sequenceopt . digit-sequence
digit-sequence .
exponent-part:
e signopt digit-sequence
E signopt digit-sequence
sign: one of
+ -
digit-sequence:
digit
digit-sequence digit
floating-suffix: one of
f l F L
string-literal:
"s-char-sequenceopt"
L"s-char-sequenceopt"
s-char-sequence:
s-char
s-char-sequence s-char
s-char:
any member of the source character set except double-quote, backslash, or new-line
escape-sequence
universal-character-name
boolean-literal:
false
true
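The character, floating, string, and boolean literal productions can be illustrated by a short sketch. The function literal_checks() is hypothetical, and the escape-sequence comparison assumes an ASCII-compatible execution character set.

```cpp
#include <cassert>
#include <cstring>

bool literal_checks()
{
    // simple, octal, and hexadecimal escape sequences naming the same character
    // (assumption: ASCII-compatible execution character set)
    bool escapes = '\n' == '\012' && '\012' == '\x0a';

    // fractional-constant with optional exponent-part, or digit-sequence with exponent-part
    bool floats = 1.5 == 15e-1 && .5 == 5E-1 && 2. == 2.0;

    // floating-suffix selects the type: f or F for float, l or L for long double
    bool sizes = sizeof(1.0f) == sizeof(float) && sizeof(1.0L) == sizeof(long double);

    // an escaped double-quote inside a string-literal; the L prefix gives wchar_t
    bool strings = std::strlen("a\"b") == 3 && sizeof(L'x') == sizeof(wchar_t);

    // boolean-literal
    return escapes && floats && sizes && strings && true != false;
}
```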
A.4 Programs
A program is a collection of translation-units combined through linking (§9.4). A translation-unit,
often called a source file, is a sequence of declarations:
translation-unit:
declaration-seqopt
A.5 Expressions
See §6.2.
primary-expression:
literal
this
:: identifier
:: operator-function-id
:: qualified-id
( expression )
id-expression
id-expression:
unqualified-id
qualified-id
unqualified-id:
identifier
operator-function-id
conversion-function-id
~ class-name
template-id
qualified-id:
nested-name-specifier templateopt unqualified-id
nested-name-specifier:
class-or-namespace-name :: nested-name-specifieropt
class-or-namespace-name :: template nested-name-specifier
class-or-namespace-name:
class-name
namespace-name
postfix-expression:
primary-expression
postfix-expression [ expression ]
postfix-expression ( expression-listopt )
simple-type-specifier ( expression-listopt )
typename ::opt nested-name-specifier identifier ( expression-listopt )
typename ::opt nested-name-specifier templateopt template-id ( expression-listopt )
postfix-expression . templateopt ::opt id-expression
postfix-expression -> templateopt ::opt id-expression
postfix-expression . pseudo-destructor-name
postfix-expression -> pseudo-destructor-name
postfix-expression ++
postfix-expression --
dynamic_cast < type-id > ( expression )
static_cast < type-id > ( expression )
reinterpret_cast < type-id > ( expression )
const_cast < type-id > ( expression )
typeid ( expression )
typeid ( type-id )
expression-list:
assignment-expression
expression-list , assignment-expression
pseudo-destructor-name:
::opt nested-name-specifieropt type-name :: ~ type-name
::opt nested-name-specifier template template-id :: ~ type-name
::opt nested-name-specifieropt ~ type-name
unary-expression:
postfix-expression
++ cast-expression
-- cast-expression
unary-operator cast-expression
sizeof unary-expression
sizeof ( type-id )
new-expression
delete-expression
unary-operator: one of
* & + - ! ~
new-expression:
::opt new new-placementopt new-type-id new-initializeropt
::opt new new-placementopt ( type-id ) new-initializeropt
new-placement:
( expression-list )
new-type-id:
type-specifier-seq new-declaratoropt
new-declarator:
ptr-operator new-declaratoropt
direct-new-declarator
direct-new-declarator:
[ expression ]
direct-new-declarator [ constant-expression ]
new-initializer:
( expression-listopt )
delete-expression:
::opt delete cast-expression
::opt delete [ ] cast-expression
cast-expression:
unary-expression
( type-id ) cast-expression
pm-expression:
cast-expression
pm-expression .* cast-expression
pm-expression ->* cast-expression
multiplicative-expression:
pm-expression
multiplicative-expression * pm-expression
multiplicative-expression / pm-expression
multiplicative-expression % pm-expression
additive-expression:
multiplicative-expression
additive-expression + multiplicative-expression
additive-expression - multiplicative-expression
shift-expression:
additive-expression
shift-expression << additive-expression
shift-expression >> additive-expression
relational-expression:
shift-expression
relational-expression < shift-expression
relational-expression > shift-expression
relational-expression <= shift-expression
relational-expression >= shift-expression
equality-expression:
relational-expression
equality-expression == relational-expression
equality-expression != relational-expression
and-expression:
equality-expression
and-expression & equality-expression
exclusive-or-expression:
and-expression
exclusive-or-expression ^ and-expression
inclusive-or-expression:
exclusive-or-expression
inclusive-or-expression | exclusive-or-expression
logical-and-expression:
inclusive-or-expression
logical-and-expression && inclusive-or-expression
logical-or-expression:
logical-and-expression
logical-or-expression || logical-and-expression
conditional-expression:
logical-or-expression
logical-or-expression ? expression : assignment-expression
assignment-expression:
conditional-expression
logical-or-expression assignment-operator assignment-expression
throw-expression
assignment-operator: one of
= *= /= %= += -= >>= <<= &= ^= |=
expression:
assignment-expression
expression , assignment-expression
constant-expression:
conditional-expression
Grammar ambiguities arise from the similarity between function style casts and declarations. For
example:
    int x;

    void f()
    {
        char(x);    // conversion of x to char or declaration of a char called x?
    }
All such ambiguities are resolved to declarations. That is, ‘‘if it could possibly be interpreted as a
declaration, it is a declaration.’’ For example:
    T(a)->m;            // expression statement
    T(a)++;             // expression statement

    T(*e)(int(3));      // declaration
    T(f)[4];            // declaration

    T(a);               // declaration
    T(a)=m;             // declaration
    T(*b)();            // declaration
    T(x),y,z=7;         // declaration
This disambiguation is purely syntactic. The only information used for a name is whether it is
known to be a name of a type or a name of a template. If that cannot be determined, the name is
assumed to name something that isn’t a template or a type.
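The rule can be observed in a small program. The following is a minimal sketch, assuming a hypothetical class T and a namespace-scope variable a (neither is from the text): because T(a); can be read as a declaration, it declares a new local a rather than converting the existing one.

```cpp
struct T {                       // hypothetical type, for illustration only
    int v;
    T() : v(0) {}
    explicit T(int x) : v(x) {}
};

int a = 7;   // namespace-scope a; hidden inside demo() below

int demo()
{
    T(a);        // declaration: same as "T a;", a local default-constructed T
    return a.v;  // refers to the local T, not the namespace-scope int
}
```

Had the statement been an expression, it would have constructed a temporary T from the int a and discarded it; the declaration reading silently introduces a different object.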
The construct template unqualified-id is used to state that the unqualified-id is the name of a
template in a context in which that cannot be deduced (see §C.13.5).
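As an illustration, here is a minimal sketch assuming a hypothetical class Pool with a member template get (both names are invented for this example). Inside int_slot() the type of p depends on a template parameter, so the compiler cannot tell that get names a template unless told so.

```cpp
struct Pool {                        // hypothetical class with a member template
    template<class T> T* get() { static T t = T(); return &t; }
};

template<class P>
int* int_slot(P& p)
{
    // Without "template", the < after get would be parsed as less-than,
    // because get is a name that depends on the template parameter P.
    return p.template get<int>();
}
```

A call such as int_slot(pool) for a Pool object then yields a pointer to the int held by the pool.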
A.6 Statements
See §6.3.
statement:
labeled-statement
expression-statement
compound-statement
selection-statement
iteration-statement
jump-statement
declaration-statement
try-block
labeled-statement:
identifier : statement
case constant-expression : statement
default : statement
expression-statement:
expressionopt ;
compound-statement:
{ statement-seqopt }
statement-seq:
statement
statement-seq statement
selection-statement:
if ( condition ) statement
if ( condition ) statement else statement
switch ( condition ) statement
condition:
expression
type-specifier-seq declarator = assignment-expression
iteration-statement:
while ( condition ) statement
do statement while ( expression ) ;
for ( for-init-statement conditionopt ; expressionopt ) statement
for-init-statement:
expression-statement
simple-declaration
jump-statement:
break ;
continue ;
return expressionopt ;
goto identifier ;
declaration-statement:
block-declaration
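The second alternative of condition allows a declaration where a test is expected. A minimal sketch, assuming a hypothetical helper named index_of (not from the text), shows the declared name being tested and then used inside the controlled statement:

```cpp
#include <cstring>

// Returns the index of c in s, or -1 if c does not occur.
int index_of(const char* s, char c)
{
    if (const char* p = std::strchr(s, c))  // condition declares and tests p
        return static_cast<int>(p - s);     // p is in scope here
    return -1;                              // p is no longer in scope here
}
```

The scope of p is limited to the if-statement, so the name cannot be used by accident after the test has failed.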
A.7 Declarations
The structure of declarations is described in Chapter 4, enumerations in §4.8, pointers and arrays in
Chapter 5, functions in Chapter 7, namespaces in §8.2, linkage directives in §9.2.4, and storage
classes in §10.4.
declaration-seq:
declaration
declaration-seq declaration
declaration:
block-declaration
function-definition
template-declaration
explicit-instantiation
explicit-specialization
linkage-specification
namespace-definition
block-declaration:
simple-declaration
asm-definition
namespace-alias-definition
using-declaration
using-directive
simple-declaration:
decl-specifier-seqopt init-declarator-listopt ;
decl-specifier:
storage-class-specifier
type-specifier
function-specifier
friend
typedef
decl-specifier-seq:
decl-specifier-seqopt decl-specifier
storage-class-specifier:
auto
register
static
extern
mutable
function-specifier:
inline
virtual
explicit
typedef-name:
identifier
type-specifier:
simple-type-specifier
class-specifier
enum-specifier
elaborated-type-specifier
cv-qualifier
simple-type-specifier:
::opt nested-name-specifieropt type-name
::opt nested-name-specifier templateopt template-id
char
wchar_t
bool
short
int
long
signed
unsigned
float
double
void
type-name:
class-name
enum-name
typedef-name
elaborated-type-specifier:
class-key ::opt nested-name-specifieropt identifier
enum ::opt nested-name-specifieropt identifier
typename ::opt nested-name-specifier identifier
typename ::opt nested-name-specifier templateopt template-id
enum-name:
identifier
enum-specifier:
enum identifieropt { enumerator-listopt }
enumerator-list:
enumerator-definition
enumerator-list , enumerator-definition
enumerator-definition:
enumerator
enumerator = constant-expression
enumerator:
identifier
namespace-name:
original-namespace-name
namespace-alias
original-namespace-name:
identifier
namespace-definition:
named-namespace-definition
unnamed-namespace-definition
named-namespace-definition:
original-namespace-definition
extension-namespace-definition
original-namespace-definition:
namespace identifier { namespace-body }
extension-namespace-definition:
namespace original-namespace-name { namespace-body }
unnamed-namespace-definition:
namespace { namespace-body }
namespace-body:
declaration-seqopt
namespace-alias:
identifier
namespace-alias-definition:
namespace identifier = qualified-namespace-specifier ;
qualified-namespace-specifier:
::opt nested-name-specifieropt namespace-name
using-declaration:
using typenameopt ::opt nested-name-specifier unqualified-id ;
using :: unqualified-id ;
using-directive:
using namespace ::opt nested-name-specifieropt namespace-name ;
asm-definition:
asm ( string-literal ) ;
linkage-specification:
extern string-literal { declaration-seqopt }
extern string-literal declaration
The grammar allows for arbitrary nesting of declarations. However, some semantic restrictions
apply. For example, nested functions (functions defined local to other functions) are not allowed.
The list of specifiers that starts a declaration cannot be empty (there is no ‘‘implicit int;’’ §B.2)
and consists of the longest possible sequence of specifiers. For example:
typedef int I;
void f(unsigned I) { /* ... */ }
Here, f() takes an unnamed unsigned int.
An asm() is an assembly code insert. Its meaning is implementation-defined, but the intent is
for the string to be a piece of assembly code that will be inserted into the generated code at the
place where it is specified.
Declaring a variable register is a hint to the compiler to optimize for frequent access; doing so
is redundant with most modern compilers.
A.7.1 Declarators
See §4.9.1, Chapter 5 (pointers and arrays), §7.7 (pointers to functions), and §15.5 (pointers to
members).
init-declarator-list:
init-declarator
init-declarator-list , init-declarator
init-declarator:
declarator initializeropt
declarator:
direct-declarator
ptr-operator declarator
direct-declarator:
declarator-id
direct-declarator ( parameter-declaration-clause ) cv-qualifier-seqopt exception-specificationopt
direct-declarator [ constant-expressionopt ]
( declarator )
ptr-operator:
* cv-qualifier-seqopt
&
::opt nested-name-specifier * cv-qualifier-seqopt
cv-qualifier-seq:
cv-qualifier cv-qualifier-seqopt
cv-qualifier:
const
volatile
declarator-id:
::opt id-expression
::opt nested-name-specifieropt type-name
type-id:
type-specifier-seq abstract-declaratoropt
type-specifier-seq:
type-specifier type-specifier-seqopt
abstract-declarator:
ptr-operator abstract-declaratoropt
direct-abstract-declarator
direct-abstract-declarator:
direct-abstract-declaratoropt ( parameter-declaration-clause ) cv-qualifier-seqopt exception-specificationopt
direct-abstract-declaratoropt [ constant-expressionopt ]
( abstract-declarator )
parameter-declaration-clause:
parameter-declaration-listopt ...opt
parameter-declaration-list , ...
parameter-declaration-list:
parameter-declaration
parameter-declaration-list , parameter-declaration
parameter-declaration:
decl-specifier-seq declarator
decl-specifier-seq declarator = assignment-expression
decl-specifier-seq abstract-declaratoropt
decl-specifier-seq abstract-declaratoropt = assignment-expression
function-definition:
decl-specifier-seqopt declarator ctor-initializeropt function-body
decl-specifier-seqopt declarator function-try-block
function-body:
compound-statement
initializer:
= initializer-clause
( expression-list )
initializer-clause:
assignment-expression
{ initializer-list ,opt }
{ }
initializer-list:
initializer-clause
initializer-list , initializer-clause
A volatile specifier is a hint to a compiler that an object may change its value in ways not specified
by the language so that aggressive optimizations must be avoided. For example, a real time clock
might be declared:
extern const volatile int clock;
Two successive reads of clock might give different results.
A.8 Classes
See Chapter 10.
class-name:
identifier
template-id
class-specifier:
class-head { member-specificationopt }
class-head:
class-key identifieropt base-clauseopt
class-key nested-name-specifier identifier base-clauseopt
class-key nested-name-specifier template template-id base-clauseopt
class-key:
class
struct
union
member-specification:
member-declaration member-specificationopt
access-specifier : member-specificationopt
member-declaration:
decl-specifier-seqopt member-declarator-listopt ;
function-definition ;opt
::opt nested-name-specifier templateopt unqualified-id ;
using-declaration
template-declaration
member-declarator-list:
member-declarator
member-declarator-list , member-declarator
member-declarator:
declarator pure-specifieropt
declarator constant-initializeropt
identifieropt : constant-expression
pure-specifier:
= 0
constant-initializer:
= constant-expression
To preserve C compatibility, a class and a non-class of the same name can be declared in the same
scope (§5.7). For example:
struct stat { /* ... */ };
int stat(char* name, struct stat* buf);
In this case, the plain name (stat) is the name of the non-class. The class must be referred to using
a class-key prefix.
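The following minimal sketch shows both names in use. It assumes a hypothetical pair called info (chosen instead of stat so that the example cannot clash with system headers) and gives the function a placeholder body.

```cpp
struct info { int size; };                      // class named info

int info(const char* name, struct info* buf)    // function named info, same scope
{
    buf->size = name ? 1 : 0;   // placeholder body for illustration
    return 0;
}

int use()
{
    struct info s;                  // class-key prefix needed to name the class
    int status = info("file", &s);  // plain name: the function
    return status + s.size;
}
```

Writing info s; without the struct prefix would be an error here, because the plain name refers to the function.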
Constant expressions are defined in §C.5.
A.8.1 Derived Classes
See Chapter 12 and Chapter 15.
base-clause:
: base-specifier-list
base-specifier-list:
base-specifier
base-specifier-list , base-specifier
base-specifier:
::opt nested-name-specifieropt class-name
virtual access-specifieropt ::opt nested-name-specifieropt class-name
access-specifier virtualopt ::opt nested-name-specifieropt class-name
access-specifier:
private
protected
public
A.8.2 Special Member Functions
See §11.4 (conversion operators), §10.4.6 (class member initialization), and §12.2.2 (base initialization).
conversion-function-id:
operator conversion-type-id
conversion-type-id:
type-specifier-seq conversion-declaratoropt
conversion-declarator:
ptr-operator conversion-declaratoropt
ctor-initializer:
: mem-initializer-list
mem-initializer-list:
mem-initializer
mem-initializer , mem-initializer-list
mem-initializer:
mem-initializer-id ( expression-listopt )
mem-initializer-id:
::opt nested-name-specifieropt class-name
identifier
A.8.3 Overloading
See Chapter 11.
operator-function-id:
operator operator
operator: one of
new delete new[] delete[]
+ - * / % ^ & | ~
! = < > += -= *= /= %=
^= &= |= << >> >>= <<= == !=
<= >= && || ++ -- , ->* ->
() []
A.9 Templates
Templates are explained in Chapter 13 and §C.13.
template-declaration:
exportopt template < template-parameter-list > declaration
template-parameter-list:
template-parameter
template-parameter-list , template-parameter
template-parameter:
type-parameter
parameter-declaration
type-parameter:
class identifieropt
class identifieropt = type-id
typename identifieropt
typename identifieropt = type-id
template < template-parameter-list > class identifieropt
template < template-parameter-list > class identifieropt = template-name
template-id:
template-name < template-argument-listopt >
template-name:
identifier
template-argument-list:
template-argument
template-argument-list , template-argument
template-argument:
assignment-expression
type-id
template-name
explicit-instantiation:
template declaration
explicit-specialization:
template < > declaration
The explicit template argument specification opens up the possibility of an obscure syntactic ambiguity. Consider:
void h()
{
    f<1>(0); // ambiguity: ((f)<1) > (0) or (f<1>)(0) ?
             // resolution: f<1> is called with argument 0
}
The resolution is simple and effective: if f is a template name, f< is the beginning of a qualified
template name and the subsequent tokens must be interpreted based on that; otherwise, < means
less-than. Similarly, the first non-nested > terminates a template argument list. If a greater-than is
needed, parentheses must be used:
f< a>b >(0); // syntax error
f< (a>b) >(0); // ok
A similar lexical ambiguity can occur when terminating >s get too close. For example:
list<vector<int>> lv1; // syntax error: unexpected >> (right shift)
list< vector<int> > lv2; // correct: list of vectors
Note the space between the two >s; >> is the right-shift operator. That can be a real nuisance.
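Both rules can be tried in a complete fragment. This is a minimal sketch, assuming a hypothetical function template pick with a bool parameter and two constants a and b (none of these names come from the text):

```cpp
#include <list>
#include <vector>

template<bool B>
int pick(int x) { return B ? x : -x; }   // hypothetical template for illustration

const int a = 2;
const int b = 1;

int demo()
{
    // pick< a>b >(5) would end the argument list at the first >.
    return pick< (a>b) >(5);             // parenthesized: the argument is true
}

std::list< std::vector<int> > lv;        // space keeps > > from lexing as >>
```

Revisions of the language from C++11 onward accept >> as the close of two nested template argument lists, but the parentheses around a greater-than expression are still needed.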
A.10 Exception Handling
See §8.3 and Chapter 14.
try-block:
try compound-statement handler-seq
function-try-block:
try ctor-initializeropt function-body handler-seq
handler-seq:
handler handler-seqopt
handler:
catch ( exception-declaration ) compound-statement
exception-declaration:
type-specifier-seq declarator
type-specifier-seq abstract-declarator
type-specifier-seq
...
throw-expression:
throw assignment-expressionopt
exception-specification:
throw ( type-id-listopt )
type-id-list:
type-id
type-id-list , type-id
A.11 Preprocessing Directives
The preprocessor is a relatively unsophisticated macro processor that works primarily on lexical
tokens rather than individual characters. In addition to the ability to define and use macros (§7.8),
the preprocessor provides mechanisms for including text files and standard headers (§9.2.1) and
conditional compilation based on macros (§9.3.3). For example:
#if OPT==4
#include "header4.h"
#elif 0<OPT
#include "someheader.h"
#else
#include<cstdlib>
#endif
All preprocessor directives start with a #, which must be the first non-whitespace character on its
line.
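A self-contained variant of the example may help when experimenting with the selection logic. It is only a sketch: OPT is given an assumed value of 2 here (normally the build would supply it), and each branch defines a hypothetical constant named selected instead of including a header.

```cpp
#define OPT 2            // assumed setting; normally supplied by the build

#if OPT==4
const char* const selected = "header4.h";
#elif 0<OPT
const char* const selected = "someheader.h";
#else
const char* const selected = "<cstdlib>";
#endif
```

With OPT defined as 2, the first test fails and the #elif branch is the one compiled; the other two definitions never reach the compiler proper.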
preprocessing-file:
groupopt
group:
group-part
group group-part
group-part:
pp-tokensopt new-line
if-section
control-line
if-section:
if-group elif-groupsopt else-groupopt endif-line
if-group:
# if constant-expression new-line groupopt
# ifdef identifier new-line groupopt
# ifndef identifier new-line groupopt
elif-groups:
elif-group
elif-groups elif-group
elif-group:
# elif constant-expression new-line groupopt
else-group:
# else new-line groupopt
endif-line:
# endif new-line
control-line:
# include pp-tokens new-line
# define identifier replacement-list new-line
# define identifier lparen identifier-listopt ) replacement-list new-line
# undef identifier new-line
# line pp-tokens new-line
# error pp-tokensopt new-line
# pragma pp-tokensopt new-line
# new-line
lparen:
the left-parenthesis character without preceding white-space
replacement-list:
pp-tokensopt
pp-tokens:
preprocessing-token
pp-tokens preprocessing-token
new-line:
the new-line character
Appendix B
Compatibility
You go ahead and follow your customs,
and I'll follow mine.
– C. Napier
C/C++ compatibility — silent differences between C and C++ — C code that is not C++
— deprecated features — C++ code that is not C — coping with older C++ implementations — headers — the standard library — namespaces — allocation errors — templates
— for-statement initializers — advice — exercises.
B.1 Introduction
This appendix discusses the incompatibilities between C and C++ and between Standard C++ (as
defined by ISO/IEC 14882) and earlier versions of C++. The purpose is to document differences
that can cause problems for the programmer and point to ways of dealing with such problems.
Most compatibility problems surface when people try to upgrade a C program to a C++ program,
try to port a C++ program from one pre-standard version of C++ to another, or try to compile C++
using modern features with an older compiler. The aim here is not to drown you in the details of
every compatibility problem that ever surfaced in an implementation, but rather to list the most frequently occurring problems and present their standard solutions.
When you look at compatibility issues, a key question to consider is the range of implementations under which a program needs to work. For learning C++, it makes sense to use the most complete and helpful implementation. For delivering a product, a more conservative strategy might be
in order to maximize the number of systems on which the product can run. In the past, this has
been a reason (and sometimes just an excuse) to avoid C++ features deemed novel. However,
implementations are converging, so the need for portability across platforms is less cause for
extreme caution than it was a couple of years ago.
The C++ Programming Language, Special Edition by Bjarne Stroustrup. Copyright 2000 by AT&T.
Published by Addison Wesley Inc. ISBN 0-201-70073-5. All rights reserved.
B.2 C/C++ Compatibility
With minor exceptions, C++ is a superset of C (meaning C89, defined by ISO/IEC 9899:1990).
Most differences stem from C++’s greater emphasis on type checking. Well-written C programs
tend to be C++ programs as well. A compiler can diagnose every difference between C++ and C.
B.2.1 ‘‘Silent’’ Differences
With a few exceptions, programs that are both C++ and C have the same meaning in both languages. Fortunately, these ‘‘silent differences’’ are rather obscure:
In C, the size of a character constant and of an enumeration equals sizeof(int). In C++,
sizeof('a') equals sizeof(char), and a C++ implementation is allowed to choose whatever size is
most appropriate for an enumeration (§4.8).
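The C++ side of this difference can be checked directly. A minimal sketch, assuming a hypothetical enumeration Small and two constants introduced only for this example:

```cpp
#include <cstddef>

enum Small { lo, hi };   // hypothetical enumeration

const std::size_t char_literal_size = sizeof('a');   // sizeof(char) in C++
const std::size_t enum_size = sizeof(Small);         // implementation's choice
```

Compiled as C++, char_literal_size is 1; the corresponding C expression would give sizeof(int). The value of enum_size is left to the implementation, so portable code should not assume it.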
C++ provides the // comments; C does not (although many C implementations provide them as
an extension). This difference can be used to construct programs that behave differently in the two
languages. For example:
int f(int a, int b)
{
    return a //* pretty unlikely */ b
    ; /* unrealistic: semicolon on separate line to avoid syntax error */
}
C99 (meaning C as defined by ISO/IEC 9899:1999(E)), also provides //.
A structure name declared in an inner scope can hide the name of an object, function, enumerator, or type in an outer scope. For example:
int x[99];
void f()
{
    struct x { int a; };
    sizeof(x); /* size of the array in C, size of the struct in C++ */
}
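The C++ reading of the example can be made observable by returning the sizes. In this minimal sketch the function names inner_size and outer_size are assumptions added for illustration:

```cpp
#include <cstddef>

int x[99];

std::size_t inner_size()
{
    struct x { int a; };
    return sizeof(x);      // in C++, x names the local struct here
}

std::size_t outer_size() { return sizeof(x); }   // the array: 99 * sizeof(int)
```

In C++ inner_size() reports the size of the one-member struct, which is far smaller than the array measured by outer_size().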
B.2.2 C Code That Is Not C++
The C/C++ incompatibilities that cause most real problems are not subtle. Most are easily caught
by compilers. This section gives examples of C code that is not C++. Most are deemed poor style
or even obsolete in modern C.
In C, most functions can be called without a previous declaration. For example:
main() /* poor style C. Not C++ */
{
    double sq2 = sqrt(2); /* call undeclared function */
    printf("the square root of 2 is %g\n",sq2); /* call undeclared function */
}
Complete and consistent use of function declarations (function prototypes) is generally recommended for C. Where that sensible advice is followed, and especially where C compilers provide
options to enforce it, C code conforms to the C++ rule. Where undeclared functions are called, you
have to know the functions and the rules for C pretty well to know whether you have made a mistake or introduced a portability problem. For example, the previous main() contains at least two
errors as a C program.
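For comparison, here is a sketch of the same computation with every function declared before use. The wrapper name print_root_of_two is an assumption made for this example; the declarations come from the standard headers <cmath> and <cstdio>.

```cpp
#include <cmath>
#include <cstdio>

double print_root_of_two()
{
    double sq2 = std::sqrt(2.0);                       // declared in <cmath>
    std::printf("the square root of 2 is %g\n", sq2);  // declared in <cstdio>
    return sq2;
}
```

With the declarations visible, the compiler checks the argument and return types of both calls instead of guessing them.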
In C, a function declared without specifying any argument types can take any number of arguments of any type at all. Such use is deemed obsolescent in Standard C, but it is not uncommon:
void f(); /* argument types not mentioned */
void g()
{
    f(2); /* poor style C. Not C++ */
}
In C, functions can be defined using a syntax that optionally specifies argument types after the list
of arguments:
void f(a,p,c) char *p; char c; { /* ... */ } /* C. Not C++ */
Such definitions must be rewritten:
void f(int a, char* p, char c) { /* ... */ }
In C and in pre-standard versions of C++, the type specifier defaults to int. For example:
const a = 7; /* In C, type int assumed. Not C++ */
C99 disallows ‘‘implicit int,’’ just as in C++.
C allows the definition of structs in return type and argument type declarations. For example:
struct S { int x,y; } f(); /* C. Not C++ */
void g(struct S { int x,y; } y); /* C. Not C++ */
The C++ rules for defining types make such declarations useless, and they are not allowed.
In C, integers can be assigned to variables of enumeration type:
enum Direction { up, down };
enum Direction d = 1; /* error: int assigned to Direction; ok in C */
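In C++ the assignment has to be written with an enumerator or with an explicit conversion. A minimal sketch, reusing the Direction enumeration from the example and introducing the hypothetical variables d1, d2, and i:

```cpp
enum Direction { up, down };

Direction d1 = down;                        // preferred: use the enumerator
Direction d2 = static_cast<Direction>(1);   // explicit conversion from int
int i = d2;                                 // enum to int remains implicit
```

The cast makes the conversion visible; it is only meaningful when the integer is within the range of the enumeration.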
C++ provides many more keywords than C does. If one of these appears as an identifier in a C program, that program must be modified to make it a C++ program:
C++ Keywords That Are Not C Keywords
and        and_eq   asm        bitand   bitor             bool
catch      class    compl      const_cast  delete         dynamic_cast
explicit   export   false      friend   inline            mutable
namespace  new      not        not_eq   operator          or
or_eq      private  protected  public   reinterpret_cast  static_cast
template   this     throw      true     try               typeid
typename   using    virtual    wchar_t  xor               xor_eq
In C, some of the C++ keywords are macros defined in standard headers:
C++ Keywords That Are C Macros
and     and_eq  bitand  bitor  bool     compl  false  not
not_eq  or      or_eq   true   wchar_t  xor    xor_eq
This implies that in C they can be tested using #ifdef, redefined, etc.
In C, a global data object may be declared several times in a single translation unit without
using the extern specifier. As long as at most one such declaration provides an initializer, the
object is considered defined only once. For example:
int i; int i; /* defines or declares a single integer ‘i’; not C++ */
In C++, an entity must be defined exactly once; §9.2.3.
In C++, a class may not have the same name as a typedef declared to refer to a different type in
the same scope; §5.7.
In C, a void* may be used as the right-hand operand of an assignment to or initialization of a
variable of any pointer type; in C++ it may not (§5.6). For example:
void f(int n)
{
    int* p = malloc(n*sizeof(int)); /* not C++. In C++, allocate using ‘new’ */
}
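Two C++ renderings of the allocation are sketched below. The function names c_style_alloc and cpp_style_alloc are assumptions for this example; the first keeps malloc() and adds the cast that C++ requires, the second uses new as the comment suggests.

```cpp
#include <cstdlib>

int* c_style_alloc(int n)
{
    // void* does not convert implicitly to int* in C++; a cast is required
    return static_cast<int*>(std::malloc(n * sizeof(int)));
}

int* cpp_style_alloc(int n)
{
    return new int[n];   // typed allocation; release with delete[]
}
```

Memory from c_style_alloc() must be released with free(), and memory from cpp_style_alloc() with delete[]; the two mechanisms should not be mixed.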
C allows transfer of control to a labeled-statement (§A.6) to bypass an initialization; C++ does not.
In C, a global const by default has external linkage; in C++ it does not and must be initialized,
unless explicitly declared extern (§5.4).
In C, names of nested structures are placed in the same scope as the structure in which they are
nested. For example:
struct S {
    struct T { /* ... */ };
    // ...
};
struct T x; /* ok in C meaning ‘S::T x;’. Not C++ */
In C, an array can be initialized by an initializer that has more elements than the array requires. For
example:
char v[5] = "Oscar"; /* ok in C, the terminating 0 is not used. Not C++ */
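In C++ the array must have room for the terminating zero. A minimal sketch of the two usual ways of writing the declaration (the second array, w, is added here for illustration):

```cpp
char v[] = "Oscar";     // size deduced: 5 characters plus the terminating 0
char w[6] = "Oscar";    // an explicit size must leave room for the 0
```

Letting the compiler deduce the size avoids the miscount altogether.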
B.2.3 Deprecated Features
By deprecating a feature, the standards committee expresses the wish that the feature would go
away. However, the committee does not have a mandate to remove a heavily used feature – however redundant or dangerous it may be. Thus, a deprecation is a strong hint to the users to avoid the
feature.
The keyword static, which usually means ‘‘statically allocated,’’ can be used to indicate that a
function or an object is local to a translation unit. For example:
// file1:
static int glob;
// file2:
static int glob;
This program genuinely has two integers called glob. Each glob is used exclusively by functions
defined in its translation unit.
The use of static to indicate ‘‘local to translation unit’’ is deprecated in C++. Use unnamed
namespaces instead (§8.2.5.1).
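A minimal sketch of the recommended alternative for one translation unit follows. The helper functions next_id and use_glob are assumptions added so that the example has something to call:

```cpp
namespace {
    int glob = 0;                      // visible only in this translation unit
    int next_id() { return ++glob; }   // likewise local to this file
}

int use_glob() { return next_id(); }   // ordinary function using the local names
```

Another file can define its own glob in its own unnamed namespace without any conflict, just as with the static version.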
The implicit conversion of a string literal to a (non-const) char* is deprecated. Use named
arrays of char or avoid assignment of string literals to char*s (§5.2.2).
C-style casts should have been deprecated when the new-style casts were introduced. Programmers should seriously consider banning C-style casts from their own programs. Where explicit
type conversion is necessary, static_cast, reinterpret_cast, const_cast, or a combination of these
can do what a C-style cast can. The new-style casts should be preferred because they are more
explicit and more visible (§6.2.7).
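The three named casts can be seen side by side in a short sketch. The functions ratio, first_byte, and clear are hypothetical and exist only to give each cast a plausible use:

```cpp
double ratio(int num, int den)
{
    return static_cast<double>(num) / den;   // value conversion
}

unsigned char first_byte(const int* p)
{
    // reinterprets the object representation; which byte comes first
    // depends on the implementation
    return *reinterpret_cast<const unsigned char*>(p);
}

void clear(const int* p)
{
    *const_cast<int*>(p) = 0;   // valid only if the object is not really const
}
```

Each cast states which kind of conversion is intended, so a search for reinterpret_cast or const_cast finds exactly the risky spots, which a C-style (T)e cast would hide.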
B.2.4 C++ Code That Is Not C
This section lists facilities offered by C++ but not by C. The features are sorted by purpose. However, many classifications are possible and most features serve multiple purposes, so this classification should not be taken too seriously.
– Features primarily for notational convenience:
[1] // comments (§2.3); added to C99
[2] Support for restricted character sets (§C.3.1); partially added to C99
[3] Support for extended character sets (§C.3.3); added to C99
[4] Non-constant initializers for objects in static storage (§9.4.1)
[5] const in constant expressions (§5.4, §C.5)
[6] Declarations as statements (§6.3.1); added to C99
[7] Declarations in for-statement initializers (§6.3.3); added to C99
[8] Declarations in conditions (§6.3.2.1)
[9] Structure names need not be prefixed by struct (§5.7)
– Features primarily for strengthening the type system:
[1] Function argument type checking (§7.1); later added to C (§B.2.2)
[2] Type-safe linkage (§9.2, §9.2.3)
[3] Free store management using new and delete (§6.2.6, §10.4.5, §15.6)
[4] const (§5.4, §5.4.1); later added to C
[5] The Boolean type bool (§4.2); partially added to C99
[6] New cast syntax (§6.2.7)
– Facilities for user-defined types:
[1] Classes (Chapter 10)
[2] Member functions (§10.2.1) and member classes (§11.12)
[3] Constructors and destructors (§10.2.3, §10.4.1)
[4] Derived classes (Chapter 12, Chapter 15)
[5] virtual functions and abstract classes (§12.2.6, §12.3)
[6] Public/protected/private access control (§10.2.2, §15.3, §C.11)
[7] friends (§11.5)
[8] Pointers to members (§15.5, §C.12)
[9] static members (§10.2.4)
[10] mutable members (§10.2.7.2)
[11] Operator overloading (Chapter 11)
[12] References (§5.5)
– Features primarily for program organization (in addition to classes):
[1] Templates (Chapter 13, §C.13)
[2] Inline functions (§7.1.1); added to C99
[3] Default arguments (§7.5)
[4] Function overloading (§7.4)
[5] Namespaces (§8.2)
[6] Explicit scope qualification (operator ::; §4.9.4)
[7] Exception handling (§8.3, Chapter 14)
[8] Run-time Type Identification (§15.4)
The keywords added by C++ (§B.2.2) can be used to spot most C++-specific facilities. However,
some facilities, such as function overloading and consts in constant expressions, are not identified
by a keyword. In addition to the features listed, the C++ library (§16.1.2) is mostly C++ specific.
The __cplusplus macro can be used to determine whether a program is being processed by a C
or a C++ compiler (§9.2.4).
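A common use of the macro is a header that both languages can include. The sketch below assumes a hypothetical function c_add; the linkage block is seen only by a C++ compiler, and a definition is included so that the fragment is complete.

```cpp
#ifdef __cplusplus
extern "C" {                 // C++ compiler: request C linkage
#endif

int c_add(int a, int b);     // hypothetical function shared between C and C++

#ifdef __cplusplus
}
#endif

int c_add(int a, int b) { return a + b; }   // definition, valid in both languages
```

A C compiler sees only the plain declaration and definition, because __cplusplus is not defined there.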
B.3 Coping with Older C++ Implementations
C++ has been in constant use since 1983 (§1.4). Since then, several versions have been defined and
many separately developed implementations have emerged. The fundamental aim of the standards
effort was to ensure that implementers and users would have a single definition of C++ to work
from. Until that definition becomes pervasive in the C++ community, however, we have to deal
with the fact that not every implementation provides every feature described in this book.
It is unfortunately not uncommon for people to take their first serious look at C++ using a five-year-old implementation. The typical reason is that such implementations are widely available and
free. Given a choice, no self-respecting professional would touch such an antique. For a novice,
older implementations come with serious hidden costs. The lack of language features and library
support means that the novice must struggle with problems that have been eliminated in newer
implementations. Using a feature-poor older implementation also warps the novice’s programming
style and gives a biased view of what C++ is. The best subset of C++ to initially learn is not the set
of low-level facilities (and not the common C and C++ subset; §1.2). In particular, I recommend
relying on the standard library and on templates to ease learning and to get a good initial impression of what C++ programming can be.
The first commercial release of C++ was in late 1985. The language was defined by the first
edition of this book. At that point, C++ did not offer multiple inheritance, templates, run-time type
information, exceptions, or namespaces. Today, I see no reason to use an implementation that
doesn’t provide at least some of these features. I added multiple inheritance, templates, and exceptions to the definition of C++ in 1989. However, early support for templates and exceptions was
uneven and often poor. If you find problems with templates or exceptions in an older implementation, consider an immediate upgrade.
In general, it is wise to use an implementation that conforms to the standard wherever possible
and to minimize the reliance on implementation-defined and undefined aspects of the language.
Design as if the full language were available and then use whatever workarounds are needed. This
leads to better organized and more maintainable programs than designing for the lowest-common-denominator subset of C++. Also, be careful to use implementation-specific language extensions
only when absolutely necessary.
B.3.1 Headers
Traditionally, every header file had a .h suffix. Thus, C++ implementations provided headers such
as <map.h> and <iostream.h>. For compatibility, most still do.
When the standards committee needed headers for redefined versions of standard libraries and
for newly added library facilities, naming those headers became a problem. Using the old .h
names would have caused compatibility problems. The solution was to drop the .h suffix in standard header names. The suffix is redundant anyway because the < > notation indicates that a standard header is being named.
Thus, the standard library provides non-suffixed headers, such as <iostream> and <map>. The
declarations in those files are placed in namespace std. Older headers place their declarations in the
global namespace and use a .h suffix. Consider:
#include<iostream>

int main()
{
    std::cout << "Hello, world!\n";
}
If this fails to compile on an implementation, try the more traditional version:
#include<iostream.h>

int main()
{
    cout << "Hello, world!\n";
}
Some of the most serious portability problems occur because of incompatible headers. The standard headers are only a minor contributor to this. Often, a program depends on a large number of
headers that are not present on all systems, on a large number of declarations that don’t appear in
the same headers on all systems, and on declarations that appear to be standard (because they are
found in headers with standard names) but are not part of any standard.
There are no fully-satisfactory approaches to dealing with portability in the face of inconsistent
headers. A general idea is to avoid direct dependencies on inconsistent headers and localize the
remaining dependencies. That is, we try to achieve portability through indirection and localization.
For example, if declarations that we need are provided in different headers in different systems, we
may choose to #include an application-specific header that in turn #includes the appropriate
header(s) for each system. Similarly, if some functionality is provided in slightly different forms
on different systems, we may choose to access that functionality through application-specific interface classes and functions.
B.3.2 The Standard Library
Naturally, pre-standard-C++ implementations may lack parts of the standard library. Most will
have iostreams, non-templated complex, a different string class, and the C standard library. However, some may lack map, list, valarray, etc. In such cases, use the – typically proprietary –
libraries available in a way that will allow conversion when your implementation gets upgraded to
the standard. It is usually better to use a non-standard string, list, and map than to revert to C-style
programming in the absence of these standard library classes. Also, good implementations of the
STL part of the standard library (Chapter 16, Chapter 17, Chapter 18, Chapter 19) are available free
for downloading.
Early implementations of the standard library were incomplete. For example, some had containers that didn’t support allocators and others required allocators to be explicitly specified for
each class. Similar problems occurred for other ‘‘policy arguments,’’ such as comparison criteria.
For example:
list<int> li;                           // ok, but some implementations require an allocator
list<int,allocator<int> > li2;          // ok, but some implementations don't implement allocators

map<string,Record> m1;                  // ok, but some implementations require a less-operation
map<string,Record,less<string> > m2;
Use whichever version an implementation accepts. Eventually, the implementations will accept all.
Early C++ implementations provided istrstream and ostrstream defined in <strstream.h>
instead of istringstream and ostringstream defined in <sstream>. The strstreams operated
directly on a char[] (see §21.10[26]).
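For comparison, here is a minimal sketch of the standard string streams in use. The function names format_reading() and parse_int() are invented for this illustration.

```cpp
#include <sstream>
#include <string>

// Build a string in memory; the stream owns and grows its own buffer,
// so no char[] has to be sized in advance.
std::string format_reading(const std::string& name, double value)
{
    std::ostringstream os;
    os << name << '=' << value;
    return os.str();
}

// Read the leading integer from a string; returns 0 if there is none.
int parse_int(const std::string& s)
{
    std::istringstream is(s);
    int n = 0;
    is >> n;
    return n;
}
```

Code written against the older strstreams has to manage the underlying character array itself, which is the main thing to look for when converting.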
The streams in pre-standard-C++ implementations were not parameterized. In particular, the
templates with the basic_ prefix are new in the standard, and the basic_ios class used to be called
ios. Curiously enough, iostate used to be called io_state.
B.3.3 Namespaces
If your implementation does not support namespaces, use source files to express the logical structure of the program (Chapter 9). Similarly, use header files to express interfaces that you provide
for implementations or that are shared with C.
In the absence of namespaces, use static to compensate for the lack of unnamed namespaces.
Also use an identifying prefix to global names to distinguish your names from those of other parts
of the code. For example:
// for use on pre-namespace implementations:

class bs_string { /* ... */ };              // Bjarne's string
typedef int bs_bool;                        // Bjarne's Boolean type

class joe_string;                           // Joe's string
enum joe_bool { joe_false, joe_true };      // Joe's bool
Be careful when choosing a prefix. Existing C and C++ libraries are littered with such prefixes.
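To make the correspondence concrete, here is a small sketch that shows the pre-namespace technique next to its namespace equivalent. All names (bs_count, bs_next_id, id_count, bs::next_id) are made up for the example.

```cpp
// Pre-namespace style: static gives internal linkage, a prefix marks ownership.
static int bs_count = 0;                    // visible in this source file only
int bs_next_id() { return ++bs_count; }

// The same structure once namespaces are available.
namespace {                                 // unnamed namespace replaces static
    int id_count = 0;
}

namespace bs {                              // named namespace replaces the bs_ prefix
    int next_id() { return ++id_count; }
}
```

Keeping the prefix identical to the intended namespace name makes the later conversion largely mechanical.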
B.3.4 Allocation Errors
In pre-exception-handling-C++, operator new returned 0 to indicate allocation failure. Standard
C++’s new throws bad_alloc by default.
In general, it is best to convert to the standard. In this case, this means modifying the code to catch
bad_alloc rather than test for 0. In either case, coping with memory exhaustion beyond giving an
error message is hard on many systems.
However, when converting from testing 0 to catching bad_alloc is impractical, you can sometimes modify the program to revert to the pre-exception-handling behavior. If no _new_handler is
installed, using the nothrow allocator will cause a 0 to be returned in case of allocation failure:
X* p1 = new X;              // throws bad_alloc if no memory
X* p2 = new(nothrow) X;     // returns 0 if no memory
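The two styles can be put side by side in a short sketch. The function names are invented for the illustration, and the functions do nothing useful with the memory; they only show where the failure is detected in each style.

```cpp
#include <cstddef>
#include <new>

// Old style kept alive with nothrow: failure is reported by a 0 result.
bool allocate_old_style(std::size_t n)
{
    char* p = new(std::nothrow) char[n];
    if (p == 0) return false;       // the test that older code already contains
    delete[] p;
    return true;
}

// Standard style: failure is reported by a bad_alloc exception.
bool allocate_standard(std::size_t n)
{
    try {
        char* p = new char[n];
        delete[] p;
        return true;
    }
    catch (const std::bad_alloc&) {
        return false;
    }
}
```

In real code the handler would typically be placed much further from the allocation than it is here, which is a large part of the attraction of the standard style.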
B.3.5 Templates
The standard introduced new template features and clarified the rules for several existing ones.
If your implementation doesn’t support partial specialization, use a separate name for the template that would otherwise have been a specialization. For example:
template<class T> class plist : private list<void*> {    // should have been list<T*>
    // ...
};
If your implementation doesn’t support member templates, some techniques become infeasible. In
particular, member templates allow the programmer to specify construction and conversion with a
flexibility that cannot be matched without them (§13.6.2). Sometimes, providing a nonmember
function that constructs an object is an alternative. Consider:
template<class T> class X {
    // ...
    template<class A> X(const A& a);
};
In the absence of member templates, we must restrict ourselves to specific types:
template<class T> class X {
    // ...
    X(const A1& a);
    X(const A2& a);
    // ...
};
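The nonmember alternative mentioned above might look like the following sketch. Box and box_cast are names invented for the example, and the sketch assumes that the implementation accepts explicitly specified arguments for function templates, which some older implementations did not.

```cpp
template<class T> class Box {
    T val;
public:
    explicit Box(const T& v) : val(v) { }
    const T& get() const { return val; }
};

// A nonmember function template standing in for the converting constructor
// that would otherwise have been a member template of Box.
template<class T, class A> Box<T> box_cast(const Box<A>& a)
{
    return Box<T>(static_cast<T>(a.get()));
}
```

A call such as box_cast<double>(bi) names the target element type explicitly and lets the source type be deduced from the argument.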
Most early implementations generated definitions for all member functions defined within a template class when that template class was instantiated. This could lead to errors in unused member
functions (§C.13.9.1). The solution is to place the definition of the member functions after the
class declaration. For example, rather than
template<class T> class Container {
    // ...
public:
    void sort() { /* use < */ }     // in-class definition
};

class Glob { /* no < for Glob */ };

Container<Glob> cg;     // some pre-standard implementations try to define Container<Glob>::sort()
use
template<class T> class Container {
    // ...
public:
    void sort();
};

template<class T> void Container<T>::sort() { /* use < */ }     // out-of-class definition

class Glob { /* no < for Glob */ };

Container<Glob> cg;     // no problem as long as cg.sort() isn't called
Early implementations of C++ did not handle the use of members defined later in a class. For
example:
template<class T> class Vector {
public:
    T& operator[](size_t i) { return v[i]; }    // v declared below
    // ...
private:
    T* v;           // oops: not found!
    size_t sz;
};
In such cases, either sort the member declarations to avoid the problem or place the definition of
the member function after the class declaration.
Some pre-standard-C++ implementations do not accept default arguments for templates
(§13.4.1). In that case, every template parameter must be given an explicit argument. For example:
template<class Key, class T, class LT = less<Key> > class map {
    // ...
};

map<string,int> m;                      // Oops: default template arguments not implemented
map< string,int,less<string> > m2;      // workaround: be explicit
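When every argument has to be spelled out, a typedef keeps the repetition in one place. The following is a sketch using the standard map; Word_count and count_word are names chosen for the illustration.

```cpp
#include <functional>
#include <map>
#include <string>

// All template arguments are given once, here, and nowhere else.
typedef std::map<std::string, int, std::less<std::string> > Word_count;

// Record one more occurrence of w and return the updated count.
int count_word(Word_count& wc, const std::string& w)
{
    return ++wc[w];
}
```

When the implementation is upgraded, only the typedef needs to be simplified.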
B.3.6 For-Statement Initializers
Consider:
void f(vector<char>& v, int m)
{
    for (int i= 0; i<v.size() && i<=m; ++i) cout << v[i];

    if (i == m) {       // error: i referred to after end of for-statement
        // ...
    }
}
Such code used to work because in the original definition of C++, the scope of the controlled variable extended to the end of the scope in which the for-statement appears. If you find such code,
simply declare the controlled variable before the for-statement:
void f2(vector<char>& v, int m)
{
    int i= 0;       // i needed after the loop
    for (; i<v.size() && i<=m; ++i) cout << v[i];

    if (i == m) {
        // ...
    }
}
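A small probe can reveal which scope rule an implementation follows. This is only a sketch; the function name i_after_loop() is invented, and the global i exists purely to give the final use of i something to refer to under the standard rule.

```cpp
int i = -1;     // global i, distinct from the loop variable below

// Under the standard rule the loop's i goes out of scope at the end of the
// for-statement, so the i in the return statement names the global.
// Under the original rule the loop's i would still be in scope there.
int i_after_loop()
{
    for (int i = 0; i < 3; ++i) { /* empty */ }
    return i;
}
```

On a standard-conforming implementation the function yields the value of the global; an implementation following the original rule would yield the final value of the loop variable instead.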
B.4 Advice
[1] For learning C++, use the most up-to-date and complete implementation of Standard C++ that
you can get access to; §B.3.
[2] The common subset of C and C++ is not the best initial subset of C++ to learn; §1.6, §B.3.
[3] For production code, remember that not every C++ implementation is completely up-to-date.
Before using a major new feature in production code, try it out by writing small programs to
test the standards conformance and performance of the implementations you plan to use; for
example, see §8.5[6-7], §16.5[10], §B.5[7].
[4] Avoid deprecated features such as global statics; also avoid C-style casts; §6.2.7, §B.2.3.
[5] ‘‘implicit int’’ has been banned, so explicitly specify the type of every function, variable,
const, etc.; §B.2.2.
[6] When converting a C program to C++, first make sure that function declarations (prototypes)
and standard headers are used consistently; §B.2.2.
[7] When converting a C program to C++, rename variables that are C++ keywords; §B.2.2.
[8] When converting a C program to C++, cast the result of malloc() to the proper type or change
all uses of malloc() to uses of new; §B.2.2.
[9] When converting from malloc() and free() to new and delete, consider using vector,
push_back(), and reserve() instead of realloc(); §3.8, §16.3.5.
[10] When converting a C program to C++, remember that there are no implicit conversions from
ints to enumerations; use explicit type conversion where necessary; §4.8.
[11] A facility defined in namespace std is defined in a header without a suffix (e.g. std::cout is
declared in <iostream>). Older implementations have standard library facilities in the global
namespace and declared in headers with a .h suffix (e.g. ::cout declared in <iostream.h>);
§9.2.2, §B.3.1.
[12] If older code tests the result of new against 0, it must be modified to catch bad_alloc or to use
new(nothrow); §B.3.4.
[13] If your implementation doesn’t support default template arguments, provide arguments explicitly; typedefs can often be used to avoid repetition of template arguments (similar to the way
the typedef string saves you from saying basic_string< char, char_traits<char>,
allocator<char> >); §B.3.5.
[14] Use <string> to get std::string (<string.h> holds the C-style string functions); §9.2.2,
§B.3.1.
[15] For each standard C header <X.h> that places names in the global namespace, the header
<cX> places the names in namespace std; §B.3.1.
[16] Many systems have a "String.h" header defining a string type. Note that such strings differ
from the standard library string.
[17] Prefer standard facilities to non-standard ones; §20.1, §B.3, §C.2.
[18] Use extern "C" when declaring C functions; §9.2.4.
B.5 Exercises
1. (∗2.5) Take a C program and convert it to a C++ program; list the kinds of non-C++ constructs
used and determine if they are valid ANSI C constructs. First convert the program to strict
ANSI C (adding prototypes, etc.), then to C++. Estimate the time it would take to convert a
100,000 line C program to C++.
2. (∗2.5) Write a program to help convert C programs to C++ by renaming variables that are C++
keywords, replacing calls of malloc() by uses of new, etc. Hint: don’t try to do a perfect job.
3. (∗2) Replace all uses of malloc() in a C-style C++ program (maybe a recently converted C program) with uses of new. Hint: §B.4[8-9].
4. (∗2.5) Minimize the use of macros, global variables, uninitialized variables, and casts in a C-style C++ program (maybe a recently converted C program).
5. (∗3) Take a C++ program that is the result of a crude conversion from C and critique it as a C++
program considering locality of information, abstraction, readability, extensibility, and potential
for reuse of parts. Make one significant change to the program based on that critique.
6. (∗2) Take a small (say, 500 line) C++ program and convert it to C. Compare the original with
the result for size and probable maintainability.
7. (∗3) Write a small set of test programs to determine whether a C++ implementation has ‘‘the
latest’’ standard features. For example, what is the scope of a variable defined in a for-statement initializer? (§B.3.6), are default template arguments supported? (§B.3.5), are member
templates supported? (§13.6.2), and is argument-based lookup supported? (§8.2.6). Hint:
§B.2.4.
8. (∗2.5) Take a C++ program that uses <X.h> headers and convert it to using <X> and <cX>
headers. Minimize the use of using-directives.
Appendix C
Technicalities
Deep in the fundamental
heart of mind and Universe,
there is a reason.
– Slartibartfast
What the standard promises – character sets – integer literals – constant expressions
– promotions and conversions – multidimensional arrays – fields and unions –
memory management – garbage collection – namespaces – access control – pointers
to data members – templates – static members – friends – templates as template
parameters – template argument deduction – typename and template qualification –
instantiation – name binding – templates and namespaces – explicit instantiation –
advice.
C.1 Introduction and Overview
This chapter presents technical details and examples that do not fit neatly into my presentation of
the main C++ language features and their uses. The details presented here can be important when
you are writing a program and essential when reading code written using them. However, I consider them technical details that should not be allowed to distract from the student’s primary task of
learning to use C++ well or the programmer’s primary task of expressing ideas as clearly and as
directly as possible in C++.
C.2 The Standard
Contrary to common belief, strictly adhering to the C++ language and library standard doesn’t guarantee good code or even portable code. The standard doesn’t say whether a piece of code is good
or bad; it simply says what a programmer can and cannot rely on from an implementation. One can
write perfectly awful standard-conforming programs, and most real-world programs rely on features not covered by the standard.
Many important things are deemed implementation-defined by the standard. This means that
each implementation must provide a specific, well-defined behavior for a construct and that behavior must be documented. For example:
unsigned char c1 = 64;      // well-defined: a char has at least 8 bits and can always hold 64
unsigned char c2 = 1256;    // implementation-defined: truncation if a char has only 8 bits
The initialization of c1 is well-defined because a char must be at least 8 bits. However, the behavior of the initialization of c2 is implementation-defined because the number of bits in a char is
implementation-defined. If the char has only 8 bits, the value 1256 will be truncated to 232
(§C.6.2.1). Most implementation-defined features relate to differences in the hardware used to run
a program.
When writing real-world programs, it is usually necessary to rely on implementation-defined
behavior. Such behavior is the price we pay for the ability to operate effectively on a large range of
systems. For example, the language would have been much simpler if all characters had been 8 bits
and all integers 32 bits. However, 16-bit and 32-bit character sets are not uncommon – nor are
integers too large to fit in 32 bits. For example, many computers now have disks that hold more
than 32G bytes, so 48-bit or 64-bit integers can be useful for representing disk addresses.
To maximize portability, it is wise to be explicit about what implementation-defined features
we rely on and to isolate the more subtle examples in clearly marked sections of a program. A typical example of this practice is to present all dependencies on hardware sizes in the form of constants and type definitions in some header file. To support such techniques, the standard library
provides numeric_limits (§22.2).
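As a sketch of this practice, the constants below gather a few hardware-dependent sizes in one place, as an application header might. The names char_bits, char_is_signed, int_bits, and low_bits() are assumptions made for the example.

```cpp
#include <limits>

// Hardware sizes the program relies on, stated once and by name.
const int char_bits = std::numeric_limits<unsigned char>::digits;
const bool char_is_signed = std::numeric_limits<char>::is_signed;
const int int_bits = std::numeric_limits<unsigned int>::digits;

// Conversion to an unsigned type keeps the low-order bits of the value,
// so the result depends on how many bits an unsigned char has.
unsigned char low_bits(int value)
{
    return static_cast<unsigned char>(value);
}
```

Where a char has exactly 8 bits, low_bits(1256) yields 232, because 1256 is 4*256+232.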
Undefined behavior is nastier. A construct is deemed undefined by the standard if no reasonable behavior is required by an implementation. Typically, some obvious implementation technique will cause a program using an undefined feature to behave very badly. For example:
const int size = 4*1024;
char page[size];

void f()
{
    page[size+size] = 7;    // undefined
}
Plausible outcomes of this code fragment include overwriting unrelated data and triggering a hardware error/exception. An implementation is not required to choose among plausible outcomes.
Where powerful optimizers are used, the actual effects of undefined behavior can become quite
unpredictable. If a set of plausible and easily implementable alternatives exist, a feature is deemed
implementation-defined rather than undefined.
It is worth spending considerable time and effort to ensure that a program does not use something deemed undefined by the standard. In many cases, tools exist to help do this.
C.3 Character Sets
The examples in this book are written using the U.S. variant of the international 7-bit character set
ISO 646-1983 called ASCII (ANSI3.4-1968). This can cause three problems for people who use
C++ in an environment with a different character set:
[1] ASCII contains punctuation characters and operator symbols – such as ], {, and ! – that
are not available in some character sets.
[2] We need a notation for characters that do not have a convenient character representation
(e.g., newline and ‘‘the character with value 17’’).
[3] ASCII doesn’t contain characters – such as ζ, æ, and Π – that are used for writing languages other than English.
C.3.1 Restricted Character Sets
The ASCII special characters [, ], {, }, |, and \ occupy character set positions designated as
alphabetic by ISO. In most European national ISO-646 character sets, these positions are occupied
by letters not found in the English alphabet. For example, the Danish national character set uses
them for the vowels Æ, æ, Ø, ø, Å, and å. No significant amount of text can be written in Danish
without them.
A set of trigraphs is provided to allow national characters to be expressed in a portable way
using a truly standard minimal character set. This can be useful for interchange of programs, but it
doesn’t make it easier for people to read programs. Naturally, the long-term solution to this problem is for C++ programmers to get equipment that supports both their native language and C++
well. Unfortunately, this appears to be infeasible for some, and the introduction of new equipment
can be a frustratingly slow process. To help programmers stuck with incomplete character sets,
C++ provides alternatives:
Keywords            Digraphs        Trigraphs
and       &&        <%      {       ??=     #
and_eq    &=        %>      }       ??(     [
bitand    &         <:      [       ??<     {
bitor     |         :>      ]       ??/     \
compl     ~         %:      #       ??)     ]
not       !         %:%:    ##      ??>     }
or        ||                        ??'     ^
or_eq     |=                        ??!     |
xor       ^                         ??-     ~
xor_eq    ^=                        ???     ?
not_eq    !=
Programs using the keywords and digraphs are far more readable than the equivalent programs
written using trigraphs. However, if characters such as { are not available, trigraphs are necessary
for putting ‘‘missing’’ characters into strings and character constants. For example, '{' becomes
'??<'.
Some people prefer the keywords such as and to their traditional operator notation.
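Here is a short sketch of what code written with the keywords and digraphs looks like. The function names are invented, and some implementations are known to accept the keywords only in a strict-conformance mode, so treat this as an illustration rather than a portability guarantee.

```cpp
// The function bodies use <% %> for braces, <: :> for subscripting,
// and the keywords and and not_eq in place of the operator symbols.
bool in_range(int x, int low, int high)
<%
    return low <= x and x <= high;
%>

int first_nonzero(const int* a, int n)
<%
    for (int k = 0; k < n; ++k)
        if (a<:k:> not_eq 0) return a<:k:>;
    return 0;
%>
```

The digraphs are ordinary tokens, so they are not replaced inside string or character literals; that is the case for which trigraphs remain necessary.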
C.3.2 Escape Characters
A few characters have standard names that use the backslash \ as an escape character:
Name                ASCII Name      C++ Name
newline             NL (LF)         \n
horizontal tab      HT              \t
vertical tab        VT              \v
backspace           BS              \b
carriage return     CR              \r
form feed           FF              \f
alert               BEL             \a
backslash           \               \\
question mark       ?               \?
single quote        '               \'
double quote        "               \"
octal number        ooo             \ooo
hex number          hhh             \xhhh ...
Despite their appearance, these are single characters.
It is possible to represent a character as a one-, two-, or three-digit octal number (\ followed by
octal digits) or as a hexadecimal number (\x followed by hexadecimal digits). There is no limit to
the number of hexadecimal digits in the sequence. A sequence of octal or hexadecimal digits is terminated by the first character that is not an octal digit or a hexadecimal digit, respectively. For
example:
Octal       Hexadecimal     Decimal     ASCII
'\6'        '\x6'           6           ACK
'\60'       '\x30'          48          '0'
'\137'      '\x05f'         95          '_'
This makes it possible to represent every character in the machine’s character set and, in particular,
to embed such characters in character strings (see §5.2.2). Using any numeric notation for characters makes a program nonportable across machines with different character sets.
It is possible to enclose more than one character in a character literal, for example 'ab'. Such
uses are archaic, implementation-dependent, and best avoided.
When embedding a numeric constant in a string using the octal notation, it is wise always to use
three digits for the number. The notation is hard enough to read without having to worry about
whether or not the character after a constant is a digit. For hexadecimal constants, use two digits.
Consider these examples:
char v1[] = "a\xah\129";    // 6 chars: 'a' '\xa' 'h' '\12' '9' '\0'
char v2[] = "a\xah\127";    // 5 chars: 'a' '\xa' 'h' '\127' '\0'
char v3[] = "a\xad\127";    // 4 chars: 'a' '\xad' '\127' '\0'
char v4[] = "a\xad\0127";   // 5 chars: 'a' '\xad' '\012' '7' '\0'
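The counts in these comments can be checked with sizeof, as in the sketch below. It also shows one common way of ending a hexadecimal escape that would otherwise swallow the following characters: splitting the literal in two. The array names w1, w4, and w5 are chosen for the example only.

```cpp
// sizeof counts every element, including the terminating '\0'.
const char w1[] = "a\xah\129";      // 'a' '\xa' 'h' '\12' '9' '\0'
const char w4[] = "a\xad\0127";     // 'a' '\xad' '\012' '7' '\0'

// Adjacent literals are joined only after escapes have been processed, so
// splitting the string ends the hexadecimal escape before 'b', which would
// otherwise be read as one more hexadecimal digit.
const char w5[] = "\x0a" "bad";     // '\x0a' 'b' 'a' 'd' '\0'
```

Here sizeof(w1) is 6, and sizeof(w4) and sizeof(w5) are both 5.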
C.3.3 Large Character Sets
A C++ program may be written and presented to the user in character sets that are much richer than
the 128-character ASCII set. Where an implementation supports larger character sets, identifiers,
comments, character constants, and strings may contain characters such as å, β, and Γ. However, to
be portable the implementation must map these characters into an encoding using only characters
available to every C++ user. In principle, this translation into the C++ basic source character set
(the set used in this book) occurs before the compiler does any other processing. Therefore, it does
not affect the semantics of the program.
The standard encoding of characters from large character sets into the smaller set supported
directly by C++ is presented as sequences of four or eight hexadecimal digits:
universal-character-name:
\U X X X X X X X X
\u X X X X
Here, X represents a hexadecimal digit. For example, \u1e2b. The shorter notation \uXXXX is
equivalent to \U0000XXXX. A number of hexadecimal digits different from four or eight is a lexical error.
A programmer can use these character encodings directly. However, they are primarily meant
as a way for an implementation that internally uses a small character set to handle characters from a
large character set seen by the programmer.
If you rely on special environments to provide an extended character set for use in identifiers,
the program becomes less portable. A program is hard to read unless you understand the natural
language used for identifiers and comments. Consequently, for programs used internationally it is
usually best to stick to English and ASCII.
C.3.4 Signed and Unsigned Characters
It is implementation-defined whether a plain char is considered signed or unsigned. This opens the
possibility for some nasty surprises and implementation dependencies. For example:
char c = 255;   // 255 is ‘‘all ones,’’ hexadecimal 0xFF
int i = c;
What will be the value of i? Unfortunately, the answer is undefined. On all implementations I
know of, the answer depends on the meaning of the ‘‘all ones’’ char bit pattern when extended into
an int. On an SGI Challenge machine, a char is unsigned, so the answer is 255. On a Sun SPARC
or an IBM PC, where a char is signed, the answer is -1. In this case, the compiler might warn
about the conversion of the literal 255 to the char value -1. However, C++ does not offer a general
mechanism for detecting this kind of problem. One solution is to avoid plain char and use the specific char types only. Unfortunately, some standard library functions, such as strcmp(), take plain
chars only (§20.4.1).
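Where plain char cannot be avoided, a common idiom is to go through unsigned char before widening. The following is a minimal sketch; char_code() is a name invented for the example.

```cpp
// Convert a plain char to an int in the range 0 to UCHAR_MAX, whether
// plain char is signed or unsigned on this implementation.
int char_code(char c)
{
    return static_cast<unsigned char>(c);
}
```

With this helper, the ‘‘all ones’’ pattern of an 8-bit char is reported as 255 on both kinds of machine, rather than as 255 on one and -1 on the other.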
A char must behave identically to either a signed char or an unsigned char. However, the
three char types are distinct, so you can’t mix pointers to different char types. For example:
void f(char c, signed char sc, unsigned char uc)
{
    char* pc = &uc;             // error: no pointer conversion
    signed char* psc = pc;      // error: no pointer conversion
    unsigned char* puc = pc;    // error: no pointer conversion
    psc = puc;                  // error: no pointer conversion
}
Variables of the three char types can be freely assigned to each other. However, assigning a too-large value to a signed char (§C.6.2.1) is still undefined. For example:
void f(char c, signed char sc, unsigned char uc)
{
    c = 255;    // undefined if plain chars are signed and have 8 bits
    c = sc;     // ok
    c = uc;     // undefined if plain chars are signed and if uc's value is too large
    sc = uc;    // undefined if uc's value is too large
    uc = sc;    // ok: conversion to unsigned
    sc = c;     // undefined if plain chars are unsigned and if c's value is too large
    uc = c;     // ok: conversion to unsigned
}
None of these potential problems occurs if you use plain char throughout.
C.4 Types of Integer Literals
In general, the type of an integer literal depends on its form, value, and suffix:
– If it is decimal and has no suffix, it has the first of these types in which its value can be
  represented: int, long int, unsigned long int.
– If it is octal or hexadecimal and has no suffix, it has the first of these types in which its
  value can be represented: int, unsigned int, long int, unsigned long int.
– If it is suffixed by u or U, its type is the first of these types in which its value can be
  represented: unsigned int, unsigned long int.
– If it is suffixed by l or L, its type is the first of these types in which its value can be
  represented: long int, unsigned long int.
– If it is suffixed by ul, lu, uL, Lu, Ul, lU, UL, or LU, its type is unsigned long int.
For example, 100000 is of type int on a machine with 32-bit ints but of type long int on a machine
with 16-bit ints and 32-bit longs. Similarly, 0XA000 is of type int on a machine with 32-bit ints
but of type unsigned int on a machine with 16-bit ints. These implementation dependencies can be
avoided by using suffixes: 100000L is of type long int on all machines and 0XA000U is of type
unsigned int on all machines.
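One way to see which type a literal has on a given machine is to let overload resolution report it. This is only a sketch: literal_type() is a hypothetical helper, not a library function, and the answer for unsuffixed literals depends on the implementation as described above.

```cpp
#include <string>

// One overload per candidate type; the exact match for the literal's type is chosen.
std::string literal_type(int)           { return "int"; }
std::string literal_type(unsigned int)  { return "unsigned int"; }
std::string literal_type(long)          { return "long int"; }
std::string literal_type(unsigned long) { return "unsigned long int"; }
```

For example, literal_type(100000L) is "long int" and literal_type(0XA000U) is "unsigned int" everywhere, whereas literal_type(100000) and literal_type(0XA000) depend on the size of int.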
C.5 Constant Expressions
In places such as array bounds (§5.2), case labels (§6.3.2), and initializers for enumerators (§4.8),
C++ requires a constant expression. A constant expression evaluates to an integral or enumeration
constant. Such an expression is composed of literals (§4.3.1, §4.4.1, §4.5.1), enumerators (§4.8),
and consts initialized by constant expressions. In a template, an integer template parameter can
also be used (§C.13.3). Floating literals (§4.5.1) can be used only if explicitly converted to an
integral type. Functions, class objects, pointers, and references can be used as operands to the sizeof
operator (§6.2) only.
Intuitively, constant expressions are simple expressions that can be evaluated by the compiler
before the program is linked (§9.1) and starts to run.
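The places that require constant expressions can be illustrated together. This is a small sketch; all names (rows, cols, Limits, Buffer, table, describe) are invented for the example.

```cpp
#include <string>

const int rows = 3;                      // const initialized by a literal
const int cols = rows + 2;               // const initialized by a constant expression
const int half = int(2.5);               // floating literal explicitly converted to int

enum Limits { max_cells = rows * cols }; // initializer for an enumerator

int table[rows][cols];                   // array bounds

template<int N> struct Buffer {          // integer template parameter used as a bound
    char data[N];
};

std::string describe(int n)
{
    switch (n) {
    case rows:      return "rows";       // case labels
    case cols:      return "cols";
    case max_cells: return "cells";
    default:        return "other";
    }
}
```

Replacing rows by a variable that is not const, or by a const initialized at run time, makes each of these uses an error.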
C.6 Implicit Type Conversion
Integral and floating-point types (§4.1.1) can be mixed freely in assignments and expressions.
Wherever possible, values are converted so as not to lose information. Unfortunately, value-destroying
conversions are also performed implicitly. This section provides a description of conversion rules,
conversion problems, and their resolution.
C.6.1 Promotions
The implicit conversions that preserve values are commonly referred to as promotions. Before an
arithmetic operation is performed, integral promotion is used to create ints out of shorter integer
types. Note that these promotions will not promote to long (unless the operand is a wchar_t or an
enumeration that is already larger than an int). This reflects the original purpose of these
promotions in C: to bring operands to the ‘‘natural’’ size for arithmetic operations.
The integral promotions are:
– A char, signed char, unsigned char, short int, or unsigned short int is converted to an int
  if int can represent all the values of the source type; otherwise, it is converted to an
  unsigned int.
– A wchar_t (§4.3) or an enumeration type (§4.8) is converted to the first of the following
  types that can represent all the values of its underlying type: int, unsigned int, long, or
  unsigned long.
– A bit-field (§C.8.1) is converted to an int if int can represent all the values of the bit-field;
  otherwise, it is converted to unsigned int if unsigned int can represent all the values of the
  bit-field. Otherwise, no integral promotion applies to it.
– A bool is converted to an int; false becomes 0 and true becomes 1.
Promotions are used as part of the usual arithmetic conversions (§C.6.3).
C.6.2 Conversions
The fundamental types can be converted into each other in a bewildering number of ways. In my
opinion, too many conversions are allowed. For example:
    void f(double d)
    {
        char c = d;    // beware: double-precision floating-point to char conversion
    }
When writing code, you should always aim to avoid undefined behavior and conversions that quietly throw away information. A compiler can warn about many questionable conversions. Fortunately, many compilers actually do.
C.6.2.1 Integral Conversions
An integer can be converted to another integer type. An enumeration value can be converted to an
integer type.
If the destination type is unsigned, the resulting value is simply as many bits from the source as
will fit in the destination (high-order bits are thrown away if necessary). More precisely, the result
is the least unsigned integer congruent to the source integer modulo 2 to the nth power, where n is
the number of bits used to represent the unsigned type. For example:
    unsigned char uc = 1023;    // binary 1111111111: uc becomes binary 11111111; that is, 255
If the destination type is signed, the value is unchanged if it can be represented in the destination
type; otherwise, the value is implementation-defined:
    signed char sc = 1023;      // implementation-defined
Plausible results are 255 and -1 (§C.3.4).
A Boolean or enumeration value can be implicitly converted to its integer equivalent (§4.2,
§4.8).
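The modulo rule for unsigned destinations can be checked against ordinary arithmetic. This is a sketch with invented names (to_uchar, reduced); it uses UCHAR_MAX from <climits> so that it does not assume 8-bit chars.

```cpp
#include <climits>

// Implicit conversion of an int to unsigned char.
unsigned char to_uchar(int i)
{
    unsigned char uc = i;    // keeps the value modulo UCHAR_MAX+1
    return uc;
}

// The same result computed arithmetically: the least non-negative value
// congruent to i modulo UCHAR_MAX+1.
int reduced(int i)
{
    const int modulus = UCHAR_MAX + 1;
    int r = i % modulus;     // may be negative for negative i
    return r < 0 ? r + modulus : r;
}
```

For every int i, to_uchar(i) and reduced(i) agree; in particular, to_uchar(-1) is UCHAR_MAX.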
C.6.2.2 Floating-Point Conversions
A floating-point value can be converted to another floating-point type. If the source value can be
exactly represented in the destination type, the result is the original numeric value. If the source
value is between two adjacent destination values, the result is one of those values. Otherwise, the
behavior is undefined. For example:
    float f = FLT_MAX;      // largest float value
    double d = f;           // ok: d == f
    float f2 = d;           // ok: f2 == f
    double d3 = DBL_MAX;    // largest double value
    float f3 = d3;          // undefined if FLT_MAX<DBL_MAX
C.6.2.3 Pointer and Reference Conversions
Any pointer to an object type can be implicitly converted to a void* (§5.6). A pointer (reference)
to a derived class can be implicitly converted to a pointer (reference) to an accessible and
unambiguous base (§12.2). Note that a pointer to function or a pointer to member cannot be implicitly
converted to a void*.
A constant expression (§C.5) that evaluates to 0 can be implicitly converted to any pointer or
pointer to member type (§5.1.1). For example:
    int* p = !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!1;
A T* can be implicitly converted to a const T* (§5.4.1). Similarly, a T& can be implicitly
converted to a const T&.
C.6.2.4 Pointer-to-Member Conversions
Pointers and references to members can be implicitly converted as described in §15.5.1.
C.6.2.5 Boolean Conversions
Pointers, integral, and floating-point values can be implicitly converted to bool (§4.2). A nonzero
value converts to true; a zero value converts to false. For example:
    void f(int* p, int i)
    {
        bool is_not_zero = p;    // true if p!=0
        bool b2 = i;             // true if i!=0
    }
C.6.2.6 Floating-Integral Conversions
When a floating-point value is converted to an integer value, the fractional part is discarded. In
other words, conversion from a floating-point type to an integer type truncates. For example, the
value of int(1.6) is 1. The behavior is undefined if the truncated value cannot be represented in
the destination type. For example:
    int i = 2.7;        // i becomes 2
    char b = 2000.7;    // undefined for 8-bit chars: 2000 cannot be represented as an 8-bit char
Conversions from integer to floating types are as mathematically correct as the hardware allows.
Loss of precision occurs if an integral value cannot be represented exactly as a value of the floating
type. For example,
    int i = float(1234567890);
left i with the value 1234567936 on a machine where both ints and floats are represented using 32
bits.
Clearly, it is best to avoid potentially value-destroying implicit conversions. In fact, compilers
can detect and warn against some obviously dangerous conversions, such as floating to integral and
long int to char. However, general compile-time detection is impractical, so the programmer must
be careful. When ‘‘being careful’’ isn’t enough, the programmer can insert explicit checks. For
example:
    class check_failed { };

    char checked(int i)
    {
        char c = i;    // warning: not portable (§C.6.2.1)
        if (i != c) throw check_failed();
        return c;
    }

    void my_code(int i)
    {
        char c = checked(i);
        // ...
    }
To truncate in a way that is guaranteed to be portable requires the use of numeric_limits (§22.2).
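A range check written with numeric_limits tests the value before the conversion takes place, so it never relies on an implementation-defined result. The following is one possible sketch; checked_narrow() is a hypothetical name, and I throw the standard out_of_range exception rather than a user-defined class purely for brevity.

```cpp
#include <limits>
#include <stdexcept>

// Convert an int to a plain char, refusing values that a char cannot represent.
char checked_narrow(int i)
{
    if (i < std::numeric_limits<char>::min() || std::numeric_limits<char>::max() < i)
        throw std::out_of_range("value does not fit in a char");
    return static_cast<char>(i);    // safe: i is known to be representable
}
```

A call such as checked_narrow(65) returns the char with value 65, whereas checked_narrow(1000) throws on a machine with 8-bit chars.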
C.6.3 Usual Arithmetic Conversions
These conversions are performed on the operands of a binary operator to bring them to a common
type, which is then used as the type of the result:
[1] If either operand is of type long double, the other is converted to long double.
    – Otherwise, if either operand is double, the other is converted to double.
    – Otherwise, if either operand is float, the other is converted to float.
    – Otherwise, integral promotions (§C.6.1) are performed on both operands.
[2] Then, if either operand is unsigned long, the other is converted to unsigned long.
    – Otherwise, if one operand is a long int and the other is an unsigned int, then if a long int
      can represent all the values of an unsigned int, the unsigned int is converted to a long int;
      otherwise, both operands are converted to unsigned long int.
    – Otherwise, if either operand is long, the other is converted to long.
    – Otherwise, if either operand is unsigned, the other is converted to unsigned.
    – Otherwise, both operands are int.
C.7 Multidimensional Arrays
It is not uncommon to need a vector of vectors, a vector of vector of vectors, etc. The issue is how
to represent these multidimensional vectors in C++. Here, I first show how to use the standard
library vector class. Next, I present multidimensional arrays as they appear in C and C++ programs
using only built-in facilities.
C.7.1 Vectors
The standard vector (§16.3) provides a very general solution:
    vector< vector<int> > m;
This creates a vector of vectors of integers that initially contains no elements. We could initialize it
to a three-by-five matrix like this:
    void init_m()
    {
        m.resize(3);    // m now holds 3 empty vectors
        for (int i = 0; i<m.size(); i++) {
            m[i].resize(5);    // now each of m’s vectors holds 5 ints
            for (int j = 0; j<m[i].size(); j++) m[i][j] = 10*i+j;
        }
    }
or graphically:
    m:    [3]
    m[0]: [5] -> 00 01 02 03 04
    m[1]: [5] -> 10 11 12 13 14
    m[2]: [5] -> 20 21 22 23 24
Each vector implementation holds a pointer to its elements plus the number of elements. The
elements are typically held in an array. For illustration, I gave each int an initial value
representing its coordinates.
It is not necessary for the vector<int>s in the vector< vector<int> > to have the same size.
Accessing an element is done by indexing twice. For example, m[i][j] is the jth element of
the ith vector. We can print m like this:
    void print_m()
    {
        for (int i = 0; i<m.size(); i++) {
            for (int j = 0; j<m[i].size(); j++) cout << m[i][j] << '\t';
            cout << '\n';
        }
    }
which gives:
    0     1     2     3     4
    10    11    12    13    14
    20    21    22    23    24
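Because the inner vectors need not have the same size, a vector of vectors can also hold a ‘‘ragged’’ table. The following sketch builds a triangular one; make_triangle() is a name invented for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Build a table in which row i holds i+1 ints, each initialized to 10*i+j.
std::vector< std::vector<int> > make_triangle(std::size_t rows)
{
    std::vector< std::vector<int> > t(rows);
    for (std::size_t i = 0; i < t.size(); ++i) {
        t[i].resize(i + 1);    // each row has its own size
        for (std::size_t j = 0; j < t[i].size(); ++j)
            t[i][j] = static_cast<int>(10*i + j);
    }
    return t;
}
```

Code that traverses such a table must ask each row for its size(), as print_m() does, rather than assume a common second dimension.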
C.7.2 Arrays
The built-in arrays are a major source of errors – especially when they are used to build multidimensional arrays. For novices, they are also a major source of confusion. Wherever possible, use
vector, list, valarray, string, etc.
Multidimensional arrays are represented as arrays of arrays. A three-by-five array is declared
like this:
    int ma[3][5];    // 3 arrays with 5 ints each
For arrays, the dimensions must be given as part of the definition. We can initialize ma like this:
    void init_ma()
    {
        for (int i = 0; i<3; i++) {
            for (int j = 0; j<5; j++) ma[i][j] = 10*i+j;
        }
    }
or graphically:
ma:
00 01 02 03 04 10 11 12 13 14 20 21 22 23 24
The array ma is simply 15 ints that we access as if it were 3 arrays of 5 ints. In particular, there is
no single object in memory that is the matrix ma – only the elements are stored. The dimensions 3
and 5 exist in the compiler source only. When we write code, it is our job to remember them
somehow and supply the dimensions where needed. For example, we might print ma like this:
    void print_ma()
    {
        for (int i = 0; i<3; i++) {
            for (int j = 0; j<5; j++) cout << ma[i][j] << '\t';
            cout << '\n';
        }
    }
The comma notation used for array bounds in some languages cannot be used in C++ because the
comma (,) is a sequencing operator (§6.2.2). Fortunately, most mistakes are caught by the compiler. For example:
    int bad[3,5];            // error: comma not allowed in constant expression
    int good[3][5];          // 3 arrays with 5 ints each
    int ouch = good[1,4];    // error: int initialized by int* (good[1,4] means good[4], which is an int*)
    int nice = good[1][4];
C.7.3 Passing Multidimensional Arrays
Consider defining a function to manipulate a two-dimensional matrix. If the dimensions are known
at compile time, there is no problem:
    void print_m35(int m[3][5])
    {
        for (int i = 0; i<3; i++) {
            for (int j = 0; j<5; j++) cout << m[i][j] << '\t';
            cout << '\n';
        }
    }
A matrix represented as a multidimensional array is passed as a pointer (rather than copied; §5.3).
The first dimension of an array is irrelevant to the problem of finding the location of an element; it
simply states how many elements (here 3) of the appropriate type (here int[5]) are present. For
example, look at the previous representation of ma and note that by our knowing only that the
second dimension is 5, we can locate ma[i][5] for any i. The first dimension can therefore be
passed as an argument:
    void print_mi5(int m[][5], int dim1)
    {
        for (int i = 0; i<dim1; i++) {
            for (int j = 0; j<5; j++) cout << m[i][j] << '\t';
            cout << '\n';
        }
    }
The difficult case is when both dimensions need to be passed. The ‘‘obvious solution’’ simply
does not work:
    void print_mij(int m[][], int dim1, int dim2)    // doesn’t behave as most people would think
    {
        for (int i = 0; i<dim1; i++) {
            for (int j = 0; j<dim2; j++) cout << m[i][j] << '\t';    // surprise!
            cout << '\n';
        }
    }
First, the argument declaration m[][] is illegal because the second dimension of a multidimensional
array must be known in order to find the location of an element. Second, the expression
m[i][j] is (correctly) interpreted as *(*(m+i)+j), although that is unlikely to be what the
programmer intended. A correct solution is:
    void print_mij(int* m, int dim1, int dim2)
    {
        for (int i = 0; i<dim1; i++) {
            for (int j = 0; j<dim2; j++) cout << m[i*dim2+j] << '\t';    // obscure
            cout << '\n';
        }
    }
The expression used for accessing the members in print_mij() is equivalent to the one the compiler
generates when it knows the last dimension.
To call this function, we pass a matrix as an ordinary pointer:
    int main()
    {
        int v[3][5] = { {0,1,2,3,4}, {10,11,12,13,14}, {20,21,22,23,24} };
        print_m35(v);
        print_mi5(v,3);
        print_mij(&v[0][0],3,5);
    }
Note the use of &v[0][0] for the last call; v[0] would do because it is equivalent, but v would be
a type error. This kind of subtle and messy code is best hidden. If you must deal directly with
multidimensional arrays, consider encapsulating the code relying on them. In that way, you might
ease the task of the next programmer to touch the code. Providing a multidimensional array type with a
proper subscripting operator saves most users from having to worry about the layout of the data in
the array (§22.4.6).
The standard vector (§16.3) doesn’t suffer from these problems.
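As an illustration of such encapsulation, here is a minimal sketch of a two-dimensional matrix type. The class name Matrix2D, its members, and the helper sum() are invented for this example; the sketch stores its elements in a vector and omits range checking. The index computation i*d2+j is the same one used in print_mij(), but it now appears in exactly one place.

```cpp
#include <cstddef>
#include <vector>

// A dim1-by-dim2 matrix of ints stored row by row in one contiguous sequence.
class Matrix2D {
public:
    Matrix2D(std::size_t dim1, std::size_t dim2)
        : d1(dim1), d2(dim2), elem(dim1*dim2) { }

    // Subscripting hides the layout: element (i,j) lives at position i*d2+j.
    int& operator()(std::size_t i, std::size_t j)       { return elem[i*d2 + j]; }
    int  operator()(std::size_t i, std::size_t j) const { return elem[i*d2 + j]; }

    std::size_t rows() const { return d1; }
    std::size_t cols() const { return d2; }
private:
    std::size_t d1, d2;
    std::vector<int> elem;
};

// Users need neither the dimensions as extra arguments nor the layout.
int sum(const Matrix2D& m)
{
    int s = 0;
    for (std::size_t i = 0; i < m.rows(); ++i)
        for (std::size_t j = 0; j < m.cols(); ++j) s += m(i, j);
    return s;
}
```

A function taking a Matrix2D by reference works for any dimensions, which is exactly what the built-in array argument could not offer.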
C.8 Saving Space
When programming nontrivial applications, there often comes a time when you want more memory
space than is available or affordable. There are two ways of squeezing more space out of what is
available:
[1] Put more than one small object into a byte.
[2] Use the same space to hold different objects at different times.
The former can be achieved by using fields, and the latter by using unions. These constructs are
described in the following sections. Many uses of fields and unions are pure optimizations, and
these optimizations are often based on nonportable assumptions about memory layouts. Consequently, the programmer should think twice before using them. Often, a better approach is to
change the way data is managed, for example, to rely more on dynamically allocated store (§6.2.6)
and less on preallocated (static) storage.
C.8.1 Fields
It seems extravagant to use a whole byte (a char or a bool) to represent a binary variable – for
example, an on/off switch – but a char is the smallest object that can be independently allocated
and addressed in C++ (§5.1). It is possible, however, to bundle several such tiny variables together
as fields in a struct. A member is defined to be a field by specifying the number of bits it is to
occupy. Unnamed fields are allowed. They do not affect the meaning of the named fields, but they
can be used to make the layout better in some machine-dependent way:
    struct PPN {                  // R6000 Physical Page Number
        unsigned int PFN : 22;    // Page Frame Number
        int : 3;                  // unused
        unsigned int CCA : 3;     // Cache Coherency Algorithm
        bool nonreachable : 1;
        bool dirty : 1;
        bool valid : 1;
        bool global : 1;
    };
This example also illustrates the other main use of fields: to name parts of an externally imposed
layout. A field must be of an integral or enumeration type (§4.1.1). It is not possible to take the
address of a field. Apart from that, however, it can be used exactly like other variables. Note that a
bool field really can be represented by a single bit. In an operating system kernel or in a debugger,
the type PPN might be used like this:
    void part_of_VM_system(PPN* p)
    {
        // ...
        if (p->dirty) {    // contents changed
            // copy to disc
            p->dirty = 0;
        }
        // ...
    }
Surprisingly, using fields to pack several variables into a single byte does not necessarily save
space. It saves data space, but the size of the code needed to manipulate these variables increases
on most machines. Programs have been known to shrink significantly when binary variables were
converted from bit fields to characters! Furthermore, it is typically much faster to access a char or
an int than to access a field. Fields are simply a convenient shorthand for using bitwise logical
operators (§6.2.4) to extract information from and insert information into part of a word.
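To make the shorthand concrete, here is roughly what the equivalent hand-written code looks like. This is a sketch only: the masks, bit positions, and function names are assumptions made for illustration and do not correspond to the layout a particular compiler would choose for PPN.

```cpp
// Hypothetical one-bit flags within an unsigned word.
const unsigned int dirty_bit = 1u << 0;
const unsigned int valid_bit = 1u << 1;

inline bool is_set(unsigned int word, unsigned int mask)
{
    return (word & mask) != 0;
}

inline unsigned int set_bits(unsigned int word, unsigned int mask)
{
    return word | mask;
}

inline unsigned int clear_bits(unsigned int word, unsigned int mask)
{
    return word & ~mask;
}

// Extract the n-bit value stored at bit position pos
// (n must be less than the number of bits in an unsigned int).
inline unsigned int extract_field(unsigned int word, unsigned int pos, unsigned int n)
{
    return (word >> pos) & ((1u << n) - 1);
}
```

Writing p->dirty = 0 for a field asks the compiler to generate something like clear_bits() on the word that holds the field.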
C.8.2 Unions
A union is a struct in which all members are allocated at the same address so that the union
occupies only as much space as its largest member. Naturally, a union can hold a value for only one
member at a time. For example, consider a symbol table entry that holds a name and a value:
    enum Type { S, I };

    struct Entry {
        char* name;
        Type t;
        char* s;    // use s if t==S
        int i;      // use i if t==I
    };

    void f(Entry* p)
    {
        if (p->t == S) cout << p->s;
        // ...
    }
The members s and i can never be used at the same time, so space is wasted. It can be easily
recovered by specifying that both should be members of a union, like this:
    union Value {
        char* s;
        int i;
    };
The language doesn’t keep track of which kind of value is held by a union, so the programmer must
still do that:
    struct Entry {
        char* name;
        Type t;
        Value v;    // use v.s if t==S; use v.i if t==I
    };

    void f(Entry* p)
    {
        if (p->t == S) cout << p->v.s;
        // ...
    }
Unfortunately, the introduction of the union forced us to rewrite code to say v.s instead of plain s.
This can be avoided by using an anonymous union, which is a union that doesn’t have a name and
consequently doesn’t define a type. Instead, it simply ensures that its members are allocated at the
same address:
    struct Entry {
        char* name;
        Type t;
        union {
            char* s;    // use s if t==S
            int i;      // use i if t==I
        };
    };

    void f(Entry* p)
    {
        if (p->t == S) cout << p->s;
        // ...
    }
This leaves all code using an Entry unchanged.
Using a union so that its value is always read using the member through which it was written is
a pure optimization. However, it is not always easy to ensure that a union is used in this way only,
and subtle errors can be introduced through misuse. To avoid errors, one can encapsulate a union
so that the correspondence between a type field and access to the union members can be guaranteed
(§10.6[20]).
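One possible shape for such an encapsulation is sketched below. It is not a solution to the exercise referred to above, merely an illustration; the class name Checked_value and its members are invented, and the choice of logic_error as the exception is an assumption of mine.

```cpp
#include <stdexcept>

// A union of a C-style string pointer and an int, guarded by a type field.
class Checked_value {
public:
    enum Type { S, I };

    explicit Checked_value(const char* str) : t(S) { v.s = str; }
    explicit Checked_value(int n)           : t(I) { v.i = n; }

    Type type() const { return t; }

    // Each accessor checks the type field before touching the union.
    const char* str() const
    {
        if (t != S) throw std::logic_error("Checked_value does not hold a string");
        return v.s;
    }
    int integer() const
    {
        if (t != I) throw std::logic_error("Checked_value does not hold an int");
        return v.i;
    }
private:
    Type t;
    union { const char* s; int i; } v;    // accessible through the checks only
};
```

Because the union is private and every read goes through a check, a value can no longer be read through a member other than the one through which it was written.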
Unions are sometimes misused for ‘‘type conversion.’’ This misuse is practiced mainly by
programmers trained in languages that do not have explicit type conversion facilities, where cheating is
necessary. For example, the following ‘‘converts’’ an int to an int* simply by assuming bitwise
equivalence:
    union Fudge {
        int i;
        int* p;
    };

    int* cheat(int i)
    {
        Fudge a;
        a.i = i;
        return a.p;    // bad use
    }
This is not really a conversion at all. On some machines, an int and an int* do not occupy the
same amount of space, while on others, no integer can have an odd address. Such use of a union is
dangerous and nonportable, and there is an explicit and portable way of specifying type conversion
(§6.2.7).
Unions are occasionally used deliberately to avoid type conversion. One might, for example,
use a Fudge to find the representation of the pointer 0:
    int main()
    {
        Fudge foo;
        foo.p = 0;
        cout << "the integer value of the pointer 0 is " << foo.i << '\n';
    }
C.8.3 Unions and Classes
Many nontrivial unions have some members that are much larger than the most frequently-used
members. Because the size of a union is at least as large as its largest member, space is wasted.
This waste can often be eliminated by using a set of derived classes instead of a union.
A class with a constructor, destructor, or copy operation cannot be the type of a union member
(§10.4.12) because the compiler would not know which member to destroy.
C.9 Memory Management
There are three fundamental ways of using memory in C++:
Static memory, in which an object is allocated by the linker for the duration of the program.
    Global and namespace variables, static class members (§10.2.4), and static variables in
    functions (§7.1.2) are allocated in static memory. An object allocated in static memory is
    constructed once and persists to the end of the program. It always has the same address.
    Static objects can be a problem in programs using threads (shared-address space
    concurrency) because they are shared and require locking for proper access.
Automatic memory, in which function arguments and local variables are allocated. Each entry
    into a function or a block gets its own copy. This kind of memory is automatically created
    and destroyed; hence the name automatic memory. Automatic memory is also said ‘‘to be
    on the stack.’’ If you absolutely must be explicit about this, C++ provides the redundant
    keyword auto.
Free store, from which memory for objects is explicitly requested by the program and where a
    program can free memory again once it is done with it (using new and delete). When a
    program needs more free store, new requests it from the operating system. Typically, the free
    store (also called dynamic memory or the heap) grows throughout the lifetime of a program
    because no memory is ever returned to the operating system for use by other programs.
As far as the programmer is concerned, automatic and static storage are used in simple, obvious,
and implicit ways. The interesting question is how to manage the free store. Allocation (using
new) is simple, but unless we have a consistent policy for giving memory back to the free store
manager, memory will fill up – especially for long-running programs.
The simplest strategy is to use automatic objects to manage corresponding objects in free store.
Consequently, many containers are implemented as handles to elements stored in the free store
(§25.7). For example, an automatic String (§11.12) manages a sequence of characters on the free
store and automatically frees that memory when it itself goes out of scope. All of the standard containers (§16.3, Chapter 17, Chapter 20, §22.4) can be conveniently implemented in this way.
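The handle idea can be shown in a few lines. This is a deliberately small sketch, not the String of §11.12 or a standard container: Int_buffer and sum_of_squares() are hypothetical names, and copying is simply disabled to keep the example short.

```cpp
#include <cstddef>

// An automatic object that owns an array of ints on the free store.
class Int_buffer {
public:
    explicit Int_buffer(std::size_t n) : sz(n), p(new int[n]()) { }    // acquire
    ~Int_buffer() { delete[] p; }                                       // release

    int& operator[](std::size_t i) { return p[i]; }
    std::size_t size() const { return sz; }
private:
    Int_buffer(const Int_buffer&);               // not defined: copying is
    Int_buffer& operator=(const Int_buffer&);    // not supported in this sketch
    std::size_t sz;
    int* p;
};

int sum_of_squares(std::size_t n)
{
    Int_buffer buf(n);    // the elements live on the free store
    for (std::size_t i = 0; i < buf.size(); ++i) buf[i] = static_cast<int>(i*i);
    int s = 0;
    for (std::size_t i = 0; i < buf.size(); ++i) s += buf[i];
    return s;             // buf goes out of scope here and its destructor frees the elements
}
```

The user of sum_of_squares() never writes delete; the memory is returned on every path out of the function, including one taken because of an exception.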
C.9.1 Automatic Garbage Collection
When this regular approach isn’t sufficient, the programmer might use a memory manager that
finds unreferenced objects and reclaims their memory in which to store new objects. This is usually called automatic garbage collection, or simply garbage collection. Naturally, such a memory
manager is called a garbage collector.
The fundamental idea of garbage collection is that an object that is no longer referred to in a
program will not be accessed again, so its memory can be safely reused for some new object. For
example:
    void f()
    {
        int* p = new int;
        p = 0;
        char* q = new char;
    }
Here, the assignment p=0 makes the int unreferenced so that its memory can be used for some
other new object. Thus, the char might be allocated in the same memory as the int so that q holds
the value that p originally had.
The standard does not require that an implementation supply a garbage collector, but garbage
collectors are increasingly used for C++ in areas where their costs compare favorably to those of
manual management of free store. When comparing costs, consider the run time, memory usage,
reliability, portability, monetary cost of programming, monetary cost of a garbage collector, and
predictability of performance.
C.9.1.1 Disguised Pointers
What should it mean for an object to be unreferenced? Consider:
    void f()
    {
        int* p = new int;
        long i1 = reinterpret_cast<long>(p)&0xFFFF0000;
        long i2 = reinterpret_cast<long>(p)&0x0000FFFF;
        p = 0;
        // point #1: no pointer to the int exists here
        p = reinterpret_cast<int*>(i1|i2);
        // now the int is referenced again
    }
Often, pointers stored as non-pointers in a program are called ‘‘disguised pointers.’’ In particular,
the pointer originally held in p is disguised in the integers i1 and i2. However, a garbage collector
need not be concerned about disguised pointers. If the garbage collector runs at point #1, the
memory holding the int can be reclaimed. In fact, such programs are not guaranteed to work even if a
garbage collector is not used because the use of reinterpret_cast to convert between integers and
pointers is at best implementation-defined.
A union that can hold both pointers and non-pointers presents a garbage collector with a special
problem. In general, it is not possible to know whether such a union contains a pointer. Consider:
    union U {    // union with both pointer and non-pointer members
        int* p;
        int i;
    };

    void f(U u, U u2, U u3)
    {
        u.p = new int;
        u2.i = 999999;
        u.i = 8;
        // ...
    }
The safe assumption is that any value that appears in such a union is a pointer value. A clever
garbage collector can do somewhat better. For example, it may notice that (for a given
implementation) ints are not allocated with odd addresses and that no objects are allocated with an
address as low as 8. Noticing this will save the garbage collector from having to assume that objects
containing locations 999999 and 8 are used by f().
C.9.1.2 Delete
If an implementation automatically collects garbage, the delete and delete[] operators are no
longer needed to free memory for potential reuse. Thus, a user relying on a garbage collector could
simply refrain from using these operators. However, in addition to freeing memory, delete and
delete[] invoke destructors.
In the presence of a garbage collector,
    delete p;
invokes the destructor for the object pointed to by p (if any). However, reuse of the memory can be
postponed until it is collected. Recycling lots of objects at once can help limit fragmentation
(§C.9.1.4). It also renders harmless the otherwise serious mistake of deleting an object twice in the
important case where the destructor simply deletes memory.
As always, access to an object after it has been deleted is undefined.
C.9.1.3 Destructors
When an object is about to be recycled by a garbage collector, two alternatives exist:
[1] Call the destructor (if any) for the object.
[2] Treat the object as raw memory (don’t call its destructor).
By default, a garbage collector should choose option [2] because objects created using new and
never deleted are never destroyed. Thus, one can see a garbage collector as a mechanism for simulating an infinite memory.
It is possible to design a garbage collector to invoke the destructors for objects that have been
specifically ‘‘registered’’ with the collector. However, there is no standard way of ‘‘registering’’
objects. Note that it is always important to destroy objects in an order that ensures that the
destructor for one object doesn’t refer to an object that has been previously destroyed. Such ordering isn’t easily achieved by a garbage collector without help from the programmer.
C.9.1.4 Memory Fragmentation
When a lot of objects of varying sizes are allocated and freed, the memory fragments. That is,
much of memory is consumed by pieces of memory that are too small to use effectively. The reason is that a general allocator cannot always find a piece of memory of the exact right size for an
object. Using a slightly larger piece means that a smaller fragment of memory remains. After running a program for a while with a naive allocator, it is not uncommon to find half the available
memory taken up with fragments too small ever to get reused.
Several techniques exist for coping with fragmentation. The simplest is to request only larger
chunks of memory from the allocator and use each such chunk for objects of the same size (§15.3,
§19.4.2). Because most allocations and deallocations are of small objects of types such as tree
nodes, links, etc., this technique can be very effective. An allocator can sometimes apply similar
techniques automatically. In either case, fragmentation is further reduced if all of the larger
‘‘chunks’’ are of the same size (say, the size of a page) so that they themselves can be allocated and
reallocated without fragmentation.
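The ‘‘same-size chunk’’ technique can be sketched roughly as follows. This is only an illustration under stated assumptions, not the allocator of §15.3 or §19.4.2: the names Pool, alloc(), free_elem(), and chunk_count() and the chunk size of 1024 bytes are invented for the example, and the sketch is suitable only for element types whose alignment does not exceed that of a pointer.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-size pool: serves elements of one size out of larger chunks.
class Pool {
    struct Link { Link* next; };
    enum { chunk_bytes = 1024 };     // assumed chunk size; every chunk has this size
    std::size_t esize;               // element size, rounded up to a multiple of sizeof(Link)
    std::vector<char*> chunks;       // chunks obtained from the general allocator
    Link* head;                      // free list of unused elements

    Pool(const Pool&);               // copying is not supported
    Pool& operator=(const Pool&);

    static std::size_t round_up(std::size_t sz)
    {
        const std::size_t a = sizeof(Link);
        return sz == 0 ? a : (sz + a - 1) / a * a;
    }

    void grow()                      // get one more chunk and thread it onto the free list
    {
        char* c = new char[chunk_bytes];
        chunks.push_back(c);
        const std::size_t n = chunk_bytes / esize;
        for (std::size_t i = 0; i < n; ++i) {
            Link* l = reinterpret_cast<Link*>(c + i * esize);
            l->next = head;
            head = l;
        }
    }
public:
    explicit Pool(std::size_t sz) : esize(round_up(sz)), head(0) { }
    ~Pool() { for (std::size_t i = 0; i < chunks.size(); ++i) delete[] chunks[i]; }

    void* alloc()                    // take one element off the free list
    {
        if (head == 0) grow();
        if (head == 0) throw std::bad_alloc();   // element does not fit in a chunk
        Link* p = head;
        head = p->next;
        return p;
    }

    void free_elem(void* b)          // put an element back on the free list
    {
        Link* p = static_cast<Link*>(b);
        p->next = head;
        head = p;
    }

    std::size_t chunk_count() const { return chunks.size(); }
};
```

Because every element has the same size, a freed element can always be reused for the next request, so the pool itself never fragments; only whole chunks are requested from the general allocator.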
There are two main styles of garbage collectors:
[1] A copying collector moves objects in memory to compact fragmented space.
[2] A conservative collector allocates objects to minimize fragmentation.
From a C++ point of view, conservative collectors are preferable because it is very hard (probably
impossible in real programs) to move an object and modify all pointers to it correctly. A conservative collector also allows C++ code fragments to coexist with code written in languages such as C.
Traditionally, copying collectors have been favored by people using languages (such as Lisp and
Smalltalk) that deal with objects only indirectly through unique pointers or references. However,
modern conservative collectors seem to be at least as efficient as copying collectors for larger programs, in which the amount of copying and the interaction between the allocator and a paging system become important. For smaller programs, the ideal of simply never invoking the collector is
often achievable – especially in C++, where many objects are naturally automatic.
C.10 Namespaces
This section presents minor points about namespaces that look like technicalities, yet frequently
surface in discussions and in real code.
C.10.1 Convenience vs. Safety
A using-declaration adds a name to a local scope. A using-directive does not; it simply renders
names accessible in the scope in which they were declared. For example:
namespace X {
    int i, j, k;
}

int k;

void f1()
{
    int i = 0;
    using namespace X;    // make names from X accessible
    i++;                  // local i
    j++;                  // X::j
    k++;                  // error: X::k or global k ?
    ::k++;                // the global k
    X::k++;               // X’s k
}

void f2()
{
    int i = 0;
    using X::i;    // error: i declared twice in f2()
    using X::j;
    using X::k;    // hides global k

    i++;
    j++;           // X::j
    k++;           // X::k
}
A locally declared name (declared either by an ordinary declaration or by a using-declaration)
hides nonlocal declarations of the same name, and any illegal overloadings of the name are detected
at the point of declaration.
Note the ambiguity error for k++ in f1(). Global names are not given preference over names
from namespaces made accessible in the global scope. This provides significant protection against
accidental name clashes, and – importantly – ensures that there are no advantages to be gained
from polluting the global namespace.
When libraries declaring many names are made accessible through using-directives, it is a significant advantage that clashes of unused names are not considered errors.
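A small sketch may make this concrete. The namespaces Lib_a and Lib_b and their members are hypothetical, standing in for two independently written libraries that happen to declare the same name:

```cpp
namespace Lib_a {                // hypothetical library
    int count = 1;
    int first() { return 10; }
}

namespace Lib_b {                // hypothetical library declaring a clashing name
    int count = 2;
    int second() { return 20; }
}

using namespace Lib_a;
using namespace Lib_b;           // ok: the clash between the two counts is not an error here

int use_unambiguous()
{
    return first() + second();   // fine: neither name clashes
}

int use_qualified()
{
    // return count;             // would be an error: Lib_a::count or Lib_b::count ?
    return Lib_a::count + Lib_b::count;   // explicit qualification resolves the clash
}
```

The two using-directives compile even though both namespaces declare count; an error would arise only at a point where count is used without qualification.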
The global scope is just another namespace. The global namespace is odd only in that you
don’t have to mention its name in an explicit qualification. That is, ::k means ‘‘look for k in the
global namespace and in namespaces mentioned in using-directives in the global namespace,’’
whereas X::k means ‘‘the k declared in namespace X and namespaces mentioned in using-directives in X’’ (§8.2.8).
I hope to see a radical decrease in the use of global names in new programs using namespaces
compared to traditional C and C++ programs. The rules for namespaces were specifically crafted to
give no advantages to a ‘‘lazy’’ user of global names over someone who takes care not to pollute
the global scope.
C.10.2 Nesting of Namespaces
One obvious use of namespaces is to wrap a complete set of declarations and definitions in a separate namespace:
namespace X {
    // all my declarations
}
The list of declarations will, in general, contain namespaces. Thus, nested namespaces are allowed.
This is allowed for practical reasons, as well as for the simple reason that constructs ought to nest
unless there is a strong reason for them not to. For example:
void h();

namespace X {
    void g();
    // ...
    namespace Y {
        void f();
        void ff();
        // ...
    }
}
The usual scope and qualification rules apply:
void X::Y::ff()
{
    f(); g(); h();
}

void X::g()
{
    f();       // error: no f() in X
    Y::f();    // ok
}

void h()
{
    f();          // error: no global f()
    Y::f();       // error: no global Y
    X::f();       // error: no f() in X
    X::Y::f();    // ok
}
C.10.3 Namespaces and Classes
A namespace is a named scope. A class is a type defined by a named scope that describes how
objects of that type can be created and used. Thus, a namespace is a simpler concept than a class
and ideally a class would be defined as a namespace with a few extra facilities included. This is
almost the case. A namespace is open (§8.2.9.3), but a class is closed. This difference stems from
the observation that a class needs to define the layout of an object and that is best done in one place.
Furthermore, using-declarations and using-directives can be applied to classes only in a very
restricted way (§15.2.2).
Namespaces are preferred over classes when all that is needed is encapsulation of names. In
this case, the class apparatus for type checking and for creating objects is not needed; the simpler
namespace concept suffices.
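The difference between an open namespace and a closed class can be illustrated by a short sketch. The names Shapes and Rect are hypothetical and chosen only for the example:

```cpp
namespace Shapes {               // hypothetical namespace, first part
    int area(int w, int h) { return w * h; }
}

namespace Shapes {               // ok: a namespace is open, so it can be reopened
    int perimeter(int w, int h) { return 2 * (w + h); }
}

struct Rect {                    // a class is closed: all members are declared here
    int w, h;
    int area() const { return Shapes::area(w, h); }
};

// struct Rect { int perimeter() const; };   // would be an error: Rect is already defined
```

Here Shapes is used purely to encapsulate names, whereas Rect must fix its layout (w and h) in a single place.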
C.11 Access Control
This section presents a few technical examples illustrating access control to supplement those presented in §15.3.
C.11.1 Access to Members
Consider:
class X {
    // private by default:
    int priv;
protected:
    int prot;
public:
    int publ;
    void m();
};
The member X::m() has unrestricted access:
void X::m()
{
    priv = 1;    // ok
    prot = 2;    // ok
    publ = 3;    // ok
}
A member of a derived class has access to public and protected members (§15.3):
class Y : public X {
    void mderived();
};

void Y::mderived()
{
    priv = 1;    // error: priv is private
    prot = 2;    // ok: prot is protected and mderived() is a member of the derived class Y
    publ = 3;    // ok: publ is public
}
A global function can access only the public members:
void f(Y* p)
{
    p->priv = 1;    // error: priv is private
    p->prot = 2;    // error: prot is protected and f() is not a friend or a member of X or Y
    p->publ = 3;    // ok: publ is public
}
C.11.2 Access to Base Classes
Like a member, a base class can be declared private, protected, or public. Consider:
class X {
public:
    int a;
    // ...
};

class Y1 : public X { };
class Y2 : protected X { };
class Y3 : private X { };
Because X is a public base of Y1, any function can (implicitly) convert a Y1* to an X* where
needed just as it can access the public members of class X. For example:
void f(Y1* py1, Y2* py2, Y3* py3)
{
    X* px = py1;    // ok: X is a public base class of Y1
    py1->a = 7;     // ok
    px = py2;       // error: X is a protected base of Y2
    py2->a = 7;     // error
    px = py3;       // error: X is a private base of Y3
    py3->a = 7;     // error
}
Consider:
class Y2 : protected X { };
class Z2 : public Y2 { void f(Y1*, Y2*, Y3*); };
Because X is a protected base of Y2, only members and friends of Y2 and members and friends of
Y2’s derived classes (e.g., Z2) can (implicitly) convert a Y2* to an X* where needed, just as they
can access the public and protected members of class X. For example:
void Z2::f(Y1* py1, Y2* py2, Y3* py3)
{
    X* px = py1;    // ok: X is a public base class of Y1
    py1->a = 7;     // ok
    px = py2;       // ok: X is a protected base of Y2, and Z2 is derived from Y2
    py2->a = 7;     // ok
    px = py3;       // error: X is a private base of Y3
    py3->a = 7;     // error
}
Consider finally:
class Y3 : private X { void f(Y1*, Y2*, Y3*); };
Because X is a private base of Y3, only members and friends of Y3 can (implicitly) convert a Y3* to
an X* where needed, just as they can access the public and protected members of class X. For
example:
void Y3::f(Y1* py1, Y2* py2, Y3* py3)
{
    X* px = py1;    // ok: X is a public base class of Y1
    py1->a = 7;     // ok
    px = py2;       // error: X is a protected base of Y2
    py2->a = 7;     // error
    px = py3;       // ok: X is a private base of Y3, and Y3::f() is a member of Y3
    py3->a = 7;     // ok
}
C.11.3 Access to Member Class
The members of a member class have no special access to members of an enclosing class. Similarly members of an enclosing class have no special access to members of a nested class; the usual
access rules (§10.2.2) shall be obeyed. For example:
class Outer {
    typedef int T;
    int i;
public:
    int i2;
    static int s;

    class Inner {
        int x;
        T y;    // error: Outer::T is private
    public:
        void f(Outer* p, int v);
    };

    int g(Inner* p);
};

void Outer::Inner::f(Outer* p, int v)
{
    p->i = v;     // error: Outer::i is private
    p->i2 = v;    // ok: Outer::i2 is public
}

int Outer::g(Inner* p)
{
    p->f(this,2);    // ok: Inner::f() is public
    return p->x;     // error: Inner::x is private
}
However, it is often useful to grant a member class access to its enclosing class. This can be done
by making the member a friend. For example:
class Outer {
    typedef int T;
    int i;
public:
    class Inner;           // forward declaration of member class
    friend class Inner;    // grant access to Outer::Inner

    class Inner {
        int x;
        T y;    // ok: Inner is a friend
    public:
        void f(Outer* p, int v);
    };
};

void Outer::Inner::f(Outer* p, int v)
{
    p->i = v;    // ok: Inner is a friend
}
C.11.4 Friendship
Friendship is neither inherited nor transitive. For example:
class A {
    friend class B;
    int a;
};

class B {
    friend class C;
};

class C {
    void f(A* p)
    {
        p->a++;    // error: C is not a friend of A, despite being a friend of a friend of A
    }
};

class D : public B {
    void f(A* p)
    {
        p->a++;    // error: D is not a friend of A, despite being derived from a friend of A
    }
};
C.12 Pointers to Data Members
Naturally, the notion of pointer to member (§15.5) applies to data members and to member functions with arguments and return types. For example:
struct C {
    const char* val;    // const: val is assigned string literals below
    int i;
    void print(int x) { cout << val << x << '\n'; }
    int f1(int);
    void f2();
    C(const char* v) { val = v; }
};

typedef void (C::*PMFI)(int);    // pointer to member function of C taking an int
typedef const char* C::*PM;      // pointer to const char* data member of C

void f(C& z1, C& z2)
{
    C* p = &z2;
    PMFI pf = &C::print;
    PM pm = &C::val;

    z1.print(1);
    (z1.*pf)(2);
    z1.*pm = "nv1 ";
    p->*pm = "nv2 ";
    z2.print(3);
    (p->*pf)(4);

    pf = &C::f1;    // error: return type mismatch
    pf = &C::f2;    // error: argument type mismatch
    pm = &C::i;     // error: type mismatch
    pm = pf;        // error: type mismatch
}
The type of a pointer to function is checked just like any other type.
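As a self-contained sketch of the same ideas, the following uses a hypothetical class Account; the names Op, Field, apply_op(), and read_field() are invented for the illustration:

```cpp
#include <string>

struct Account {                       // hypothetical class for illustration
    std::string owner;
    int balance;
    Account(const std::string& o, int b) : owner(o), balance(b) { }
    void deposit(int x) { balance += x; }
    void withdraw(int x) { balance -= x; }
};

typedef void (Account::*Op)(int);      // pointer to member function of Account taking an int
typedef int Account::*Field;           // pointer to int data member of Account

int apply_op(Account& a, Op op, int amount, Field f)
{
    (a.*op)(amount);                   // call the member function op refers to, for object a
    return a.*f;                       // read the data member f refers to, in object a
}

int read_field(const Account* p, Field f)
{
    return p->*f;                      // the same access through a pointer to the object
}
```

A call such as apply_op(a,&Account::deposit,50,&Account::balance) selects both the operation and the field at run time, yet an attempt to pass &Account::owner as a Field would be rejected by the compiler because the types differ.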
C.13 Templates
A class template specifies how a class can be generated given a suitable set of template arguments.
Similarly, a function template specifies how a function can be generated given a suitable set of template arguments. Thus, a template can be used to generate types and executable code. With this
expressive power comes some complexity. Most of this complexity relates to the variety of contexts involved in the definition and use of templates.
C.13.1 Static Members
A class template can have static members. Each class generated from the template has its own
copy of the static members. Static members must be separately defined and can be specialized. For
example:
template<class T> class X {
    // ...
    static T def_val;
    static T* new_X(T a = def_val);
};

template<class T> T X<T>::def_val(0,0);
template<class T> T* X<T>::new_X(T a) { /* ... */ }

template<> int X<int>::def_val = 0;
template<> int* X<int>::new_X(int i) { /* ... */ }
If you want to share an object or function among all members of every class generated from a template, you can place it in a non-templatized base class. For example:
struct B {
    static B* nil;    // to be used as common null pointer for every class derived from B
};

template<class T> class X : public B {
    // ...
};

B* B::nil = 0;
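The two kinds of static member can be contrasted in a small sketch. The names Counted_base and Counter are hypothetical; the example merely counts constructed objects, once per generated class and once overall:

```cpp
struct Counted_base {                  // hypothetical non-template base
    static int total;                  // one object shared by every Counter<T>
};
int Counted_base::total = 0;

template<class T> class Counter : public Counted_base {
public:
    static int count;                  // one copy for each class generated from Counter
    Counter() { ++count; ++total; }
};

template<class T> int Counter<T>::count = 0;    // general definition
template<> int Counter<char>::count = 100;      // specialization for Counter<char>
```

After creating two Counter<int> objects, one Counter<double>, and one Counter<char>, the per-class counts are 2, 1, and 101, respectively, whereas the shared Counted_base::total is 4.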
C.13.2 Friends
Like other classes, a template class can have friends. For example, comparison operators are typically friends, so we can rewrite class Basic_ops from §13.6 like this:
template <class C> class Basic_ops {    // basic operators on containers
    friend bool operator==(const C&, const C&);    // compare elements
    friend bool operator!=(const C&, const C&);
    // ...
};

template<class T> class Math_container : public Basic_ops< Math_container<T> > {
    // ...
};
Like a member, a friend declared within a template is itself a template and is defined using the template parameters of its class. For example:
template <class C> bool operator==(const C& a, const C& b)
{
    if (a.size() != b.size()) return false;
    for (int i = 0; i<a.size(); ++i)
        if (a[i] != b[i]) return false;
    return true;
}
Friends do not affect the scope in which the template class is defined, nor do they affect the scope
in which the template is used. Instead, friend functions and operators are found using a lookup
based on their argument types (§11.2.4, §11.5.1). Like a member function, a friend function is
instantiated (§C.13.9.1) only if it is called.
C.13.3 Templates as Template Parameters
Sometimes it is useful to pass templates – rather than classes or objects – as template arguments.
For example:
template<class T, template<class> class C> class Xrefd {
    C<T> mems;
    C<T*> refs;
    // ...
};

Xrefd<Entry,vector> x1;    // store cross references for Entries in a vector
Xrefd<Record,set> x2;      // store cross references for Records in a set
To use a template as a template parameter, you specify its required arguments. The template
parameters of the template parameter need to be known in order to use the template parameter. The
point of using a template as a template parameter is usually that we want to instantiate it with a
variety of argument types (such as T and T* in the previous example). That is, we want to express
the member declarations of a template in terms of another template, but we want that other template
to be a parameter so that it can be specified by users.
The common case in which a template needs a container to hold elements of its own argument
type is often better handled by passing the container type (§13.6, §17.3.1).
Only class templates can be template arguments.
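A compilable sketch of the Xrefd idea follows. It assumes a hypothetical one-parameter container, Bag, because the standard containers take additional (defaulted) template parameters and may therefore fail to match template<class> class C on some implementations. The member functions of Xrefd are likewise invented for the example:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container with exactly one template parameter.
template<class T> class Bag {
    std::vector<T> v;
public:
    void add(const T& x) { v.push_back(x); }
    std::size_t size() const { return v.size(); }
    const T& operator[](std::size_t i) const { return v[i]; }
};

template<class T, template<class> class C> class Xrefd {
    C<T> mems;       // the elements themselves
    C<T*> refs;      // cross references to elements held elsewhere
public:
    void add(const T& x) { mems.add(x); }
    void refer_to(T* p) { refs.add(p); }
    std::size_t member_count() const { return mems.size(); }
    std::size_t ref_count() const { return refs.size(); }
    const T& referenced(std::size_t i) const { return *refs[i]; }
};
```

A declaration such as Xrefd<int,Bag> x; then instantiates Bag twice, as Bag<int> and as Bag<int*>, which could not be expressed by passing a single container type.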
C.13.4 Deducing Function Template Arguments
A compiler can deduce a type template argument, T or TT, and a non-type template argument, I,
from a template function argument with a type composed of the following constructs:
T                          const T                    volatile T
T*                         T&                         T[constant_expression]
type[I]                    class_template_name<T>     class_template_name<I>
TT<T>                      T<I>                       T<>
T type::*                  T T::*                     type T::*
T (*)(args)                type (T::*)(args)          T (type::*)(args)
type (type::*)(args_TI)    T (T::*)(args_TI)          type (T::*)(args_TI)
T (type::*)(args_TI)       type (*)(args_TI)
Here, args_TI is a parameter list from which a T or an I can be determined by recursive application
of these rules and args is a parameter list that does not allow deduction. If not all parameters can
be deduced in this way, a call is ambiguous. For example:
template<class T, class U> void f(const T*, U(*)(U));

int g(int);

void h(const char* p)
{
    f(p,g);    // T is char, U is int
    f(p,h);    // error: can’t deduce U
}
Looking at the arguments of the first call of f(), we easily deduce the template arguments. Looking at the second call of f(), we see that h() doesn’t match the pattern U(*)(U) because h()’s
argument and return types differ.
If a template parameter can be deduced from more than one function argument, the same type
must be the result of each deduction. Otherwise, the call is an error. For example:
template<class T> void f(T i, T* p);

void g(int i)
{
    f(i,&i);             // ok
    f(i,"Remember!");    // error, ambiguous: T is int or T is char?
}
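To observe what is deduced, one can let the function template report its own template arguments. In the following sketch, Type_name, deduce_pair(), deduce_same(), and echo() are hypothetical helpers introduced only for the illustration; they mirror the shapes of the two f() templates above:

```cpp
#include <string>

template<class T> struct Type_name;    // hypothetical helper: maps a type to its name
template<> struct Type_name<char> { static const char* get() { return "char"; } };
template<> struct Type_name<int>  { static const char* get() { return "int"; } };

// Same parameter pattern as f(const T*, U(*)(U)) in the text.
template<class T, class U> std::string deduce_pair(const T*, U(*)(U))
{
    return std::string(Type_name<T>::get()) + "," + Type_name<U>::get();
}

// Same parameter pattern as f(T i, T* p) in the text.
template<class T> std::string deduce_same(T, T*)
{
    return Type_name<T>::get();
}

int echo(int x) { return x; }          // matches the pattern U(*)(U) with U being int

std::string show_deduction(const char* p)
{
    int i = 0;
    // deduce_same(i, p);              // would be an error: T is int or T is char?
    return deduce_pair(p, echo) + ";" + deduce_same(i, &i);
}
```

For a call deduce_pair(p,echo) with p of type const char*, T is deduced from the first argument and U from the second, independently of each other; for deduce_same(), both arguments must agree on T.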
C.13.5 Typename and Template
To make generic programming easier and more general, the standard library containers provide a
set of standard functions and types (§16.3.1). For example:
template<class T> class vector {
public:
    typedef T value_type;
    typedef T* iterator;

    iterator begin();
    iterator end();
    // ...
};

template<class T> class list {
    class link {
        // ...
    };
public:
    typedef T value_type;
    typedef link* iterator;

    iterator begin();
    iterator end();
    // ...
};
This allows us to write:
void f1(vector<T>& v)
{
    vector<T>::iterator i = v.begin();
    // ...
}

void f2(list<T>& v)
{
    list<T>::iterator i = v.begin();
    // ...
}
However, this does not allow us to write:
template<class C> void f4(C& v)
{
    C::iterator i = v.begin();    // error
    // ...
}
Unfortunately, the compiler isn’t required to be psychic, so it doesn’t know that C::iterator is the
name of a type. In the previous example, the compiler could look at the declaration of vector<> to
determine that the iterator in vector<T>::iterator was a type. That is not possible when the qualifier is a type parameter. Naturally, a compiler could postpone all checking until instantiation time
where all information is available and could then accept such examples. However, that would be a
nonstandard language extension.
Consider an example stripped of clues as to its meaning:
template<class T> void f5(T& v)
{
    T::x(y);    // error?
}
Is T::x a function called with a nonlocal variable y as its argument? Or, are we declaring a variable y with the type T::x perversely using redundant parentheses? We could imagine a context in
which X::x(y) was a function call and Y::x(y) was a declaration.
The resolution is simple: unless otherwise stated, an identifier is assumed to refer to something
that is not a type or a template. If we want to state that something should be treated as a type, we
can do so using the typename keyword:
template<class C> void f4(C& v)
{
    typename C::iterator i = v.begin();
    // ...
}
The typename keyword can be placed in front of a qualified name to state that the entity named is a
type. In this, it resembles struct and class.
The typename keyword can also be used as an alternative to class in template declarations. For
example:
template<typename T> void f(T);
Being an indifferent typist and always short of screen space, I prefer the shorter:
template<class T> void f(T);
C.13.6 Template as a Qualifier
The need for the typename qualifier arises because we can refer both to members that are types and
to members that are non-types. We can also have members that are templates. In rare cases, the
need to distinguish the name of a template member from other member names can arise. Consider
a possible interface to a general memory manager:
class Memory {    // some Allocator
public:
    template<class T> T* get_new();
    template<class T> void release(T&);
    // ...
};

template<class Allocator> void f(Allocator& m)
{
    int* p1 = m.get_new<int>();             // syntax error: int after less-than operator
    int* p2 = m.template get_new<int>();    // explicit qualification
    // ...
    m.release(p1);    // template argument deduced: no explicit qualification needed
    m.release(p2);
}
Explicit qualification of get_new() is necessary because its template parameter cannot be deduced.
In this case, the template prefix must be used to inform the compiler (and the human reader) that
get_new is a member template so that explicit qualification with the desired type of element is possible. Without the qualification with template, we would get a syntax error because the < would be
assumed to be a less-than operator. The need for qualification with template is rare because most
template parameters are deduced.
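The following sketch fills in the Memory interface so that the example can be compiled. The member bodies, the counter live, the function outstanding(), and the function use() are assumptions made for the illustration; a real memory manager would do considerably more:

```cpp
#include <cstddef>

class Memory {                         // some allocator; the bodies are assumptions
    std::size_t live;                  // objects handed out and not yet released
public:
    Memory() : live(0) { }
    template<class T> T* get_new() { ++live; return new T(); }
    template<class T> void release(T& t) { --live; delete &t; }
    std::size_t outstanding() const { return live; }
};

template<class Allocator> int use(Allocator& m)
{
    int* p = m.template get_new<int>();    // template is needed: the type of m depends on Allocator
    *p = 42;
    int v = *p;
    m.release(*p);                         // template argument deduced; no qualification needed
    return v;
}
```

Outside a template, where the type of m is known to be Memory, m.get_new<int>() can be written without the template prefix.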
C.13.7 Instantiation
Given a template definition and a use of that template, it is the implementation’s job to generate
correct code. From a class template and a set of template arguments, the compiler needs to generate the definition of a class and the definitions of those of its member functions that were used.
From a template function, a function needs to be generated. This process is commonly called
template instantiation.
The generated classes and functions are called specializations. When there is a need to distinguish between generated specializations and specializations explicitly written by the programmer
(§13.5), these are referred to as generated specializations and explicit specializations, respectively.
An explicit specialization is sometimes referred to as a user-defined specialization, or simply a user
specialization.
To use templates in nontrivial programs, a programmer must understand how names used in a
template definition are bound to declarations and how source code can be organized (§13.7).
By default, the compiler generates classes and functions from the templates used in accordance
with the name-binding rules (§C.13.8). That is, a programmer need not state explicitly which versions of which templates must be generated. This is important because it is not easy for a programmer to know exactly which versions of a template are needed. Often, templates that the programmer hasn’t even heard of are used in the implementation of libraries, and sometimes templates that
the programmer does know of are used with unknown template argument types. In general, the set
of generated functions needed can be known only by recursive examination of the templates used in
application code libraries. Computers are better suited than humans for doing such analysis.
However, it is sometimes important for a programmer to be able to state specifically where code
should be generated from a template (§C.13.10). By doing so, the programmer gains detailed control over the context of the instantiation. In most compilation environments, this also implies control over exactly when that instantiation is done. In particular, explicit instantiation can be used to
force compilation errors to occur at predictable times rather than occurring whenever an implementation determines the need to generate a specialization. A perfectly predictable build process is
essential to some users.
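As a brief preview of explicit instantiation (§C.13.10), consider the following sketch. The templates Stack_of and twice() are hypothetical, and Stack_of deliberately omits bounds checking to keep the example short:

```cpp
template<class T> class Stack_of {     // hypothetical class template; no bounds checking
    T elem[16];
    int top;
public:
    Stack_of() : top(0) { }
    void push(const T& x) { elem[top++] = x; }
    T pop() { return elem[--top]; }
    int size() const { return top; }
};

template<class T> T twice(T x) { return x + x; }

template class Stack_of<int>;          // explicit instantiation: the class and all its members
template int twice<int>(int);          // explicit instantiation: one function
```

The last two declarations ask for code to be generated at that point, whether or not Stack_of<int> or twice<int>() is used elsewhere. Any error in a member of Stack_of<int>, even a member that is never called, would therefore be reported there.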
C.13.8 Name Binding
It is important to define template functions so that they have as few dependencies as possible on
nonlocal information. The reason is that a template will be used to generate functions and classes
based on unknown types and in unknown contexts. Every subtle context dependency is likely to
surface as a debugging problem for some programmer – and that programmer is unlikely to want to
know the implementation details of the template. The general rule of avoiding global names as far
as possible should be taken especially seriously in template code. Thus, we try to make template
definitions as self-contained as possible and to supply much of what would otherwise have been
global context in the form of template parameters (e.g., traits; §13.4, §20.2.1).
However, some nonlocal names must be used. In particular, it is more common to write a set of
cooperating template functions than to write just one self-contained function. Sometimes, such
functions can be class members, but not always. Sometimes, nonlocal functions are the best
choice. Typical examples of that are sort()’s calls to swap() and less() (§13.5.2). The standard
library algorithms provide a large-scale example (Chapter 18).
Operations with conventional names and semantics, such as +, *, [], and sort(), are another
source of nonlocal name use in a template definition. Consider:
#include<vector>

bool tracing;

// ...

template<class T> T sum(std::vector<T>& v)
{
    T t = 0;
    if (tracing) cerr << "sum(" << &v << ")\n";
    for (int i = 0; i<v.size(); i++) t = t + v[i];
    return t;
}

// ...

#include<quad.h>

void f(std::vector<Quad>& v)
{
    Quad c = sum(v);
}
The innocent-looking template function sum() depends on the + operator. In this example, + is
defined in <quad.h>:
Quad operator+(Quad,Quad);
Importantly, nothing related to complex numbers is in scope when sum() is defined and the writer
of sum() cannot be assumed to know about class Quad. In particular, the + may be defined later
than sum() in the program text, and even later in time.
The process of finding the declaration for each name explicitly or implicitly used in a template
is called name binding. The general problem with template name binding is that three contexts are
involved in a template instantiation and they cannot be cleanly separated:
[1] The context of the template definition
[2] The context of the argument type declaration
[3] The context of the use of the template
C.13.8.1 Dependent Names
When defining a function template, we want to assure that enough context is available for the template definition to make sense in terms of its actual arguments without picking up ‘‘accidental’’
stuff from the environment of a point of use. To help with this, the language separates names used
in a template definition into two categories:
[1] Names that depend on a template argument. Such names are bound at some point of instantiation (§C.13.8.3). In the sum() example, the definition of + can be found in the instantiation context because it takes operands of the template argument type.
[2] Names that don’t depend on a template argument. Such names are bound at the point of
definition of the template (§C.13.8.2). In the sum() example, the template vector is
defined in the standard header <vector> and the Boolean tracing is in scope when the definition of sum() is encountered by the compiler.
The simplest definition of ‘‘N depends on a template parameter T’’ would be ‘‘N is a member of
T.’’ Unfortunately, this doesn’t quite suffice; addition of Quads (§C.13.8) is a counter-example.
Consequently, a function call is said to depend on a template argument if and only if one of these
conditions holds:
[1] The type of the actual argument depends on a template parameter T according to the type
deduction rules (§13.3.1). For example, f(T(1)), f(t), f(g(t)), and f(&t), assuming that
t is a T.
[2] The function called has a formal parameter that depends on T according to the type deduction rules (§13.3.1). For example, f(T), f(list<T>&), and f(const T*).
Basically, the name of a function called is dependent if it is obviously dependent by looking at its
arguments or at its formal parameters.
A call that by coincidence has an argument that matches an actual template parameter type is
not dependent. For example:
    template<class T> T f(T a)
    {
        return g(1);    // error: no g() in scope and g(1) doesn’t depend on T
    }
    void g(int);
    int z = f(2);
It doesn’t matter that for the call f(2), T happens to be int and g()’s argument just happens to be
an int. Had g(1) been considered dependent, its meaning would have been most subtle and mysterious to the reader of the template definition. If a programmer wants g(int) to be called, g(int)’s
definition should be placed before the definition of f() so that g(int) is in scope when f() is analyzed. This is exactly the same rule as for non-template function definitions.
Note that only names of functions used in calls can be dependent names according to this definition. Names of variables, class members, types, etc., in a template definition must be declared
(possibly in terms of template parameters) before they are used.
C.13.8.2 Point of Definition Binding
When the compiler sees a template definition, it determines which names are dependent
(§C.13.8.1). If a name is dependent, looking for its declaration must be postponed until instantiation time (§C.13.8.3).
Names that do not depend on a template argument must be in scope (§4.9.4) at the point of definition. For example:
    int x;
    template<class T> T f(T a)
    {
        x++;        // ok
        y++;        // error: no y in scope, and y doesn’t depend on T
        return a;
    }
    int y;
    int z = f(2);
If a declaration is found, that declaration is used even if a ‘‘better’’ declaration might be found
later. For example:
    void g(double);
    template<class T> class X : public T {
    public:
        void f() { g(2); }    // call g(double);
        // ...
    };
    void g(int);
    class Z { };
    void h(X<Z> x)
    {
        x.f();
    }
When a definition for X<Z>::f() is generated, g(int) is not considered because it is declared
after X. It doesn’t matter that X is not used until after the declaration of g(int). Also, a call that
isn’t dependent cannot be hijacked in a base class:
    class Y { public: void g(int); };
    void h(X<Y> x)
    {
        x.f();
    }
Again, X<Y>::f() will call g(double). If the programmer had wanted the g() from the base
class T to be called, the definition of f() should have said so:
    template<class T> class XX : public T {
        void f() { T::g(2); }    // calls T::g()
        // ...
    };
This is, of course, an application of the rule of thumb that a template definition should be as self-contained as possible.
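The rules of this section can be checked with a small compilable sketch. The return values 1, 2, and 3 are hypothetical tags added only so that the chosen function can be observed; they do not appear in the examples above.

```cpp
int g(double) { return 1; }      // in scope at the point of definition of X

template<class T> struct X : T {
    int f() { return g(2); }     // not dependent: bound to g(double) here
};

int g(int) { return 2; }         // a better match for g(2), but declared too late

struct Z { };
struct Y { int g(int) { return 3; } };   // a base-class g() does not hijack the call

template<class T> struct XX : T {
    int f() { return T::g(2); }  // explicitly dependent: calls the base's g()
};
```

With a compiler that implements point-of-definition binding, X<Z>::f() and X<Y>::f() both reach g(double), and only XX<Y>::f() reaches Y::g(int).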
C.13.8.3 Point of Instantiation Binding
Each use of a template for a given set of template arguments defines a point of instantiation. That
point is in the nearest global or namespace scope enclosing its use, just before the declaration that
contains that use. For example:
    template<class T> void f(T a) { g(a); }
    void g(int);
    void h()
    {
        extern void g(double);
        f(2);
    }
Here, the point of instantiation for f<int>() is just before h(), so the g() called in f() is the global g(int) rather than the local g(double). The definition of ‘‘instantiation point’’ implies that a
template parameter can never be bound to a local name or a class member. For example:
    void f()
    {
        struct X { /* ... */ };    // local structure
        vector<X> v;               // error: cannot use local structure as template parameter
        // ...
    }
Nor can an unqualified name used in a template ever be bound to a local name. Finally, even if a
template is first used within a class, unqualified names used in the template will not be bound to
members of that class. Ignoring local names is essential to prevent a lot of nasty macro-like behavior. For example:
    template<class T> void sort(vector<T>& v)
    {
        sort(v.begin(),v.end());    // use standard library sort()
    }
    class Container {
        vector<int> v;    // elements
        // ...
    public:
        void sort()       // sort elements
        {
            ::sort(v);    // invokes sort(vector<int>&) rather than Container::sort()
        }
        // ...
    };
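A compilable variant of this example might look as follows. It is a sketch under two assumptions that are not in the text: the members add() and front() are hypothetical helpers added so the result can be observed, and the call inside Container::sort() is written as ::sort(v), because under the lookup rules of the final standard an unqualified sort(v) in a member function finds the member Container::sort() first.

```cpp
#include <algorithm>
#include <vector>

// A namespace-scope template that happens to share its name with a member.
template<class T> void sort(std::vector<T>& v)
{
    std::sort(v.begin(), v.end());   // standard library sort()
}

class Container {
    std::vector<int> v;              // elements
public:
    void add(int x) { v.push_back(x); }
    int front() const { return v.front(); }
    void sort()                      // sort elements
    {
        ::sort(v);                   // the namespace-scope sort(vector<int>&)
    }
};
```

The names used inside the template are resolved without reference to Container, so the template behaves the same wherever it is first used.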
If the point of instantiation for a template defined in a namespace is in another namespace, names
from both namespaces are available for name binding. As always, overload resolution is used to
choose between names from different namespaces (§8.2.9.2).
Note that a template used several times with the same set of template arguments has several
points of instantiation. If the bindings of independent names differ, the program is illegal. However, this is a difficult error for an implementation to detect, especially if the points of instantiation
are in different translation units. It is best to avoid subtleties in name binding by minimizing the
use of nonlocal names in templates and by using header files to keep use contexts consistent.
C.13.8.4 Templates and Namespaces
When a function is called, its declaration can be found even if it is not in scope, provided it is
declared in the same namespace as one of its arguments (§8.2.6). This is very important for functions called in template definitions because it is the mechanism by which dependent functions are
found during instantiation.
A template specialization may be generated at any point of instantiation (§C.13.8.3), any point
subsequent to that in a translation unit, or in a translation unit specifically created for generating
specializations. This reflects three obvious strategies an implementation can use for generating
specializations:
[1] Generate a specialization the first time a call is seen.
[2] At the end of a translation unit, generate all specializations needed for that translation unit.
[3] Once every translation unit of a program has been seen, generate all specializations needed
for the program.
All three strategies have strengths and weaknesses, and combinations of these strategies are also
possible.
In any case, the binding of independent names is done at a point of template definition. The
binding of dependent names is done by looking at
[1] the names in scope at the point where the template is defined, plus
[2] the names in the namespace of an argument of a dependent call (global functions are considered in the namespace of built-in types).
For example:
    namespace N {
        class A { /* ... */ };
        char f(A);
    }
    char f(int);
    template<class T> char g(T t) { return f(t); }
    char c = g(N::A());    // causes N::f(N::A) to be called
Here, f(t) is clearly dependent, so we can’t bind f to f(N::A) or f(int) at the point of definition.
To generate a specialization for g<N::A>(N::A), the implementation looks in namespace N for
functions called f() and finds N::f(N::A).
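The same example can be made runnable by giving the functions bodies. The return values 'N' and 'i' are hypothetical markers added so that the selected overload can be observed.

```cpp
namespace N {
    class A { };
    char f(A) { return 'N'; }    // found through the namespace of the argument
}

char f(int) { return 'i'; }      // in scope at the point of definition of g()

template<class T> char g(T t) { return f(t); }   // f(t) is a dependent call
```

Calling g(N::A()) should select N::f(N::A), while g(1) should select the f(int) that was already in scope when g() was defined.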
A program is illegal if it is possible to construct two different meanings by choosing different
points of instantiation or different contents of namespaces at different possible contexts for generating the specialization. For example:
    namespace N {
        class A { /* ... */ };
        char f(A,int);
    }
    template<class T, class T2> char g(T t, T2 t2) { return f(t,t2); }
    char c = g(N::A(),'a');    // error (alternative resolutions of f(t,t2) possible)
    namespace N {              // add to namespace N (§8.2.9.3)
        void f(A,char);
    }
We could generate the specialization at the point of instantiation and get f(N::A,int) called.
Alternatively, we could wait and generate the specialization at the end of the translation unit and
get f(N::A,char) called. Consequently, the call g(N::A(),'a') is an error.
It is sloppy programming to call an overloaded function in between two of its declarations.
Looking at a large program, a programmer would have no reason to suspect a problem. In this particular case, a compiler could catch the ambiguity. However, similar problems can occur in separate translation units, and then detection becomes much harder. An implementation is not obliged
to catch problems of this kind.
Most problems with alternative resolutions of function calls involve built-in types. Consequently, most remedies rely on more-careful use of arguments of built-in types.
As usual, use of global functions can make matters worse. The global namespace is considered
the namespace associated with built-in types, so global functions can be used to resolve dependent
calls that take built-in types. For example:
    int f(int);
    template<class T> T g(T t) { return f(t); }
    char c = g('a');    // error: alternative resolutions of f(t) are possible
    char f(char);
We could generate the specialization g<char>(char) at the point of instantiation and get f(int)
called. Alternatively, we could wait and generate the specialization at the end of the translation
unit and get f(char) called. Consequently, the call g('a') is an error.
C.13.9 When Is a Specialization Needed?
It is necessary to generate a specialization of a class template only if the class’ definition is needed.
In particular, to declare a pointer to some class, the actual definition of a class is not needed. For
example:
    class X;
    X* p;    // ok: no definition of X needed
    X a;     // error: definition of X needed
When defining template classes, this distinction can be crucial. A template class is not instantiated
unless its definition is actually needed. For example:
    template<class T> class Link {
        Link* suc;    // ok: no definition of Link needed (yet)
        // ...
    };
    Link<int>* pl;    // no instantiation of Link<int> needed
    Link<int> lnk;    // now we need to instantiate Link<int>
The point of instantiation is where a definition is first needed.
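One way to observe that a pointer declaration does not trigger instantiation is to use a template whose instantiation would be an error. In the sketch below, Fragile and still_null() are hypothetical names chosen for illustration.

```cpp
// Instantiating Fragile<int> would be an error: int has no member type 'missing'.
template<class T> struct Fragile {
    typename T::missing m;
};

Fragile<int>* p = 0;             // ok: only a pointer, so no definition is needed
// Fragile<int> x;               // would require instantiation and fail to compile

bool still_null() { return p == 0; }
```

The translation unit compiles because Fragile<int> is only named, never defined; uncommenting the declaration of x makes the definition necessary and exposes the error.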
C.13.9.1 Template Function Instantiation
An implementation instantiates a template function only if that function has been used. In particular, instantiation of a class template does not imply the instantiation of all of its members or even of
all of the members defined in the template class declaration. This allows the programmer an important degree of flexibility when defining a template class. Consider:
    template<class T> class List {
        // ...
        void sort();
    };
    class Glob { /* no comparison operators */ };
    void f(List<Glob>& lb, List<string>& ls)
    {
        ls.sort();
        // use operations on lb, but not lb.sort()
    }
Here, List<string>::sort() is instantiated, but List<Glob>::sort() isn’t. This both reduces the
amount of code generated and saves us from having to redesign the program. Had
List<Glob>::sort() been generated, we would have had to either add the operations needed by
List::sort() to Glob, redefine sort() so that it wasn’t a member of List, or use some other
container for Globs.
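A runnable version of this idea needs a concrete List. The one below is a hypothetical minimal container built on std::vector, not the List of the text; its sort() relies on operator< for the element type, which Glob deliberately lacks. The helper use_both() is likewise an illustrative name.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal List; sort() requires operator< on T.
template<class T> class List {
    std::vector<T> elems;
public:
    void add(const T& x) { elems.push_back(x); }
    std::size_t size() const { return elems.size(); }
    const T& front() const { return elems.front(); }
    void sort() { std::sort(elems.begin(), elems.end()); }
};

struct Glob { int id; };         // no comparison operators

// Uses List<Glob> without sort(), so List<Glob>::sort() is never instantiated.
std::size_t use_both(List<Glob>& lb, List<std::string>& ls)
{
    ls.sort();                   // instantiates List<std::string>::sort()
    return lb.size();
}
```

Adding a call lb.sort() to use_both() would force the instantiation of List<Glob>::sort() and produce a compile-time error about the missing comparison.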
C.13.10 Explicit Instantiation
An explicit instantiation request is a declaration of a specialization prefixed by the keyword template (not followed by <):
    template class vector<int>;                    // class
    template int& vector<int>::operator[](int);    // member
    template int convert<int,double>(double);      // function
A template declaration starts with template<, whereas plain template starts an instantiation request.
Note that template prefixes a complete declaration; just stating a name is not sufficient:
    template vector<int>::operator[];    // syntax error
    template convert<int,double>;        // syntax error
As in template function calls, the template arguments that can be deduced from the function arguments can be omitted (§13.3.1). For example:
    template int convert<int,double>(double);    // ok (redundant)
    template int convert<int>(double);           // ok
When a class template is explicitly instantiated, every member function is also instantiated.
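The three forms of request (class, member, function) can be tried on small templates. Stack and this definition of convert() are hypothetical, written only to give the requests something to instantiate.

```cpp
#include <vector>

// Hypothetical templates used only to illustrate the syntax.
template<class T> class Stack {
    std::vector<T> v;
public:
    void push(const T& x) { v.push_back(x); }
    const T& top() const { return v.back(); }
    bool empty() const { return v.empty(); }
};

template<class To, class From> To convert(From x) { return static_cast<To>(x); }

template class Stack<int>;                      // class: every member is instantiated
template bool Stack<double>::empty() const;     // a single member
template int convert<int,double>(double);       // function
```

Each request appears once per specialization here; repeating a request for the same specialization in one translation unit is best avoided.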
Note that an explicit instantiation can be used as a constraints check (§13.6.2). For example:
    template<class T> class Calls_foo {
        void constraints(T t) { foo(t); }    // call from every constructor
        // ...
    };
    template class Calls_foo<int>;                           // error: foo(int) undefined
    template void Calls_foo<Shape*>::constraints(Shape*);    // error: foo(Shape*) undefined
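For contrast, here is a sketch of the case in which the constraint is satisfied. Shape is reduced to an empty struct and foo_calls is a hypothetical counter, both assumptions made only so the example is self-contained and observable.

```cpp
struct Shape { };

int foo_calls = 0;               // hypothetical counter, for illustration only
void foo(Shape*) { ++foo_calls; }

template<class T> class Calls_foo {
public:
    void constraints(T t) { foo(t); }   // requires foo(T) to be callable
};

template class Calls_foo<Shape*>;       // ok: foo(Shape*) exists
// template class Calls_foo<int>;       // would fail: no foo(int)
```

The explicit instantiation forces constraints() to be compiled even though nothing calls it, so a missing foo() is reported at the request rather than at some later use.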
The link-time and recompilation efficiency impact of instantiation requests can be significant. I
have seen examples in which bundling most template instantiations into a single compilation unit
cut the compile time from a number of hours to the equivalent number of minutes.
It is an error to have two definitions for the same specialization. It does not matter if such multiple specializations are user-defined (§13.5), implicitly generated (§C.13.7), or explicitly
requested. However, a compiler is not required to diagnose multiple instantiations in separate compilation units. This allows a smart implementation to ignore redundant instantiations and thereby
avoid problems related to composition of programs from libraries using explicit instantiation
(§C.13.7). However, implementations are not required to be smart. Users of ‘‘less smart’’ implementations must avoid multiple instantiations. However, the worst that will happen if they don’t is
that their program won’t load; there will be no silent changes of meaning.
The language does not require that a user request explicit instantiation. Explicit instantiation is
an optional mechanism for optimization and manual control of the compile-and-link process
(§C.13.7).
C.14 Advice
[1] Focus on software development rather than technicalities; §C.1.
[2] Adherence to the standard does not guarantee portability; §C.2.
[3] Avoid undefined behavior (including proprietary extensions); §C.2.
[4] Localize implementation-defined behavior; §C.2.
[5] Use keywords and digraphs to represent programs on systems where { } [ ] | are missing and trigraphs if \ or ! are missing; §C.3.1.
[6] To ease communication, use the ANSI characters to represent programs; §C.3.3.
[7] Prefer symbolic escape characters to numeric representation of characters; §C.3.2.
[8] Do not rely on signedness or unsignedness of char; §C.3.4.
[9] If in doubt about the type of an integer literal, use a suffix; §C.4.
[10] Avoid value-destroying implicit conversions; §C.6.
[11] Prefer vector over array; §C.7.
[12] Avoid unions; §C.8.2.
[13] Use fields to represent externally-imposed layouts; §C.8.1.
[14] Be aware of the tradeoffs between different styles of memory management; §C.9.
[15] Don’t pollute the global namespace; §C.10.1.
[16] Where a scope (module) rather than a type is needed, prefer a namespace over a class; §C.10.3.
[17] Remember to define static class template members; §C.13.1.
[18] Use typename to disambiguate type members of a template parameter; §C.13.5.
[19] Where explicit qualification by template arguments is necessary, use template to disambiguate template class members; §C.13.6.
[20] Write template definitions with minimal dependence on their instantiation context; §C.13.8.
[21] If template instantiation takes too long, consider explicit instantiation; §C.13.10.
[22] If the order of compilation needs to be perfectly predictable, consider explicit instantiation; §C.13.10.
Appendix D
Locales
When in Rome,
do as the Romans do.
– proverb
Handling cultural differences – class locale – named locales – constructing locales
– copying and comparing locales – the global() and classic() locales – comparing
strings – class facet – accessing facets in a locale – a simple user-defined facet –
standard facets – string comparison – numeric I/O – money I/O – date and time I/O
– low-level time operations – a Date class – character classification – character
code conversion – message catalogs – advice – exercises.
D.1 Handling Cultural Differences
A locale is an object that represents a set of cultural preferences, such as how strings are compared,
the way numbers appear as human-readable output, and the way characters are represented in external storage. The notion of a locale is extensible so that a programmer can add new facets to a
locale representing locale-specific entities not directly supported by the standard library, such as
postal codes (zip codes) and phone numbers. The primary use of locales in the standard library is
to control the appearance of information put to an ostream and the format accepted by an istream.
Section 21.7 describes how to change the locale for a stream; this appendix describes how a
locale is constructed out of facets and explains the mechanisms through which a locale affects its
stream. This appendix also describes how facets are defined, lists the standard facets that define
specific properties of a stream, and presents techniques for implementing and using locales and
facets. The standard library facilities for representing date and time are discussed as part of the
presentation of date I/O.
The C++ Programming Language, Special Edition by Bjarne Stroustrup. Copyright 2000 by AT&T.
Published by Addison Wesley Inc. ISBN 0-201-70073-5. All rights reserved.
The discussion of locales and facets is organized like this:
§D.1 introduces the basic ideas for representing cultural differences using locales.
§D.2 presents the locale class.
§D.3 presents the facet class.
§D.4 gives an overview of the standard facets and presents details of each:
§D.4.1 String comparison
§D.4.2 Input and output of numeric values
§D.4.3 Input and output of monetary values
§D.4.4 Input and output of dates and time
§D.4.5 Character classification
§D.4.6 Character code conversions
§D.4.7 Message catalogs
The notion of a locale is not primarily a C++ notion. Most operating systems and application environments have a notion of locale. Such a notion is – in principle – shared among all programs on a
system, independently of which programming language they are written in. Thus, the C++ standard
library notion of a locale can be seen as a standard and portable way for C++ programs to access
information that has very different representations on different systems. Among other things, a
C++ locale is a common interface to system information that is represented in incompatible ways
on different systems.
D.1.1 Programming Cultural Differences
Consider writing a program that needs to be used in several countries. Writing a program in a style
that allows that is often called ‘‘internationalization’’ (emphasizing the use of a program in many
countries) or ‘‘localization’’ (emphasizing the adaptation of a program to local conditions). Many
of the entities that a program manipulates will conventionally be displayed differently in those
countries. We can handle this by writing our I/O routines to take this into account. For example:
    void print_date(const Date& d)    // print in the appropriate format
    {
        switch(where_am_I) {    // user-defined style indicator
        case DK:                // e.g., 7. marts 1999
            cout << d.day() << ". " << dk_month[d.month()] << " " << d.year();
            break;
        case UK:                // e.g., 7 / 3 / 1999
            cout << d.day() << " / " << d.month() << " / " << d.year();
            break;
        case US:                // e.g., 3/7/1999
            cout << d.month() << "/" << d.day() << "/" << d.year();
            break;
        // ...
        }
    }
This style of code does the job. However, it’s rather ugly, and we have to use this style consistently
to ensure that all output is properly adjusted to local conventions. Worse, if we want to add a new
way of writing a date, we must modify the code. We could imagine handling this problem by
creating a class hierarchy (§12.2.4). However, the information in a Date is independent of the way
we want to look at it. Consequently, we don’t want a hierarchy of Date types: for example,
US_date, UK_date, and JP_date. Instead, we want a variety of ways of displaying Dates: for
example, US-style output, UK-style output, and Japanese-style output; see §D.4.4.5.
Other problems arise with the ‘‘let the user write I/O functions that take care of cultural differences’’ approach:
[1] An application programmer cannot easily, portably, and efficiently change the appearance of
built-in types without the help of the standard library.
[2] Finding every I/O operation (and every operation that prepares data for I/O in a locale-sensitive manner) in a large program is not always feasible.
[3] Sometimes, we cannot rewrite a program to take care of a new convention – and even if we
could, we’d prefer a solution that didn’t involve a rewrite.
[4] Having each user design and implement a solution to the problems of different cultural conventions is wasteful.
[5] Different programmers will handle low-level cultural preferences in different ways, so programs dealing with the same information will differ for non-fundamental reasons. Thus,
programmers maintaining code from a number of sources will have to learn a variety of programming conventions. This is tedious and error-prone.
Consequently, the standard library provides an extensible way of handling cultural conventions.
The iostreams library (§21.7) relies on this framework to handle both built-in and user-defined
types. For example, consider a simple loop copying (Date,double) pairs that might represent a
series of measurements or a set of transactions:
    void cpy(istream& is, ostream& os)    // copy (Date,double) stream
    {
        Date d;
        double volume;
        while (is >> d >> volume) os << d << ' ' << volume << '\n';
    }
Naturally, a real program would do something with the records, and ideally also be a bit more careful about error handling.
How would we make this program read a file that conformed to French conventions (where
comma is the character used to represent the decimal point in a floating-point number; for example,
12,5 means twelve and a half) and write it according to American conventions? We can define
locales and I/O operations so that cpy() can be used to convert between conventions:
    void f(istream& fin, ostream& fout, istream& fin2, ostream& fout2)
    {
        fin.imbue(locale("en_US"));      // American English
        fout.imbue(locale("fr"));        // French
        cpy(fin,fout);                   // read American English, write French
        fin2.imbue(locale("fr"));        // French
        fout2.imbue(locale("en_US"));    // American English
        cpy(fin2,fout2);                 // read French, write American English
    }
Given streams,
    Apr 12, 1999 1000.3
    Apr 13, 1999 345.45
    Apr 14, 1999 9688.321
    ...
    3 juillet 1950 10,3
    3 juillet 1951 134,45
    3 juillet 1952 67,9
    ...
this program would produce:
    12 avril 1999 1000,3
    13 avril 1999 345,45
    14 avril 1999 9688,321
    ...
    July 3, 1950 10.3
    July 3, 1951 134.45
    July 3, 1952 67.9
    ...
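The numeric half of this conversion can be tried without relying on named locales such as "fr", whose availability depends on the system. The sketch below is a simplification under stated assumptions: it handles only the double values, not the dates, and Comma_punct, convert_numbers(), and french_to_classic() are hypothetical names. The comma convention is supplied by a small facet derived from the standard numpunct<char>.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: comma as the decimal point, no digit grouping.
class Comma_punct : public std::numpunct<char> {
protected:
    char do_decimal_point() const { return ','; }
};

// Read doubles under one locale and write them under another.
std::string convert_numbers(const std::string& text,
                            const std::locale& in, const std::locale& out)
{
    std::istringstream is(text);
    std::ostringstream os;
    is.imbue(in);
    os.imbue(out);
    double volume;
    while (is >> volume) os << volume << '\n';
    return os.str();
}

// Read comma-style numbers and write them in the classic "C" style.
std::string french_to_classic(const std::string& text)
{
    std::locale comma(std::locale::classic(), new Comma_punct);
    return convert_numbers(text, comma, std::locale::classic());
}
```

The copying loop itself contains no formatting logic; swapping the two locale arguments reverses the direction of the conversion.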
Much of the rest of this appendix is devoted to describing the mechanisms that make this possible
and explaining how to use them. Please note that most programmers will have little reason to deal
with the details of locales. Many programmers will never explicitly manipulate a locale, and most
who do will just retrieve a standard locale and imbue a stream with it (§21.7). However, the mechanisms provided to compose those locales and to make them trivial to use constitute a little programming language of their own.
If a program or a system is successful, it will be used by people with needs and preferences that
the original designers and programmers didn’t anticipate. Most successful programs will be run in
countries where (natural) languages and character sets differ from those familiar to the original
designers and programmers. Wide use of a program is a sign of success, so designing and programming for portability across linguistic and cultural borders is to prepare for success.
The concept of localization (internationalization) is simple. However, practical constraints
make the design and implementation of locale quite intricate:
[1] A locale encapsulates cultural conventions, such as the appearance of a date. Such conventions vary in many subtle and unsystematic ways. These conventions have nothing to do
with programming languages, so a programming language cannot standardize them.
[2] The concept of a locale must be extensible, because it is not possible to enumerate every
cultural convention that is important to every C++ user.
[3] A locale is used in I/O operations from which people demand run-time efficiency.
[4] A locale must be invisible to the majority of programmers who want to benefit from stream
I/O ‘‘doing the right thing’’ without having to know exactly what that is or how it is
achieved.
[5] A locale must be available to designers of facilities that deal with culturally sensitive information beyond the scope of the stream I/O library.
Designing a program doing I/O requires a choice between controlling formatting through ‘‘ordinary
code’’ and the use of locales. The former (traditional) approach is feasible where we can ensure
that every input operation can be easily converted from one convention to another. However, if the
appearance of built-in types needs to vary, if different character sets are needed, or if we need to
choose among an extensible set of I/O conventions, the locale mechanism begins to look attractive.
A locale is composed of facets that control individual aspects, such as the character used for
punctuation in the output of a floating-point value (decimal_point(); §D.4.2) and the format used
to read a monetary value (moneypunct; §D.4.3). A facet is an object of a class derived from class
locale::facet (§D.3). We can think of a locale as a container of facets (§D.2, §D.3.1).
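The ‘‘container of facets’’ view can be illustrated with the standard functions use_facet() and has_facet(). The two wrapper functions below are hypothetical helpers written for this sketch.

```cpp
#include <locale>

// Look up the numeric punctuation facet stored in a locale.
char decimal_point_of(const std::locale& loc)
{
    return std::use_facet< std::numpunct<char> >(loc).decimal_point();
}

// Ask whether a locale holds a given facet before using it.
bool has_numpunct(const std::locale& loc)
{
    return std::has_facet< std::numpunct<char> >(loc);
}
```

The facet type acts as the key of the lookup: the caller names the facet class it wants and receives a reference to the instance stored in that locale.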
D.2 The locale Class
The locale class and its associated facilities are presented in <locale>:
    class std::locale {
    public:
        class facet;             // type used to represent aspects of a locale; §D.3
        class id;                // type used to identify a locale; §D.3
        typedef int category;    // type used to group/categorize facets
        static const category    // the actual values are implementation defined
            none = 0,
            collate = 1,
            ctype = 1<<1,
            monetary = 1<<2,
            numeric = 1<<3,
            time = 1<<4,
            messages = 1<<5,
            all = collate | ctype | monetary | numeric | time | messages;
        locale() throw();                    // copy of global locale (§D.2.1)
        locale(const locale& x) throw();     // copy of x
        explicit locale(const char* p);      // copy of locale named p (§D.2.1)
        ~locale() throw();
        locale(const locale& x, const char* p, category c);      // copy of x plus facets from p’s c
        locale(const locale& x, const locale& y, category c);    // copy of x plus facets from y’s c
        template <class Facet> locale(const locale& x, Facet* f);    // copy of x plus facet f
        template <class Facet> locale combine(const locale& x);      // copy of *this plus Facet from x
        const locale& operator=(const locale& x) throw();
        bool operator==(const locale&) const;    // compare locales
        bool operator!=(const locale&) const;
        string name() const;                     // name of this locale (§D.2.1)
        template <class Ch, class Tr, class A>   // compare strings using this locale
        bool operator()(const basic_string<Ch,Tr,A>&, const basic_string<Ch,Tr,A>&) const;
The C++ Programming Language, Special Edition by Bjarne Stroustrup. Copyright 2000 by AT&T.
Published by Addison Wesley Inc. ISBN 0-201-70073-5. All rights reserved.
874
Locales
Appendix D
ssttaattiicc llooccaallee gglloobbaall(ccoonnsstt llooccaallee&);
ssttaattiicc ccoonnsstt llooccaallee& ccllaassssiicc();
pprriivvaattee:
// representation
};
// set global locale and return old global locale
// get ‘‘classic’’ C-style locale
A locale can be thought of as an interface to a map<id,facet*>; that is, something that allows us to use a locale::id to find a corresponding object of a class derived from locale::facet. A real implementation of locale is an efficient variant of this idea. The layout will be something like this:
[Figure: a locale referring to its facets: collate<char> (compare(), hash(), ...) and numpunct<char> (decimal_point(), truename(), ...).]
Here, collate<char> and numpunct<char> are standard library facets (§D.4). Like all facets, they are derived from locale::facet.
A locale is meant to be copied freely and cheaply. Consequently, a locale is almost certainly implemented as a handle to the specialized map<id,facet*> that constitutes the main part of its implementation. The facets must be quickly accessible in a locale. Consequently, the specialized map<id,facet*> will be optimized to provide array-like fast access. The facets of a locale are accessed by using the use_facet<Facet>(loc) notation; see §D.3.1.
The standard library provides a rich set of facets. To help the programmer manipulate facets in logical groups, the standard facets are grouped into categories, such as numeric and collate (§D.4).
A programmer can replace facets from existing categories (§D.4, §D.4.2.1). However, a programmer cannot define a new category. The notion of ‘‘category’’ applies to standard library facets only, and it is not extensible. Thus, a facet need not belong to any category, and many user-defined facets do not.
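As a minimal sketch of how a category selects a group of facets, the following code builds a locale that is like one locale except that its numeric facets are taken from another. The class Comma_point and the functions with_numeric_from() and to_text() are assumptions made for this illustration only; they are not part of the standard library.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for this sketch: numpunct<char> with ',' as the decimal point.
class Comma_point : public std::numpunct<char> {
protected:
    char do_decimal_point() const { return ','; }
};

// A copy of base, except that the facets of the numeric category are taken from donor.
std::locale with_numeric_from(const std::locale& base, const std::locale& donor)
{
    return std::locale(base, donor, std::locale::numeric);
}

// Write d to a string stream imbued with loc and return the text.
std::string to_text(double d, const std::locale& loc)
{
    std::ostringstream os;
    os.imbue(loc);
    os << d;
    return os.str();
}
```

Because numpunct<char> belongs to the numeric category, a donor locale that holds a Comma_point passes it on to the result; facets in other categories still come from base.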
By far the dominant use of locales is implicit, in stream I/O. Each istream and ostream has its own locale. The locale of a stream is by default the global locale (§D.2.1) at the time of the stream’s creation. The locale of a stream can be set by the imbue() operation, and we can extract a copy of a stream’s locale using getloc() (§21.6.3).
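A small sketch of these two stream operations may help. The function name stream_uses_locale() is an assumption made for illustration; only imbue(), getloc(), and the locale comparison come from the standard library.

```cpp
#include <locale>
#include <sstream>

// Returns true if a stream reports the locale it was imbued with.
bool stream_uses_locale(const std::locale& loc)
{
    std::ostringstream os;        // starts out with a copy of the global locale
    os.imbue(loc);                // replace the stream's locale
    return os.getloc() == loc;    // getloc() yields a copy of the stream's locale
}
```

The comparison succeeds because a copy of a locale compares equal to the locale it was copied from.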
D.2.1 Named Locales
A locale is constructed from another locale and from facets. The simplest way of making a locale is to copy an existing one. For example:
    locale loc0;                        // copy of the current global locale (§D.2.3)
    locale loc1 = locale();             // copy of the current global locale (§D.2.3)
    locale loc2("");                    // copy of ‘‘the user’s preferred locale’’
    locale loc3("C");                   // copy of the "C" locale
    locale loc4 = locale::classic();    // copy of the "C" locale
    locale loc5("POSIX");               // copy of the implementation-defined "POSIX" locale
The meaning of locale("C") is defined by the standard to be the ‘‘classic’’ C locale; this is the locale that has been used throughout this book. Other locale names are implementation defined.
The locale("") is deemed to be ‘‘the user’s preferred locale.’’ This locale is set by extralinguistic means in a program’s execution environment.
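Since "C" is the only name whose meaning the standard defines, it is the only one a portable sketch can rely on. The following function, whose name classic_is_c() is an assumption made for illustration, checks that the two spellings of the classic locale agree:

```cpp
#include <locale>
#include <string>

// True if locale("C") and locale::classic() denote the same locale.
bool classic_is_c()
{
    std::locale by_name("C");                                   // the one standard name
    const std::locale& by_function = std::locale::classic();    // the same locale, by function
    return by_name == by_function && by_name.name() == "C";
}
```

Any other name, including the empty string, depends on the execution environment, so no similar check can be written portably for it.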
Most operating systems have ways of setting a locale for a program. Often, a locale suitable to the person using a system is chosen when that person first encounters a system. For example, I would expect that a person who configures a system to use Argentine Spanish as its default setting will find locale("") to mean locale("es_AR"). A quick check on one of my systems revealed 51 locales with mnemonic names, such as POSIX, de, en_UK, en_US, es, es_AR, fr, sv, da, pl, and iso_8859_1. POSIX recommends a format of a lowercase language name, optionally followed by an uppercase country name, optionally followed by an encoding specifier; for example, jp_JP.jit. However, these names are not standardized across platforms. On another system, among many other locale names, I found g, uk, us, s, fr, sw, and da. The C++ standard does not define the meaning of a locale for a given country or language, though there may be platform-specific standards. Consequently, to use named locales on a given system, a programmer must refer to system documentation and experiment.
It is generally a good idea to avoid embedding locale name strings in the program text. Mentioning a file name or a system constant in the program text limits the portability of a program and often forces a programmer who wants to adapt a program to a new environment to find and change such values. Mentioning a locale name string has similar unpleasant consequences. Instead, locales can be picked up from the program’s execution environment (for example, using locale("")), or the program can request an expert user to specify alternative locales by entering a string. For example:
    void user_set_locale(const string& question_string)
    {
        cout << question_string;    // e.g., "If you want to use a different locale, please enter its name"
        string s;
        cin >> s;
        locale::global(locale(s.c_str()));    // set global locale as specified by user
    }
It is usually better to let a non-expert user pick from a list of alternatives. A routine for doing this would need to know where and how a system kept its locales.
If the string argument doesn’t refer to a defined locale, the constructor throws the runtime_error exception (§14.10). For example:
    void set_loc(locale& loc, const char* name)
    try
    {
        loc = locale(name);
    }
    catch (runtime_error) {
        cerr << "locale \"" << name << "\" isn't defined\n";
        // ...
    }
If a locale has a name string, name() will return it. If not, name() will return string("*"). A name string is primarily a way to refer to a locale stored in the execution environment. Secondarily, a name string can be used as a debugging aid. For example:
    void print_locale_names(const locale& my_loc)
    {
        cout << "name of current global locale: " << locale().name() << "\n";
        cout << "name of classic C locale: " << locale::classic().name() << "\n";
        cout << "name of ``user's preferred locale'': " << locale("").name() << "\n";
        cout << "name of my locale: " << my_loc.name() << "\n";
    }
Locales that have identical name strings other than the default string("*") compare equal. However, == and != provide more direct ways of comparing locales.
The copy of a locale with a name string gets the same name as that locale (if it has one), so many locales can have the same name string. That’s logical because locales are immutable, so all of these objects define the same set of cultural conventions.
A call locale(loc,"Foo",cat) makes a locale that is like loc except that it takes the facets from the category cat of locale("Foo"). The resulting locale has a name string if and only if loc has one. The standard doesn’t specify exactly which name string the new locale gets, but it is supposed to be different from loc’s. One obvious implementation would be to compose the new string out of loc’s name string and "Foo". For example, if loc’s name string is en_UK, the new locale may have "en_UK:Foo" as its name string.
The name strings for a newly created locale can be summarized like this:
________________________________________________________________________________
Locale                     Name String
________________________________________________________________________________
locale("Foo")              "Foo"
locale(loc)                loc.name()
locale(loc,"Foo",cat)      New name string if loc has a name string; otherwise, string("*")
locale(loc,loc2,cat)       New name string if loc and loc2 have strings; otherwise, string("*")
locale(loc,Facet)          string("*")
loc.combine(loc2)          string("*")
________________________________________________________________________________
There are no facilities for a programmer to specify a C-style string as a name for a newly created locale in a program. Name strings are either defined in the program’s execution environment or created as combinations of such names by locale constructors.
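Two rows of the summary can be checked without relying on any implementation-defined locale name. In this sketch, Plain_punct and the two function names are assumptions made for illustration; the facet changes nothing and exists only so that a locale can be built from a facet.

```cpp
#include <locale>
#include <string>

// Hypothetical facet for this sketch: a numpunct<char> that changes nothing.
class Plain_punct : public std::numpunct<char> { };

// Name of a plain copy of the classic locale: the copy keeps the name.
std::string name_of_copy()
{
    std::locale copy = std::locale::classic();
    return copy.name();
}

// Name of a locale made by adding a facet: such a locale has no name string.
std::string name_after_adding_facet()
{
    std::locale loc(std::locale::classic(), new Plain_punct);
    return loc.name();
}
```

The first function yields "C" and the second yields "*", matching the locale(loc) and locale(loc,Facet) rows.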
D.2.1.1 Constructing New Locales
A new locale is made by taking an existing locale and adding or replacing facets. Typically, a new locale is a minor variation on an existing one. For example:
    void f(const locale& loc, const My_money_io* mio)    // My_money_io defined in §D.4.3.1
    {
        locale loc1(locale("POSIX"),loc,locale::monetary);    // use monetary facets from loc
        locale loc2 = locale(locale::classic(), mio);         // classic plus mio
        // ...
    }
Here, loc1 is a copy of the POSIX locale modified to use loc’s monetary facets (§D.4.3). Similarly, loc2 is a copy of the C locale modified to use a My_money_io (§D.4.3.1). If a Facet* argument (here, mio) is 0, the resulting locale is simply a copy of the locale argument.
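Because My_money_io is not defined until §D.4.3.1, here is a self-contained sketch of the same construction using a much simpler facet. The class Comma_punct and the function format_with_comma() are assumptions made for this illustration; the technique shown is simply the locale(const locale&, Facet*) constructor applied to a facet derived from the standard numpunct<char>.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: like the classic numpunct<char>, but with ',' as the decimal point.
class Comma_punct : public std::numpunct<char> {
protected:
    char do_decimal_point() const { return ','; }
};

// Format d using a copy of the classic locale whose numpunct<char> is replaced.
std::string format_with_comma(double d)
{
    std::locale loc(std::locale::classic(), new Comma_punct);    // loc manages the facet's lifetime
    std::ostringstream os;
    os.imbue(loc);
    os << d;
    return os.str();
}
```

With these definitions, format_with_comma(3.5) yields "3,5". The facet is created with the default reference argument 0, so there is no delete in the sketch; the locale deletes the facet when the last reference to it goes away (§D.3).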
When using
    locale(const locale& x, Facet* f);
the f argument must identify a specific facet type. A plain facet* is not sufficient. For example:
    void g(const locale::facet* mio1, const My_money_io* mio2)
    {
        locale loc3 = locale(locale::classic(), mio1);    // error: type of facet not known
        locale loc4 = locale(locale::classic(), mio2);    // ok: type of facet known
        // ...
    }
The reason is that the locale uses the type of the Facet* argument to determine the type of the facet at compile time. Specifically, the implementation of locale uses a facet’s identifying type, facet::id (§D.3), to find that facet in the locale (§D.3.1).
Note that the
    template <class Facet> locale(const locale& x, Facet* f);
constructor is the only mechanism offered within the language for the programmer to supply a facet to be used through a locale. Other locales are supplied by implementers as named locales (§D.2.1). These named locales can be retrieved from the program’s execution environment. A programmer who understands the implementation-specific mechanism used for that might be able to add new locales that way (§D.6[11,12]).
The set of constructors for locale is designed so that the type of every facet is known either from type deduction (of the Facet template parameter) or because it came from another locale (that knew its type). Specifying a category argument specifies the type of facets indirectly, because the locale knows the type of the facets in the categories. This implies that the locale class can (and does) keep track of facet types so that it can manipulate them with minimal overhead.
The locale::id member type is used by locale to identify facet types (§D.3).
It is sometimes useful to construct a locale that is a copy of another except for a facet copied from yet another locale. The combine() template member function does that. For example:
    void f(const locale& loc, const locale& loc2)
    {
        locale loc3 = loc.combine< My_money_io >(loc2);
        // ...
    }
The resulting loc3 behaves like loc except that it uses a copy of My_money_io (§D.4.3.1) from loc2 to format monetary I/O. If loc2 doesn’t have a My_money_io to give to the new locale, combine() will throw a runtime_error (§14.10). The result of combine() has no name string.
D.2.2 Copying and Comparing Locales
A locale can be copied by initialization and by assignment. For example:
    void swap(locale& x, locale& y)    // just like std::swap()
    {
        locale temp = x;
        x = y;
        y = temp;
    }
The copy of a locale compares equal to the original, but the copy is an independent and separate object. For example:
    void f(locale* my_locale)
    {
        locale loc = locale::classic();    // "C" locale
        if (loc != locale::classic()) {
            cerr << "implementation error: send bug report to vendor\n";
            exit(1);
        }
        if (&loc != &locale::classic()) cout << "no surprise: addresses differ\n";
        locale loc2 = locale(loc,*my_locale,locale::numeric);    // the constructor takes a locale, not a pointer
        if (loc == loc2) {
            cout << "my numeric facets are the same as classic()'s numeric facets\n";
            // ...
        }
        // ...
    }
If my_locale has a numeric punctuation facet, my_numpunct<char>, that is different from classic()’s standard numpunct<char>, the resulting locales can be represented like this:
[Figure: loc and loc2 both refer to collate<char> (compare(), hash(), ...); loc refers to numpunct<char> and loc2 refers to my_numpunct<char> (each shown with decimal_point(), curr_symbol(), ...).]
There is no way of modifying a locale. Instead, the locale operations provide ways of making new locales from existing ones. The fact that a locale is immutable after it has been created is essential for run-time efficiency. This allows someone using a locale to call virtual functions of a facet and to cache the values returned. For example, an istream can know what character is used to represent the decimal point and how true is represented, without calling decimal_point() each time it reads a number and truename() each time it reads a bool (§D.4.2). Only a call of imbue() for the stream (§21.6.3) can cause such calls to return a different value.
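The comparison rules of this section can be exercised without any named locale other than the classic one. In this sketch, Apostrophe_sep and the two function names are assumptions made for illustration.

```cpp
#include <locale>

// Hypothetical facet, used only to make a locale that differs from classic().
class Apostrophe_sep : public std::numpunct<char> {
protected:
    char do_thousands_sep() const { return '\''; }
};

// A copy compares equal to the locale it was copied from.
bool copy_equals_original()
{
    std::locale original = std::locale::classic();
    std::locale copy = original;
    return copy == original;
}

// Adding a facet never modifies the original; it yields a new, unnamed locale
// that compares unequal to the original.
bool variant_differs()
{
    std::locale original = std::locale::classic();
    std::locale variant(original, new Apostrophe_sep);
    return variant != original && original == std::locale::classic();
}
```

Both functions return true: the first because of copying, the second because the only way to ‘‘change’’ a locale is to make a different one.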
D.2.3 The global() and the classic() Locales
The notion of a current locale for a program is provided by locale(), which yields a copy of the current locale, and locale::global(x), which sets the current locale to x. The current locale is commonly referred to as the ‘‘global locale,’’ reflecting its probable implementation as a global (or static) object.
The global locale is implicitly used when a stream is initialized. That is, every new stream is imbued (§21.1, §21.6.3) with a copy of locale(). Initially, the global locale is the standard C locale, locale::classic().
The locale::global() static member function allows a programmer to specify a locale to be used as the global locale. A copy of the previous global locale is returned by global(). This allows a user to restore the global locale. For example:
    void f(const locale& my_loc)
    {
        ifstream fin1(some_name);                      // fin1 is imbued with the global locale
        locale old_global = locale::global(my_loc);    // set new global locale; global() returns the old one by value
        ifstream fin2(some_other_name);                // fin2 is imbued with my_loc
        // ...
        locale::global(old_global);                    // restore old global locale
    }
If a locale x has a name string, locale::global(x) also sets the C global locale. This implies that if a C++ program calls a locale-sensitive function from the C standard library, the treatment of locale will be consistent throughout a mixed C and C++ program.
If a locale x does not have a name string, it is undefined whether locale::global(x) affects the C global locale. This implies that a C++ program cannot reliably and portably set the C locale to a locale that wasn’t retrieved from the execution environment. There is no standard way for a C program to set the C++ global locale (except by calling a C++ function to do so). In a mixed C and C++ program, having the C global locale differ from the C++ global locale is error prone.
Setting the global locale does not affect existing I/O streams; those still use the locales that they were imbued with before the global locale was reset. For example, fin1 is unaffected by the manipulation of the global locale that caused fin2 to be imbued with my_loc.
Changing the global locale suffers from the same problems as all other techniques relying on changing global data: It is essentially impossible to know what is affected by a change. It is therefore best to reduce use of global() to a minimum and to localize those changes in a few sections of code that obey a simple strategy for the changes. The ability to imbue (§21.6.3) individual streams with specific locales makes that easier. However, too many explicit uses of locales and facets scattered throughout a program will also become a maintenance problem.
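One simple strategy for localizing such changes is to tie the change to a scope, using the ‘‘resource acquisition is initialization’’ technique. The class Scoped_global_locale below is a hypothetical helper written for this sketch, not a standard library facility; it relies only on the fact that global() returns the previous global locale.

```cpp
#include <locale>

// Hypothetical helper: installs a global locale for the lifetime of the object
// and restores the previous global locale on destruction.
class Scoped_global_locale {
public:
    explicit Scoped_global_locale(const std::locale& loc)
        : old_(std::locale::global(loc)) { }    // remember the locale being replaced
    ~Scoped_global_locale() { std::locale::global(old_); }
private:
    std::locale old_;
    Scoped_global_locale(const Scoped_global_locale&);               // not defined
    Scoped_global_locale& operator=(const Scoped_global_locale&);    // not defined
};
```

A function that needs a different global locale for a few statements can then declare a Scoped_global_locale at the top of a block; the old locale is restored even if the block is left by an exception. Streams created inside the block keep the locale they were imbued with, as described above.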
D.2.4 Comparing Strings
Comparing two strings according to a locale is possibly the most common explicit use of a locale. Consequently, this operation is provided directly by locale so that users don’t have to build their own comparison function from the collate facet (§D.4.1). To be directly useful as a predicate (§18.4.2), this comparison function is defined as locale’s operator()(). For example:
    void f(vector<string>& v, const locale& my_locale)
    {
        sort(v.begin(),v.end());              // sort using < to compare elements
        // ...
        sort(v.begin(),v.end(),my_locale);    // sort according to the rules of my_locale
        // ...
    }
By default, the standard library sort() uses < for the numerical value of the implementation character set to determine collation order (§18.7, §18.6.3.1).
Note that locales compare basic_strings rather than C-style strings.
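A complete version of the second call might look like the following sketch; the function name sort_by_locale() is an assumption made for illustration.

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// Sort v according to the collation rules of loc.
void sort_by_locale(std::vector<std::string>& v, const std::locale& loc)
{
    std::sort(v.begin(), v.end(), loc);    // loc(a,b) is true if a collates before b
}
```

Passing locale::classic() gives the plain character-value ordering, so "Apple" sorts before "apple". Orderings for other locales depend on the named locales available in the execution environment. Note that sort() takes its predicate by value; that is affordable because a locale is cheap to copy.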
D.3 Facets
A facet is an object of a class derived from locale’s member class facet:
    class std::locale::facet {
    protected:
        explicit facet(size_t r = 0);    // "r==0": the locale controls the lifetime of this facet
        virtual ~facet();                // note: protected destructor
    private:
        facet(const facet&);             // not defined
        void operator=(const facet&);    // not defined
        // representation
    };
The copy operations are private and are left undefined to prevent copying (§11.2.2).
The facet class is designed to be a base class and has no public functions. Its constructor is protected to prevent the creation of ‘‘plain facet’’ objects, and its destructor is virtual to ensure proper destruction of derived-class objects.
A facet is intended to be managed through pointers by locales. A 0 argument to the facet constructor means that locale should delete the facet when the last reference to it goes away. Conversely, a nonzero constructor argument ensures that locale never deletes the facet. A nonzero argument is meant for the rare case in which the lifetime of a facet is controlled directly by the programmer rather than indirectly through a locale. For example, we could try to create objects of the standard facet type collate_byname<char> (§D.4.1.1) like this:
    void f(const string& s1, const string& s2)
    {
        // normal case: (default) argument 0 means that locale is responsible for deletion:
        collate<char>* p = new collate_byname<char>("pl");
        locale loc(locale(),p);

        // rare case: argument 1 means that user is responsible for deletion:
        collate<char>* q = new collate_byname<char>("ge",1);

        collate_byname<char> bug1("sw");      // error: cannot destroy local variable
        collate_byname<char> bug2("no",1);    // error: cannot destroy local variable
        // ...
        // q cannot be deleted: collate_byname<char>’s destructor is protected
        // no delete p; locale manages deletion of *p
    }
That is, standard facets are useful when managed by locales, as base classes, and only rarely in other ways.
A _byname facet is a facet from a named locale in the execution environment (§D.2.1).
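The ‘‘as base classes’’ case is what makes a nonzero constructor argument usable in practice: a derived class has an accessible destructor, so an object of it can be a local or static variable. The following is a minimal sketch under that assumption; Underscore_punct and format_with_user_owned_facet() are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet whose lifetime is managed by the user.
// Its implicitly declared destructor is public, so it can be a local variable.
class Underscore_punct : public std::numpunct<char> {
public:
    explicit Underscore_punct(std::size_t refs = 0) : std::numpunct<char>(refs) { }
protected:
    char do_decimal_point() const { return '_'; }
};

std::string format_with_user_owned_facet(double d)
{
    Underscore_punct punct(1);    // nonzero argument: no locale will delete punct
    std::locale loc(std::locale::classic(), &punct);
    std::ostringstream os;
    os.imbue(loc);
    os << d;
    return os.str();              // os and loc are destroyed before punct
}
```

The order of the declarations matters: the facet must outlive every locale and stream that refers to it. That obligation is the reason this style is rare and the default argument 0 is usually the better choice.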
For a facet to be found in a locale by has_facet() and use_facet() (§D.3.1), each kind of facet must have an id:
    class std::locale::id {
    public:
        id();
    private:
        id(const id&);                // not defined
        void operator=(const id&);    // not defined
        // representation
    };
The copy operations are declared private and are left undefined to prevent copying (§11.2.2).
The intended use of id is for the user to define a static member of type id in each class supplying a new facet interface (for example, see §D.4.1). The locale mechanisms use ids to identify facets (§D.2, §D.3.1). In the obvious implementation of a locale, an id is used as an index into a vector of pointers to facets, thereby implementing an efficient map<id,facet*>.
Data used to define a (derived) facet is defined in the derived class rather than in the base class facet itself. This implies that the programmer defining a facet has full control over the data and that arbitrary amounts of data can be used to implement the concept represented by a facet.
Note that all member functions of a user-defined facet should be const members. Generally, a facet is intended to be immutable (§D.2.2).
D.3.1 Accessing Facets in a Locale
The facets of a locale are accessed through the template function use_facet, and we can inquire whether a locale has a specific facet, using the template function has_facet:
    template <class Facet> bool has_facet(const locale&) throw();
    template <class Facet> const Facet& use_facet(const locale&);    // may throw bad_cast
Think of these template functions as doing a lookup in their locale argument for their template parameter Facet. Alternatively, think of use_facet as a kind of explicit type conversion (cast) of a locale to a specific facet. This is feasible because a locale can have only one facet of a given type. For example:
    void f(const locale& my_locale)
    {
        char c = use_facet< numpunct<char> >(my_locale).decimal_point();    // use standard facet
        // ...
        if (has_facet<Encrypt>(my_locale)) {                     // does my_locale contain an Encrypt facet?
            const Encrypt& f = use_facet<Encrypt>(my_locale);    // retrieve Encrypt facet
            const Crypto c = f.get_crypto();                     // use Encrypt facet
            // ...
        }
        // ...
    }
Note that use_facet returns a reference to a const facet, so we cannot bind the result of use_facet to a non-const reference. This makes sense because a facet is meant to be immutable and to have only const members.
If we try use_facet<X>(loc) and loc doesn’t have an X facet, use_facet() throws bad_cast (§14.10). The standard facets are guaranteed to be available for all locales (§D.4), so we don’t need to use has_facet for standard facets. For standard facets, use_facet will not throw bad_cast.
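Since Encrypt is not defined in this appendix, the two lookups can be sketched with a small stand-in. Unit_system, prefers_metric(), and lookup_throws() are hypothetical names introduced for this illustration; the default returned when the facet is missing is likewise an assumption of the sketch.

```cpp
#include <locale>
#include <typeinfo>

// Hypothetical user-defined facet with its own interface, and therefore its own id.
class Unit_system : public std::locale::facet {
public:
    explicit Unit_system(bool metric) : metric_(metric) { }
    bool is_metric() const { return metric_; }
    static std::locale::id id;    // identifies this facet interface
private:
    bool metric_;
};

std::locale::id Unit_system::id;

// Use the facet if loc has one; otherwise fall back to a default.
bool prefers_metric(const std::locale& loc)
{
    if (std::has_facet<Unit_system>(loc))
        return std::use_facet<Unit_system>(loc).is_metric();
    return true;    // assumed default for this sketch
}

// Report whether use_facet throws bad_cast for loc.
bool lookup_throws(const std::locale& loc)
{
    try {
        (void)std::use_facet<Unit_system>(loc);
        return false;
    }
    catch (const std::bad_cast&) {
        return true;
    }
}
```

For a non-standard facet, testing with has_facet first is usually preferable to catching bad_cast, because the absence of such a facet is an ordinary situation and not an error.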
How might use_facet and has_facet be implemented? Remember that we can think of a locale as a map<id,facet*> (§D.2). Given a facet type as the Facet template argument, the implementation of has_facet or use_facet can refer to Facet::id and use that to find the corresponding facet. A very naive implementation of has_facet and use_facet might look like this:
    // pseudoimplementation: imagine that a locale has a map<id,facet*> called facet_map
    template <class Facet> bool has_facet(const locale& loc) throw()
    {
        const locale::facet* f = loc.facet_map[Facet::id];
        return f ? true : false;
    }
    template <class Facet> const Facet& use_facet(const locale& loc)
    {
        const locale::facet* f = loc.facet_map[Facet::id];
        if (f) return static_cast<const Facet&>(*f);
        throw bad_cast();
    }
Another way of looking at the facet::id mechanism is as an implementation of a form of compile-time polymorphism. A dynamic_cast can be used to get very similar results to what use_facet produces. However, the specialized use_facet can be implemented more efficiently than the more general dynamic_cast.
An id really identifies an interface and a behavior rather than a class. That is, if two facet classes have exactly the same interface and implement the same semantics (as far as a locale is concerned), they should be identified by the same id. For example, collate<char> and collate_byname<char> are interchangeable in a locale, so both are identified by collate<char>::id (§D.4.1).
If we define a facet with a new interface, such as Encrypt in f(), we must define a corresponding id to identify it (see §D.3.2 and §D.4.1).
D.3.2 A Simple User-Defined Facet
The standard library provides standard facets for the most critical areas of cultural differences, such as character sets and I/O of numbers. To examine the facet mechanism in isolation from the complexities of widely used types and the efficiency concerns that accompany them, let me first present a facet for a trivial user-defined type:
    enum Season { spring, summer, fall, winter };
This was just about the simplest user-defined type I could think of. The style of I/O outlined here can be used with little variation for most simple user-defined types.
    class Season_io : public locale::facet {
    public:
        Season_io(int i = 0) : locale::facet(i) { }
        ~Season_io() { }    // to make it possible to destroy Season_io objects (§D.3)

        virtual const string& to_str(Season x) const = 0;    // string representation of x

        // place Season corresponding to s in x:
        virtual bool from_str(const string& s, Season& x) const = 0;

        static locale::id id;    // facet identifier object (§D.2, §D.3, §D.3.1)
    };

    locale::id Season_io::id;    // define the identifier object
For simplicity, this facet is limited to representations using char.
The Season_io class provides a general and abstract interface for all Season_io facets. To
define the I/O representation of a Season for a particular locale, we derive a class from Season_io,
defining to_str() and from_str() appropriately.
Output of a Season is easy. If the stream has a Season_io facet, we can use that to convert the
value into a string. If not, we can output the int value of the Season:
    ostream& operator<<(ostream& s, Season x)
    {
        const locale& loc = s.getloc();    // extract the stream's locale (§21.7.1)
        if (has_facet<Season_io>(loc)) return s << use_facet<Season_io>(loc).to_str(x);
        return s << int(x);
    }
Note that this << operator is implemented by invoking << for other types. This way, we benefit
from the simplicity of using << compared to accessing the ostream's stream buffers directly, from
the locale sensitivity of those << operators, and from the error handling provided for those <<
operators. Standard facets tend to operate directly on stream buffers (§D.4.2.2, §D.4.2.3) for maximum efficiency and flexibility, but for many simple user-defined types, there is no need to drop to
the streambuf level of abstraction.
As is typical, input is a bit more complicated than output:
    istream& operator>>(istream& s, Season& x)
    {
        const locale& loc = s.getloc();    // extract the stream's locale (§21.7.1)
        if (has_facet<Season_io>(loc)) {    // read alphabetic representation
            const Season_io& f = use_facet<Season_io>(loc);
            string buf;
            if (!(s>>buf && f.from_str(buf,x))) s.setstate(ios_base::failbit);
            return s;
        }
        int i;    // read numeric representation
        s >> i;
        x = Season(i);
        return s;
    }
The error handling is simple and follows the error-handling style for built-in types. That is, if the
input string didn't represent a Season in the chosen locale, the stream is put into the fail state. If
exceptions are enabled, this implies that an ios_base::failure exception is thrown (§21.3.6).
Here is a trivial test program:
    int main()    // trivial test
    {
        Season x;
        // Use default locale (no Season_io facet) implies integer I/O:
        cin >> x;
        cout << x << endl;
        locale loc(locale(),new US_season_io);
        cout.imbue(loc);    // Use locale with Season_io facet
        cin.imbue(loc);    // Use locale with Season_io facet
        cin >> x;
        cout << x << endl;
    }
Given the input
    2
    summer
this program responded:
    2
    summer
To get this, we must define US_season_io to define the string representation of the seasons and
override the Season_io functions that convert between string representations and the enumerators:
    class US_season_io : public Season_io {
        static const string seasons[];
    public:
        const string& to_str(Season) const;
        bool from_str(const string&, Season&) const;
        // note: no US_season_io::id
    };
    const string US_season_io::seasons[] = { "spring", "summer", "fall", "winter" };
    const string& US_season_io::to_str(Season x) const
    {
        if (x<spring || winter<x) {
            static const string ss = "no-such-season";
            return ss;
        }
        return seasons[x];
    }
    bool US_season_io::from_str(const string& s, Season& x) const
    {
        const string* beg = &seasons[spring];
        const string* end = &seasons[winter]+1;
        const string* p = find(beg,end,s);    // §3.8.1, §18.5.2
        if (p==end) return false;
        x = Season(p-beg);
        return true;
    }
Note that because US_season_io is simply an implementation of the Season_io interface, I did not
define an id for US_season_io. In fact, if we want US_season_io to be used as a Season_io, we
must not give US_season_io its own id. Operations on locales, such as has_facet (§D.3.1), rely
on facets implementing the same concepts being identified by the same id (§D.3).
The only interesting implementation question was what to do if asked to output an invalid Season. Naturally, that shouldn't happen. However, it is not uncommon to find an invalid value for a
simple user-defined type, so it is realistic to take that possibility into account. I could have thrown
an exception, but when dealing with simple output intended for humans to read, it is often helpful
to produce an "out of range" representation for an out-of-range value. Note that for input, the
error-handling policy is left to the >> operator, whereas for output, the facet function to_str()
implements an error-handling policy. This was done to illustrate the design alternatives. In a "production design," the facet functions would either implement error handling for both input and output or just report errors for >> and << to handle.
This Season_io design relied on derived classes to supply the locale-specific strings. An alternative design would have Season_io itself retrieve those strings from a locale-specific repository
(see §D.4.7). The possibility of having a single Season_io class to which the season strings are
passed as constructor arguments is left as an exercise (§D.6[2]).
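One possible shape for the constructor-argument alternative is sketched below. It is an illustration rather than the exercise's intended solution: the class name Season_names, the helper show(), and the choice of a vector to hold the strings are all assumptions made for this sketch. Because the class presents a new interface to locale, it defines its own id.

```cpp
#include <algorithm>
#include <cstddef>
#include <locale>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

enum Season { spring, summer, fall, winter };

// Hypothetical facet (not from the text): one concrete class whose
// season names are supplied by the caller at construction time.
class Season_names : public std::locale::facet {
public:
    explicit Season_names(const std::vector<std::string>& names, std::size_t refs = 0)
        : std::locale::facet(refs), names_(names) { }

    // string representation of x, or a marker for an out-of-range value
    const std::string& to_str(Season x) const
    {
        static const std::string bad = "no-such-season";
        if (x < spring || names_.size() <= static_cast<std::size_t>(x)) return bad;
        return names_[x];
    }

    // place the Season named by s in x; return false if s names no season
    bool from_str(const std::string& s, Season& x) const
    {
        std::vector<std::string>::const_iterator p =
            std::find(names_.begin(), names_.end(), s);
        if (p == names_.end()) return false;
        x = Season(p - names_.begin());
        return true;
    }

    static std::locale::id id;    // new interface, so it has its own id
private:
    std::vector<std::string> names_;
};

std::locale::id Season_names::id;

std::ostream& operator<<(std::ostream& s, Season x)
{
    const std::locale loc = s.getloc();
    if (std::has_facet<Season_names>(loc))
        return s << std::use_facet<Season_names>(loc).to_str(x);
    return s << int(x);    // no facet: fall back to the integer value
}

// Helper for trying the facet out: format x using the given locale.
std::string show(Season x, const std::locale& loc)
{
    std::ostringstream os;
    os.imbue(loc);
    os << x;
    return os.str();
}
```

A locale carrying the facet can be built with locale(locale::classic(), new Season_names(names)). The trade-off against the derived-class design is that the strings become run-time data, so a mistake in their number or order is not caught by the compiler.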
D.3.3 Uses of Locales and Facets
The primary use of locales within the standard library is in I/O streams. However, the locale
mechanism is a general and extensible mechanism for representing culture-sensitive information.
The messages facet (§D.4.7) is an example of a facet that has nothing to do with I/O streams.
Extensions to the iostreams library and even I/O facilities that are not based on streams might take
advantage of locales. Also, a user may use locales as a convenient way of organizing arbitrary
culture-sensitive information.
Because of the generality of the locale/facet mechanism, the possibilities for user-defined
facets are unlimited. Plausible candidates for representation as facets are dates, time zones, phone
numbers, social security numbers (personal identification numbers), product codes, temperatures,
general (unit,value) pairs, postal codes (zip codes), clothing sizes, and ISBN numbers.
As with every other powerful mechanism, facets should be used with care. That something can
be represented as a facet doesn't mean that it is best represented that way. The key issues to consider when selecting a representation for cultural dependencies are – as ever – how the various decisions affect the difficulty of writing code, the ease of reading the resulting code, the maintainability
of the resulting program, and the efficiency in time and space of the resulting I/O operations.
D.4 Standard Facets
In <locale>, the standard library provides these facets for the classic() locale:
_____________________________________________________________________
            Standard Facets (in the classic() locale)
_____________________________________________________________________
         Category    Purpose                     Facets
_____________________________________________________________________
§D.4.1   collate     String comparison           collate<Ch>
_____________________________________________________________________
§D.4.2   numeric     Numeric I/O                 numpunct<Ch>
                                                 num_get<Ch>
                                                 num_put<Ch>
_____________________________________________________________________
§D.4.3   monetary    Money I/O                   moneypunct<Ch>
                                                 moneypunct<Ch,true>
                                                 money_get<Ch>
                                                 money_put<Ch>
_____________________________________________________________________
§D.4.4   time        Time I/O                    time_get<Ch>
                                                 time_put<Ch>
_____________________________________________________________________
§D.4.5   ctype       Character classification    ctype<Ch>
                                                 codecvt<Ch,char,mbstate_t>
_____________________________________________________________________
§D.4.7   messages    Message retrieval           messages<Ch>
_____________________________________________________________________
In this table, Ch is shorthand for char or wchar_t. A user who needs standard I/O to deal with
another character type X must provide suitable versions of facets for X. For example,
codecvt<X,char,mbstate_t> (§D.4.6) might be needed to control conversions between X and
char. The mbstate_t type is used to represent the shift states of a multibyte character representation (§D.4.6); mbstate_t is defined in <cwchar> and <wchar.h>. The equivalent to mbstate_t for
an arbitrary character type X is char_traits<X>::state_type.
In addition, the standard library provides these facets in <locale>:
_____________________________________________________________________________
                              Standard Facets
_____________________________________________________________________________
         Category    Purpose                     Facets
_____________________________________________________________________________
§D.4.1   collate     String comparison           collate_byname<Ch>
_____________________________________________________________________________
§D.4.2   numeric     Numeric I/O                 numpunct_byname<Ch>
                                                 num_get<C,In>
                                                 num_put<C,Out>
_____________________________________________________________________________
§D.4.3   monetary    Money I/O                   moneypunct_byname<Ch,International>
                                                 money_get<C,In>
                                                 money_put<C,Out>
_____________________________________________________________________________
§D.4.4   time        Time I/O                    time_put_byname<Ch,Out>
_____________________________________________________________________________
§D.4.5   ctype       Character classification    ctype_byname<Ch>
_____________________________________________________________________________
§D.4.7   messages    Message retrieval           messages_byname<Ch>
_____________________________________________________________________________
When instantiating a facet from this table, Ch can be char or wchar_t; C can be any character type
(§20.1). International can be true or false; true means that a four-character "international"
representation of a currency symbol is used (§D.4.3.1). The mbstate_t type is used to represent the
shift states of a multibyte character representation (§D.4.6); mbstate_t is defined in <cwchar> and
<wchar.h>.
In and Out are input iterators and output iterators, respectively (§19.1, §19.2.1). Providing the
_put and _get facets with these template arguments allows a programmer to provide facets that
access nonstandard buffers (§D.4.2.2). Buffers associated with iostreams are stream buffers, so the
iterators provided for those are ostreambuf_iterators (§19.2.6.1, §D.4.2.2). Consequently, the
function failed() (§19.2.6.1) is available for error handling.
An F_byname facet is derived from the facet F. F_byname provides the identical interface to
F, except that it adds a constructor taking a string argument naming a locale (see §D.4.1). The
F_byname(name) provides the appropriate semantics for F defined in locale(name). The idea is
to pick a version of a standard facet from a named locale (§D.2.1) in the program's execution environment. For example:
    void f(vector<string>& v, const locale& loc)
    {
        locale d1(loc,new collate_byname<char>("da"));    // use Danish string comparison
        locale dk(d1,new ctype_byname<char>("da"));    // use Danish character classification
        sort(v.begin(),v.end(),dk);
        // ...
    }
This new dk locale will use Danish-style strings but will retain the default conventions for numbers.
Note that because facet's second argument is by default 0, the locale manages the lifetime of a
facet created using operator new (§D.3).
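Because a _byname constructor depends on what the execution environment offers, a name such as "da" may simply be unavailable on a given system. The sketch below wraps the construction in a fallback. The helper name with_collation_from is hypothetical, and the sketch assumes that an unknown name is reported by a std::runtime_error, as is the case for the locale constructors that take names and as common implementations do for _byname facets.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: return a copy of base that uses the string-comparison
// rules of the locale called name, or base itself if the execution
// environment has no locale of that name.
std::locale with_collation_from(const std::locale& base, const char* name)
{
    try {
        // the new locale takes over the lifetime of the facet (refs defaults to 0)
        return std::locale(base, new std::collate_byname<char>(name));
    }
    catch (const std::runtime_error&) {
        return base;    // assumption: unknown names are reported this way
    }
}
```

A caller can then write with_collation_from(loc,"da") and still get a usable locale when Danish collation is not installed.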
Like the locale constructors that take string arguments, the _byname constructors access the
program's execution environment. This implies that they are very slow compared to constructors
that do not need to consult the environment. It is almost always faster to construct a locale and then
to access its facets than it is to use _byname facets in many places in a program. Thus, reading a
facet from the environment once and then using the copy in main memory repeatedly is usually a
good idea. For example:
    locale dk("da");    // read the Danish locale (incl. all of its facets) once
    // then use the dk locale and its facets as needed
    void f(vector<string>& v, const locale& loc)
    {
        const collate<char>& col = use_facet< collate<char> >(dk);    // Danish string comparison
        const ctype<char>& ctyp = use_facet< ctype<char> >(dk);    // Danish character classification
        // ... use col and ctyp directly as needed ...
        locale d1 = loc.combine< collate<char> >(dk);    // use Danish string comparison
        locale d2 = d1.combine< ctype<char> >(dk);    // use Danish character classification and Danish string comparison
        sort(v.begin(),v.end(),d2);
        // ...
    }
The notion of categories gives a simpler way of manipulating standard facets in locales. For example, given the dk locale, we can construct a locale that reads and compares strings according to the
rules of Danish (that give three extra vowels compared to English) but that retains the syntax of
numbers used in C++:
    locale dk_us(locale::classic(),dk,locale::collate|locale::ctype);    // Danish letters, American numbers
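The category-based constructor can be tried without relying on any named locale. The following sketch assumes a hypothetical punctuation facet, Space_punct, placed in a helper locale; only the numeric category is then copied into a locale that is otherwise classic(). The names Space_punct, classic_with_spaced_numbers, and format_long are inventions for this illustration.

```cpp
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical punctuation facet, used only to make the effect visible.
class Space_punct : public std::numpunct<char> {
public:
    explicit Space_punct(std::size_t r = 0) : std::numpunct<char>(r) { }
protected:
    char do_thousands_sep() const { return ' '; }          // space between groups
    std::string do_grouping() const { return "\003"; }     // 3-digit groups
};

// classic() conventions for everything except the numeric category,
// which is taken from a locale carrying Space_punct.
std::locale classic_with_spaced_numbers()
{
    std::locale spaced(std::locale::classic(), new Space_punct);
    return std::locale(std::locale::classic(), spaced, std::locale::numeric);
}

// Format n using the conventions of loc.
std::string format_long(long n, const std::locale& loc)
{
    std::ostringstream os;
    os.imbue(loc);
    os << n;
    return os.str();
}
```

Selecting a whole category copies every standard facet in it, so facets that depend on each other, such as num_put and numpunct, travel together.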
The presentations of individual standard facets contain more examples of facet use. In particular,
the discussion of collate (§D.4.1) brings out many of the common structural aspects of facets.
Note that the standard facets often depend on each other. For example, num_put depends on
numpunct. Only if you have a detailed knowledge of individual facets can you successfully mix
and match facets or add new versions of the standard facets. In other words, beyond the simple
operations mentioned in §21.7, the locale mechanisms are not meant to be directly used by novices.
The design of an individual facet is often messy. The reason is partially that facets have to
reflect messy cultural conventions outside the control of the library designer, and partially that the
C++ standard library facilities have to remain largely compatible with what is offered by the C standard library and various platform-specific standards. For example, POSIX provides locale facilities
that it would be unwise for a library designer to ignore.
On the other hand, the framework provided by locales and facets is very general and flexible. A
facet can be designed to hold any data, and the facet’s operations can provide any desired operation
based on that data. If the behavior of a new facet isn’t overconstrained by convention, its design
can be simple and clean (§D.3.2).
D.4.1 String Comparison
The standard collate facet provides ways of comparing arrays of characters of type Ch:
    template <class Ch>
    class std::collate : public locale::facet {
    public:
        typedef Ch char_type;
        typedef basic_string<Ch> string_type;
        explicit collate(size_t r = 0);
        int compare(const Ch* b, const Ch* e, const Ch* b2, const Ch* e2) const
            { return do_compare(b,e,b2,e2); }
        long hash(const Ch* b, const Ch* e) const { return do_hash(b,e); }
        string_type transform(const Ch* b, const Ch* e) const { return do_transform(b,e); }
        static locale::id id;    // facet identifier object (§D.2, §D.3, §D.3.1)
    protected:
        ~collate();    // note: protected destructor
        virtual int do_compare(const Ch* b, const Ch* e, const Ch* b2, const Ch* e2) const;
        virtual string_type do_transform(const Ch* b, const Ch* e) const;
        virtual long do_hash(const Ch* b, const Ch* e) const;
    };
Like all facets, collate is publicly derived from facet and provides a constructor that takes an
argument that tells whether class locale controls the lifetime of the facet (§D.3).
Note that the destructor is protected. The collate facet isn't meant to be used directly. Rather,
it is intended as a base class for all (derived) collation classes and for locale to manage (§D.3).
Application programmers, implementation providers, and library vendors will write the string comparison facets to be used through the interface provided by collate.
The compare() function does the basic string comparison according to the rules defined for a
particular collate; it returns 1 if the first string is lexicographically greater than the second, 0 if the
strings are identical, and -1 if the second string is greater than the first. For example:
    void f(const string& s1, const string& s2, collate<char>& cmp)
    {
        const char* cs1 = s1.data();    // because compare() operates on char[]s
        const char* cs2 = s2.data();
        switch (cmp.compare(cs1,cs1+s1.size(),cs2,cs2+s2.size())) {
        case 0:    // identical strings according to cmp
            // ...
            break;
        case -1:    // s1 < s2
            // ...
            break;
        case 1:    // s1 > s2
            // ...
            break;
        }
    }
Note that the collate member functions compare arrays of Ch rather than basic_strings or zero-terminated C-style strings. In particular, a Ch with the numeric value 0 is treated as an ordinary
character rather than as a terminator. Also, compare() differs from strcmp(), returning exactly
the values -1, 0, and 1 rather than simply 0 and (arbitrary) negative and positive values (§20.4.1).
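Both properties can be observed with the collate<char> facet of the classic() locale. In this small sketch, the wrapper name collate_compare is hypothetical; it only forwards to compare() with begin and end pointers.

```cpp
#include <locale>
#include <string>

// Compare two strings using the collate<char> facet of loc.
// The strings are passed as [begin,end) character ranges, so an
// embedded zero character is compared like any other character.
int collate_compare(const std::string& a, const std::string& b, const std::locale& loc)
{
    const std::collate<char>& c = std::use_facet<std::collate<char> >(loc);
    return c.compare(a.data(), a.data() + a.size(),
                     b.data(), b.data() + b.size());
}
```

For example, with the classic() locale, collate_compare(string("a\0b",3), string("a\0c",3), loc) looks past the zero and reports -1, whereas strcmp() would stop at the zero and report equality.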
The standard library string is not locale sensitive. That is, it compares strings according to the
rules of the implementation's character set (§C.2). Furthermore, the standard string does not provide a direct way of specifying comparison criteria (Chapter 20). To do a locale-sensitive comparison, we can use a collate's compare(). Notationally, it can be more convenient to use collate's
compare() indirectly through a locale's operator() (§D.2.4). For example:
    void f(const string& s1, const string& s2, const char* n)
    {
        bool b = s1 == s2;    // compare using implementation's character set values
        const char* cs1 = s1.data();    // because compare() operates on char[]s
        const char* cs2 = s2.data();
        typedef collate<char> Col;
        locale glob_loc;    // copy of the current global locale
        const Col& glob = use_facet<Col>(glob_loc);
        int i0 = glob.compare(cs1,cs1+s1.size(),cs2,cs2+s2.size());
        locale my_loc("");    // my preferred locale; keeps its facets alive
        const Col& my_coll = use_facet<Col>(my_loc);
        int i1 = my_coll.compare(cs1,cs1+s1.size(),cs2,cs2+s2.size());
        locale n_loc(n);    // locale named n; keeps its facets alive
        const Col& coll = use_facet<Col>(n_loc);
        int i2 = coll.compare(cs1,cs1+s1.size(),cs2,cs2+s2.size());
        bool b3 = locale()(s1,s2);    // compare using the current global locale
        bool b4 = locale("")(s1,s2);    // compare using my preferred locale
        bool b5 = locale(n)(s1,s2);    // compare using the locale named n
        // ...
    }
Here, b3==(i0<0), b4==(i1<0), and b5==(i2<0), because a locale's operator() reports whether its first argument collates before its second. It is not difficult to imagine cases in which i0, i1, and i2 differ.
Consider this sequence of words from a German dictionary:
    Dialekt, Diät, dich, dichten, Dichtung
According to convention, nouns (only) are capitalized, but the ordering is not case sensitive.
A case-sensitive German sort would place all words starting with D before d:
    Dialekt, Diät, Dichtung, dich, dichten
The ä (umlaut a) is treated as "a kind of a," so it comes before c. However, in most common
character sets, the numeric value of ä is larger than the numeric value of c. Consequently,
int('c')<int('ä'), and the simple default sort based on numeric values gives:
    Dialekt, Dichtung, Diät, dich, dichten
Writing a compare function that orders this sequence correctly according to the dictionary is an
interesting exercise (§D.6[3]).
The hash() function calculates a hash value (§17.6.2.3). Obviously, this can be useful for
building hash tables.
The transform() function produces a string that, when compared to other strings, gives the
same result as would a comparison to the argument string. The purpose of transform() is to allow
optimization of code in which one string is compared to many others. This is useful when implementing a search for one or more strings among a set of strings.
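As an illustration of that optimization, the sketch below sorts a vector by computing one transformed key per string and then comparing keys with the ordinary string comparison instead of calling compare() for every pair. The names Keyed, sort_by_key, and collate_hash are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <locale>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> Keyed;    // (sort key, original string)

// Sort v according to loc: transform each string once, then let
// std::sort compare the keys with the ordinary string <.
void sort_by_key(std::vector<std::string>& v, const std::locale& loc)
{
    const std::collate<char>& c = std::use_facet<std::collate<char> >(loc);
    std::vector<Keyed> keyed;
    for (std::size_t i = 0; i < v.size(); ++i)
        keyed.push_back(Keyed(c.transform(v[i].data(), v[i].data() + v[i].size()), v[i]));
    std::sort(keyed.begin(), keyed.end());    // pairs compare by key first
    for (std::size_t i = 0; i < v.size(); ++i) v[i] = keyed[i].second;
}

// Hash value of s according to loc; strings that compare equal hash alike.
long collate_hash(const std::string& s, const std::locale& loc)
{
    const std::collate<char>& c = std::use_facet<std::collate<char> >(loc);
    return c.hash(s.data(), s.data() + s.size());
}
```

Whether this beats repeated compare() calls depends on the locale and the implementation; the keys cost memory, so it pays off mainly when each string takes part in many comparisons.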
The public compare(), hash(), and transform() functions are implemented by calls to the
protected virtual functions do_compare(), do_hash(), and do_transform(), respectively. These
"do_ functions" can be overridden in derived classes. This two-function strategy allows the
library implementer who writes the non-virtual functions to provide some common functionality for
all calls independently of what the user-supplied do_ function might do.
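A minimal sketch of overriding a do_ function follows. The facet Nocase_collate is hypothetical; it compares characters without regard to case, using the classic() locale's tolower(), and makes no attempt to handle umlauts or other locale-specific letters. Because it defines no id of its own, a locale stores it in the slot of collate<char>, so a locale's operator() reaches it through the public compare().

```cpp
#include <algorithm>
#include <cstddef>
#include <locale>
#include <string>
#include <vector>

// Hypothetical facet: string comparison that ignores case.
class Nocase_collate : public std::collate<char> {
public:
    explicit Nocase_collate(std::size_t r = 0) : std::collate<char>(r) { }
protected:
    int do_compare(const char* b, const char* e, const char* b2, const char* e2) const
    {
        const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(std::locale::classic());
        for (; b != e && b2 != e2; ++b, ++b2) {
            unsigned char c1 = ct.tolower(*b);
            unsigned char c2 = ct.tolower(*b2);
            if (c1 < c2) return -1;
            if (c2 < c1) return 1;
        }
        if (b2 != e2) return -1;    // first sequence is a proper prefix of the second
        if (b != e) return 1;       // second sequence is a proper prefix of the first
        return 0;
    }
};

// Sort v without regard to case.
void sort_nocase(std::vector<std::string>& v)
{
    std::locale loc(std::locale::classic(), new Nocase_collate);
    std::sort(v.begin(), v.end(), loc);    // uses loc's operator()
}
```

A complete facet would also override do_transform() and do_hash() so that they agree with do_compare(); this sketch leaves them as inherited.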
The use of virtual functions preserves the polymorphic nature of the facet but could be costly.
To avoid excess function calls, a locale can determine the exact facet used and cache any values it
might need for efficient execution (§D.2.2).
The static member id of type locale::id is used to identify a facet (§D.3). The standard functions has_facet and use_facet depend on the correspondence between ids and facets (§D.3.1).
Two facets providing exactly the same interface and semantics to locale should have the same id.
For example, collate<char> and collate_byname<char> (§D.4.1.1) have the same id. Conversely, two facets performing different functions (as far as locale is concerned) must have different ids. For example, numpunct<char> and num_put<char> have different ids (§D.4.2).
D.4.1.1 Named Collate
A collate_byname is a facet that provides a version of collate for a particular locale named by a
constructor string argument:
    template <class Ch>
    class std::collate_byname : public collate<Ch> {
    public:
        typedef basic_string<Ch> string_type;
        explicit collate_byname(const char*, size_t r = 0);    // construct from named locale
        // note: no id and no new functions
    protected:
        ~collate_byname();    // note: protected destructor
        // override collate<Ch>'s virtual functions:
        int do_compare(const Ch* b, const Ch* e, const Ch* b2, const Ch* e2) const;
        string_type do_transform(const Ch* b, const Ch* e) const;
        long do_hash(const Ch* b, const Ch* e) const;
    };
Thus, a collate_byname can be used to pick out a collate from a locale named in the program's
execution environment (§D.4). One obvious way of storing facets in an execution environment
would be as data in a file. A less flexible alternative would be to represent a facet as program text
and data in a _byname facet.
The collate_byname<char> class is an example of a facet that doesn't have its own id (§D.3).
In a locale, collate_byname<Ch> is interchangeable with collate<Ch>. A collate and a
collate_byname for the same locale differ only in the extra constructor offered by the
collate_byname and in the semantics provided by the collate_byname.
Note that the _byname destructor is protected. This implies that you cannot have a _byname
facet as a local variable. For example:
    void f()
    {
        collate_byname<char> my_coll("");    // error: cannot destroy my_coll
        // ...
    }
This reflects the view that using locales and facets is something that is best done at a fairly high
level in a program to affect large bodies of code. An example is setting the global locale (§D.2.3)
or imbuing a stream (§21.6.3, §D.1). If necessary, we could derive a class with a public destructor
from a _byname class and create local variables of that class.
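Such a derived class might look like the sketch below. The names Local_collate and compare_in are hypothetical. The sketch passes 1 as the second constructor argument so that no locale will ever try to delete the object, which is an assumption about how a stack-allocated facet is best set up rather than something stated here.

```cpp
#include <locale>
#include <string>

// Hypothetical wrapper: a collate_byname<char> with a public destructor,
// so that objects of this type can be local variables.
class Local_collate : public std::collate_byname<char> {
public:
    explicit Local_collate(const char* name)
        : std::collate_byname<char>(name, 1) { }    // 1: lifetime is not managed by a locale
    ~Local_collate() { }                            // public destructor
};

// Compare a and b by the rules of the locale called locale_name.
int compare_in(const char* locale_name, const std::string& a, const std::string& b)
{
    Local_collate coll(locale_name);    // fine: Local_collate can be destroyed
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size());
}
```

Constructing the facet on every call consults the execution environment each time, so this style suits occasional use only.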
D.4.2 Numeric Input and Output
Numeric output is done by a num_put facet writing into a stream buffer (§21.6.4). Conversely,
numeric input is done by a num_get facet reading from a stream buffer. The format used by
num_put and num_get is defined by a "numerical punctuation" facet, numpunct.
D.4.2.1 Numeric Punctuation
The numpunct facet defines the I/O format of built-in types, such as bool, int, and double:
    template <class Ch>
    class std::numpunct : public locale::facet {
    public:
        typedef Ch char_type;
        typedef basic_string<Ch> string_type;
        explicit numpunct(size_t r = 0);
        Ch decimal_point() const;    // '.' in classic()
        Ch thousands_sep() const;    // ',' in classic()
        string grouping() const;    // "" in classic(), meaning no grouping
        string_type truename() const;    // "true" in classic()
        string_type falsename() const;    // "false" in classic()
        static locale::id id;    // facet identifier object (§D.2, §D.3, §D.3.1)
    protected:
        ~numpunct();
        // virtual "do_" functions for public functions (see §D.4.1)
    };
The characters of the string returned by grouping() are read as a sequence of small integer values.
Each number specifies a number of digits for a group. Character 0 specifies the rightmost group
(the least-significant digits), character 1 the group to the left of that, etc. Thus, "\004\002\003"
describes a number, such as 123-45-6789 (provided you use '-' as the separation character). If
necessary, the last number in a grouping pattern is used repeatedly, so "\003" is equivalent to
"\003\003\003". As the name of the separation character, thousands_sep(), indicates, the most
common use of grouping is to make large integers more readable. The grouping() and
thousands_sep() functions define a format for both input and output of integers. They are not
used for standard floating-point number I/O. Thus, we cannot get 1234567.89 printed as
1,234,567.89 simply by defining grouping() and thousands_sep().
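The "\004\002\003" pattern can be checked with a small facet. In this sketch, the class Id_punct and the helper group() are hypothetical names; the facet uses '-' as the separation character and writes into a string stream so that the result can be inspected.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: groups of 4, 2, and then 3 digits, counted from the right.
class Id_punct : public std::numpunct<char> {
protected:
    char do_thousands_sep() const { return '-'; }
    std::string do_grouping() const { return "\004\002\003"; }
};

// Format n with Id_punct's grouping.
std::string group(long n)
{
    std::ostringstream os;
    os.imbue(std::locale(std::locale::classic(), new Id_punct));
    os << n;
    return os.str();
}
```

Here group(123456789) gives 123-45-6789. For a number with more digits, the final group size, 3, is reused for the remaining digits on the left.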
We can define a new punctuation style by deriving from numpunct. For example, I could
define facet My_punct to write integer values using spaces to group the digits by threes and
floating-point values, using a European-style comma as the "decimal point:"
    class My_punct : public std::numpunct<char> {
    public:
        typedef char char_type;
        typedef string string_type;
        explicit My_punct(size_t r = 0) : std::numpunct<char>(r) { }
    protected:
        char do_decimal_point() const { return ','; }    // comma
        char do_thousands_sep() const { return ' '; }    // space
        string do_grouping() const { return "\003"; }    // 3-digit groups
    };
    void f()
    {
        cout << "style A: " << 12345678 << " *** " << 1234567.8 << '\n';
        locale loc(locale(),new My_punct);
        cout.imbue(loc);
        cout << "style B: " << 12345678 << " *** " << 1234567.8 << '\n';
    }
This produced:
style A: 12345678 *** 1.23457e+06
style B: 12 345 678 *** 1,23457e+06
Note that imbue() stores a copy of its argument in its stream. Consequently, a stream can rely on
an imbued locale even after the original copy of that locale has been destroyed. If an iostream has
its boolalpha flag set (§21.2.2, §21.4.1), the strings returned by truename() and falsename() are
used to represent true and false, respectively; otherwise, 1 and 0 are used.
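The following self-contained sketch shows a punctuation facet used with a string stream, including the boolalpha case. The names Space_punct and show, and the strings "yes" and "no", are illustrative assumptions. Note that the locale object passed to imbue() is a temporary; the stream keeps its own copy.

```cpp
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>

// Sketch of a punctuation facet in the spirit of My_punct.
class Space_punct : public std::numpunct<char> {
public:
    explicit Space_punct(std::size_t r = 0) : std::numpunct<char>(r) { }
protected:
    char do_decimal_point() const { return ','; }       // comma
    char do_thousands_sep() const { return ' '; }       // space
    std::string do_grouping() const { return "\003"; }  // 3-digit groups
    std::string do_truename() const { return "yes"; }   // used when boolalpha is set
    std::string do_falsename() const { return "no"; }
};

// Format an integer and a bool using the facet above.
std::string show(long n, bool b)
{
    std::ostringstream os;
    // The temporary locale is destroyed after this statement; os holds a copy.
    os.imbue(std::locale(std::locale::classic(), new Space_punct));
    os << n << ' ' << std::boolalpha << b;
    return os.str();
}
```

For example, show(12345678,true) gives the string "12 345 678 yes".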
A _byname version (§D.4, §D.4.1) of numpunct is provided:
template <class Ch>
class std::numpunct_byname : public numpunct<Ch> { /* ... */ };
D.4.2.2 Numeric Output
When writing to a stream buffer (§21.6.4), an ostream relies on the num_put facet:
template <class Ch, class Out = ostreambuf_iterator<Ch> >
class std::num_put : public locale::facet {
public:
    typedef Ch char_type;
    typedef Out iter_type;
    explicit num_put(size_t r = 0);
    // put value "v" to buffer position "b" in stream "s":
    Out put(Out b, ios_base& s, Ch fill, bool v) const;
    Out put(Out b, ios_base& s, Ch fill, long v) const;
    Out put(Out b, ios_base& s, Ch fill, unsigned long v) const;
    Out put(Out b, ios_base& s, Ch fill, double v) const;
    Out put(Out b, ios_base& s, Ch fill, long double v) const;
    Out put(Out b, ios_base& s, Ch fill, const void* v) const;
    static locale::id id; // facet identifier object (§D.2, §D.3, §D.3.1)
protected:
    ~num_put();
    // virtual ‘‘do_’’ functions for public functions (see §D.4.1)
};
The output iterator (§19.1, §19.2.1) argument, Out, identifies where in an ostream’s stream buffer
(§21.6.4) put() places characters representing the numeric value on output. The value of put() is
that iterator positioned one past the last character position written.
Note that the default specialization of num_put (the one where the iterator used to access characters is of type ostreambuf_iterator<Ch>) is part of the standard locales (§D.4). If you want to
use another specialization, you’ll have to make it yourself. For example:
template<class Ch>
class String_numput : public std::num_put<Ch,typename basic_string<Ch>::iterator> {
public:
    String_numput() : std::num_put<Ch,typename basic_string<Ch>::iterator>(1) { }
};
void f(int i, string& s, int pos) // format i into s starting at pos
{
    String_numput<char> f;
    ios_base& xxx = cout; // use cout’s formatting rules
    f.put(s.begin()+pos,xxx,' ',long(i)); // format i into s; put() has no overload for int
}
The ios_base argument is used to get information about formatting state and locale. For example,
if padding is needed, the fill character is used as required by the ios_base argument. Typically, the
stream buffer written to through b is the buffer associated with an ostream for which s is the base.
Note that an ios_base is not a simple object to construct. In particular, it controls many aspects of
formatting that must be consistent to achieve acceptable output. Consequently, ios_base has no
public constructor (§21.3.3).
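As a runnable variation on this idea, the sketch below appends the formatted characters to a string through a back_insert_iterator, so the target string need not be pre-sized. The names Str_out, Append_numput, and format_long are hypothetical, and the use of a scratch ostringstream purely as a source of formatting state is an assumption of this sketch, not something the standard library prescribes.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

typedef std::back_insert_iterator<std::string> Str_out;

// num_put's destructor is protected, so we derive to get an object we can create.
class Append_numput : public std::num_put<char, Str_out> {
public:
    Append_numput() : std::num_put<char, Str_out>(1) { } // refs==1: lifetime is managed by us
};

// Format v using default formatting rules and return the characters produced.
std::string format_long(long v)
{
    std::ostringstream state; // supplies flags, width, and locale; nothing is written to it
    Append_numput np;
    std::string result;
    np.put(std::back_inserter(result), state, ' ', v);
    return result;
}
```

Because an ios_base cannot be constructed directly, some stream has to be around to supply the formatting state; here an otherwise unused ostringstream plays that role.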
A put() function also uses its ios_base argument to get the stream’s locale(). That locale is
used to determine punctuation (§D.4.2.1), the alphabetic representation of Booleans, and the conversion to Ch. For example, assuming that s is put()’s ios_base argument, we might find code
like this in a put() function:
const locale& loc = s.getloc();
// ...
wchar_t w = use_facet< ctype<char> >(loc).widen(c); // char to Ch conversion
// ...
char pnt = use_facet< numpunct<char> >(loc).decimal_point(); // default: '.'
// ...
string flse = use_facet< numpunct<char> >(loc).falsename(); // default: "false"
A standard facet, such as num_put<char>, is typically used implicitly through a standard I/O
stream function. Consequently, most programmers need not know about it. However, the use of
such facets by standard library functions is interesting because it shows how I/O streams work
and how facets can be used. As ever, the standard library provides examples of interesting programming techniques.
Using num_put, the implementer of ostream might write:
template<class Ch, class Tr>
ostream& std::basic_ostream<Ch,Tr>::operator<<(double d)
{
    sentry guard(*this); // see §21.3.8
    if (!guard) return *this;
    try {
        if (use_facet< num_put<Ch> >(getloc()).put(*this,*this,this->fill(),d).failed())
            setstate(badbit);
    }
    catch (...) {
        handle_ioexception(*this);
    }
    return *this;
}
A lot is going on here. The sentry ensures that all prefix and suffix operations are performed
(§21.3.8). We get the ostream’s locale by calling its member function getloc() (§21.7). We
extract num_put from that locale using use_facet (§D.3.1). That done, we call the appropriate
put() function to do the real work. An ostreambuf_iterator can be constructed from an ostream
(§19.2.6), and an ostream can be implicitly converted to its base class ios_base (§21.2.1), so the
first two arguments to put() are easily supplied.
A call of put() returns its output iterator argument. This output iterator is obtained from a
basic_ostream, so it is an ostreambuf_iterator. Consequently, failed() (§19.2.6.1) is available to
test for failure and to allow us to set the stream state appropriately.
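The same steps can be tried outside the library as an ordinary function. This is a minimal sketch, assuming the char stream types; the name put_double is hypothetical, and exception handling is left out to keep the example short.

```cpp
#include <iterator>
#include <locale>
#include <ostream>
#include <sstream>

// Write d to os by calling the stream's num_put facet directly.
std::ostream& put_double(std::ostream& os, double d)
{
    std::ostream::sentry guard(os);        // prefix and suffix operations
    if (!guard) return os;
    typedef std::num_put<char> Facet;      // default: writes through ostreambuf_iterator<char>
    const Facet& np = std::use_facet<Facet>(os.getloc());
    // put() returns the iterator; failed() reports whether writing to the buffer failed
    if (np.put(std::ostreambuf_iterator<char>(os), os, os.fill(), d).failed())
        os.setstate(std::ios_base::badbit);
    return os;
}
```

Called on a string stream in the classic locale, put_double(os,2.5) leaves "2.5" in the stream, just as os<<2.5 would.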
I did not use has_facet, because the standard facets (§D.4) are guaranteed to be present in every
locale. If that guarantee is violated, bad_cast is thrown (§D.3.1).
The put() function calls the virtual do_put(). Consequently, user-defined code may be executed, and operator<<() must be prepared to handle an exception thrown by the overriding
do_put(). Also, num_put may not exist for some character types, so use_facet() might throw
std::bad_cast (§D.3.1). The behavior of a << for a built-in type, such as double, is defined by the
C++ standard. Consequently, the question is not what handle_ioexception() should do but rather
how it should do what the standard prescribes. If badbit is set in this ostream’s exception state
(§21.3.6), the exception is simply rethrown. Otherwise, an exception is handled by setting the
stream state and continuing. In either case, badbit must be set in the stream state (§21.3.3):
template<class Ch, class Tr>
void handle_ioexception(std::basic_ostream<Ch,Tr>& s) // called from catch clause
{
    if (s.exceptions()&ios_base::badbit) {
        try {
            s.setstate(ios_base::badbit); // might throw basic_ios::failure
        } catch(...) { }
        throw; // rethrow
    }
    s.setstate(ios_base::badbit);
}
The try-block is needed because setstate() might throw basic_ios::failure (§21.3.3, §21.3.6).
However, if badbit is set in the exception state, operator<<() must rethrow the exception that
caused handle_ioexception() to be called (rather than simply throwing basic_ios::failure).
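The reason for the inner try-block can be observed directly. The sketch below (the function name setstate_throws is hypothetical) reports whether setstate(badbit) throws for a given exception mask.

```cpp
#include <ios>
#include <sstream>

// Return true if setting badbit on a stream with exception mask 'mask' throws.
bool setstate_throws(std::ios_base::iostate mask)
{
    std::ostringstream os;
    os.exceptions(mask);                       // the stream is good, so this does not throw
    try {
        os.setstate(std::ios_base::badbit);    // throws if badbit is in the mask
    }
    catch (const std::ios_base::failure&) {
        return true;
    }
    return false;
}
```

With badbit in the mask the call throws; with an empty mask it simply records the state.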
The << for a built-in type, such as double, must be implemented by writing directly to a stream
buffer. When writing a << for a user-defined type, we can often avoid the resulting complexity by
expressing the output of the user-defined type in terms of output of existing types (§D.3.2).
D.4.2.3 Numeric Input
When reading from a stream buffer (§21.6.4), an istream relies on the num_get facet:
template <class Ch, class In = istreambuf_iterator<Ch> >
class std::num_get : public locale::facet {
public:
    typedef Ch char_type;
    typedef In iter_type;
    explicit num_get(size_t r = 0);
    // read [b:e) into v, using formatting rules from s, reporting errors by setting r:
    In get(In b, In e, ios_base& s, ios_base::iostate& r, bool& v) const;
    In get(In b, In e, ios_base& s, ios_base::iostate& r, long& v) const;
    In get(In b, In e, ios_base& s, ios_base::iostate& r, unsigned short& v) const;
    In get(In b, In e, ios_base& s, ios_base::iostate& r, unsigned int& v) const;
    In get(In b, In e, ios_base& s, ios_base::iostate& r, unsigned long& v) const;
    In get(In b, In e, ios_base& s, ios_base::iostate& r, float& v) const;
    In get(In b, In e, ios_base& s, ios_base::iostate& r, double& v) const;
    In get(In b, In e, ios_base& s, ios_base::iostate& r, long double& v) const;
    In get(In b, In e, ios_base& s, ios_base::iostate& r, void*& v) const;
    static locale::id id; // facet identifier object (§D.2, §D.3, §D.3.1)
protected:
    ~num_get();
    // virtual ‘‘do_’’ functions for public functions (see §D.4.1)
};
Basically, num_get is organized like num_put (§D.4.2.2). Since it reads rather than writes, get()
needs a pair of input iterators, and the argument designating the target of the read is a reference.
The iostate variable r is set to reflect the state of the stream. If a value of the desired type could
not be read, failbit is set in r; if the end of input was reached, eofbit is set in r. An input operator
will use r to determine how to set the state of its stream. If no error was encountered, the value
read is assigned through v; otherwise, v is left unchanged.
A sentry is used to ensure that the stream’s prefix and suffix operations are performed (§21.3.8).
In particular, the sentry is used to ensure that we try to read only if the stream is in a good state to
start with.
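To see the iostate argument at work, here is a minimal sketch that calls num_get directly on the characters of a string. The name parse_long is hypothetical. Note that get() itself does not skip leading whitespace; that is part of the sentry’s job in a real input operator.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Try to read a long from text; 'out' is changed only on success.
bool parse_long(const std::string& text, long& out)
{
    std::istringstream in(text);                       // supplies buffer, flags, and locale
    std::ios_base::iostate err = std::ios_base::goodbit;
    long v = 0;
    std::use_facet< std::num_get<char> >(in.getloc())
        .get(std::istreambuf_iterator<char>(in),       // begin of input
             std::istreambuf_iterator<char>(),         // end-of-stream iterator
             in, err, v);
    if (err & std::ios_base::failbit) return false;    // no value of the desired type
    out = v;                                           // eofbit alone is not an error
    return true;
}
```

Reading "123" succeeds with eofbit set in the state, whereas reading "abc" sets failbit and leaves the caller’s variable alone.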
The implementer of istream might write:
template<class Ch, class Tr>
istream& std::basic_istream<Ch,Tr>::operator>>(double& d)
{
    sentry guard(*this); // see §21.3.8
    if (!guard) return *this;
    iostate state = 0; // good
    istreambuf_iterator<Ch> eos;
    double dd;
    try {
        use_facet< num_get<Ch> >(getloc()).get(*this,eos,*this,state,dd);
        if (state==0 || state==eofbit) d = dd; // set value only if get() succeeded
        setstate(state);
    }
    catch (...) {
        handle_ioexception(*this); // see §D.4.2.2
    }
    return *this;
}
Exceptions enabled for the istream will be thrown by setstate() in case of error (§21.3.6).
By defining a numpunct, such as My_punct from §D.4.2, we can read using nonstandard punctuation. For example:
void f()
{
    cout << "style A: ";
    int i1;
    double d1;
    cin >> i1 >> d1; // read using standard ‘‘12345678’’ format
    locale loc(locale::classic(),new My_punct);
    cin.imbue(loc);
    cout << "style B: ";
    int i2;
    double d2;
    cin >> i2 >> d2; // read using the ‘‘12 345 678’’ format
}
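A self-contained version of the second read, using a string stream in place of cin so that it can be run without typing input, might look like this. The names Space_group and read_grouped are hypothetical.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Punctuation facet: digits grouped by threes, separated by spaces.
class Space_group : public std::numpunct<char> {
protected:
    char do_thousands_sep() const { return ' '; }
    std::string do_grouping() const { return "\003"; }
};

// Read a long written in the "12 345 678" style; 'out' is changed only on success.
bool read_grouped(const std::string& text, long& out)
{
    std::istringstream in(text);
    in.imbue(std::locale(std::locale::classic(), new Space_group));
    long v = 0;
    if (!(in >> v)) return false;   // failbit is set if no number could be read
    out = v;
    return true;
}
```

Because the space is now the separation character, the input 12 345 678 is read as the single value 12345678 rather than as three separate numbers.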
If we want to read really unusual numeric formats, we have to override do_get(). For example,
we might define a num_get that read Roman numerals, such as XXI and MM (§D.6[15]).
D.4.3 Input and Output of Monetary Values
The formatting of monetary amounts is technically similar to the formatting of ‘‘plain’’ numbers
(§D.4.2). However, the presentation of monetary amounts is even more sensitive to cultural differences. For example, a negative amount (a loss, a debit), such as -1.25, should in some contexts be
presented as a (positive) number in parentheses: (1.25). Similarly, color is in some contexts used
to ease the recognition of negative amounts.
There is no standard ‘‘money type.’’ Instead, the money facets are meant to be used explicitly
for numeric values that the programmer knows to represent monetary amounts. For example:
class Money { // simple type to hold a monetary amount
    long int amount;
public:
    Money(long int i) : amount(i) { }
    operator long int() const { return amount; }
};
// ...
void f(long int i)
{
    cout << "value= " << i << " amount= " << Money(i) << endl;
}
The task of the monetary facets is to make it reasonably easy to write an output operator for Money
so that the amount is printed according to local convention (see §D.4.3.2). The output would vary
depending on cout’s locale. Possible outputs are:
value= 1234567 amount= $12345.67
value= 1234567 amount= 12345,67 DKK
value= -1234567 amount= $-12345.67
value= -1234567 amount= -$12345.67
value= -1234567 amount= (CHF12345,67)
For money, accuracy to the smallest currency unit is usually considered essential. Consequently, I
adopted the common convention of having the integer value represent the number of cents (pence,
øre, fils, cents, etc.) rather than the number of dollars (pounds, kroner, dinar, euro, etc.). This convention is supported by moneypunct’s frac_digits() function (§D.4.3.1). Similarly, the appearance of the ‘‘decimal point’’ is defined by decimal_point().
The facets money_get and money_put provide functions that perform I/O based on the format
defined by the money_base facet.
A simple Money type can be used simply to control I/O formats or to hold monetary values. In
the former case, we cast values of (other) types used to hold monetary amounts to Money before
writing, and we read into Money variables before converting them to other types. It is less error
prone to consistently hold monetary amounts in a Money type; that way, we cannot forget to cast a
value to Money before writing it, and we don’t get input errors by trying to read monetary values in
locale-insensitive ways. However, it may be infeasible to introduce a Money type into a system
that wasn’t designed for that. In such cases, applying Money conversions (casts) to read and write
operations is necessary.
D.4.3.1 Money Punctuation
The facet controlling the presentation of monetary amounts, moneypunct, naturally resembles the
facet for controlling plain numbers, numpunct (§D.4.2.1):
class std::money_base {
public:
    enum part { none, space, symbol, sign, value }; // parts of value layout
    struct pattern { char field[4]; }; // layout specification
};
template <class Ch, bool International = false>
class std::moneypunct : public locale::facet, public money_base {
public:
    typedef Ch char_type;
    typedef basic_string<Ch> string_type;
    explicit moneypunct(size_t r = 0);
    Ch decimal_point() const; // '.' in classic()
    Ch thousands_sep() const; // ',' in classic()
    string grouping() const; // "" in classic(), meaning "no grouping"
    string_type curr_symbol() const; // "$" in classic()
    string_type positive_sign() const; // "" in classic()
    string_type negative_sign() const; // "-" in classic()
    int frac_digits() const; // number of digits after the decimal point; 2 in classic()
    pattern pos_format() const; // { symbol, sign, none, value } in classic()
    pattern neg_format() const; // { symbol, sign, none, value } in classic()
    static const bool intl = International; // use international monetary formats
    static locale::id id; // facet identifier object (§D.2, §D.3, §D.3.1)
protected:
    ~moneypunct();
    // virtual ‘‘do_’’ functions for public functions (see §D.4.1)
};
The facilities offered by moneypunct are intended primarily for use by implementers of money_put
and money_get facets (§D.4.3.2, §D.4.3.3).
The decimal_point(), thousands_sep(), and grouping() members behave as their equivalents in numpunct.
The curr_symbol(), positive_sign(), and negative_sign() members return the string to be
used to represent the currency symbol (for example, $, ¥, FRF, DKK), the plus sign, and the minus
sign, respectively. If the International template argument was true, the intl member will also be
true, and ‘‘international’’ representations of the currency symbols will be used. Such an ‘‘international’’ representation is a four-character string. For example:
"USD"
"DKK"
"EUR"
The last character is a terminating zero. The three-letter currency identifier is defined by the ISO 4217 standard. When International is false, a ‘‘local’’ currency symbol, such as $, £, and ¥, can
be used.
A pattern returned by pos_format() or neg_format() is four parts defining the sequence in
which the numeric value, the currency symbol, the sign symbol, and whitespace occur. Most common formats are trivially represented using this simple notion of a pattern. For example:
+$ 123.45      // { sign, symbol, space, value } where positive_sign() returns "+"
$+123.45       // { symbol, sign, value, none } where positive_sign() returns "+"
$123.45        // { symbol, sign, value, none } where positive_sign() returns ""
$123.45-       // { symbol, value, sign, none }
-123.45 DKK    // { sign, value, space, symbol }
($123.45)      // { sign, symbol, value, none } where negative_sign() returns "()"
(123.45DKK)    // { sign, value, symbol, none } where negative_sign() returns "()"
Representing a negative number using parentheses is achieved by having negative_sign() return a
string containing the two characters (). The first character of a sign string is placed where sign is
found in the pattern, and the rest of the sign string is placed after all other parts of the pattern. The
most common use of this facility is to represent the financial community’s convention of using
parentheses for negative amounts, but other uses are possible. For example:
-$123.45         // { sign, symbol, value, none } where negative_sign() returns "-"
*$123.45 silly   // { sign, symbol, value, none } where negative_sign() returns "* silly"
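The placement rule for the sign string can be sketched as a small stand-alone function. The enumeration Part below merely mirrors money_base::part, and the function layout is a hypothetical illustration of the rule, not how a money_put implementation is required to be written.

```cpp
#include <string>

enum Part { none, space, symbol, sign, value }; // mirrors money_base::part

// Compose the text for one monetary amount from a four-part pattern.
std::string layout(const Part (&pat)[4], const std::string& sym,
                   const std::string& sgn, const std::string& val)
{
    std::string out;
    for (int i = 0; i < 4; ++i) {
        switch (pat[i]) {
        case symbol: out += sym; break;
        case sign:   if (!sgn.empty()) out += sgn[0]; break; // first character only
        case value:  out += val; break;
        case space:  out += ' '; break;
        case none:   break;
        }
    }
    if (sgn.size() > 1) out.append(sgn, 1, std::string::npos); // rest of the sign goes last
    return out;
}
```

Given the pattern { sign, symbol, value, none } and the sign string "()", the amount 123.45 with symbol $ comes out as ($123.45).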
The values sign, value, and symbol must each appear exactly once in a pattern. The remaining
value can be either space or none. Where space appears, at least one and possibly more whitespace characters may appear in the representation. Where none appears, except at the end of a pattern, zero or more whitespace characters may appear in the representation.
Note that these strict rules ban some apparently reasonable patterns:
pattern pat = { sign, value, none, none }; // error: no symbol
The frac_digits() function indicates where the decimal_point() is placed. Often, monetary
amounts are represented in the smallest currency unit (§D.4.3). This unit is typically one hundredth
of the major unit (for example, a ¢ is one hundredth of a $), so frac_digits() is often 2.
Here is a simple format defined as a facet:
class My_money_io : public moneypunct<char,true> {
public:
    explicit My_money_io(size_t r = 0) :moneypunct<char,true>(r) { }
    char_type do_decimal_point() const { return '.'; }
    char_type do_thousands_sep() const { return ','; }
    string do_grouping() const { return "\003\003\003"; }
    string_type do_curr_symbol() const { return "USD "; }
    string_type do_positive_sign() const { return ""; }
    string_type do_negative_sign() const { return "()"; }
    int do_frac_digits() const { return 2; } // two digits after decimal point
    pattern do_pos_format() const
    {
        static pattern pat = { sign, symbol, value, none };
        return pat;
    }
    pattern do_neg_format() const
    {
        static pattern pat = { sign, symbol, value, none };
        return pat;
    }
};
This facet is used in the Money input and output operations defined in §D.4.3.2 and §D.4.3.3.
A _byname version (§D.4, §D.4.1) of moneypunct is provided:
template <class Ch, bool Intl = false>
class std::moneypunct_byname : public moneypunct<Ch, Intl> { /* ... */ };
D.4.3.2 Money Output
The money_put facet writes monetary amounts according to the format specified by moneypunct.
Specifically, money_put provides put() functions that place a suitably formatted character representation into the stream buffer of a stream:
template <class Ch, class Out = ostreambuf_iterator<Ch> >
class std::money_put : public locale::facet {
public:
    typedef Ch char_type;
    typedef Out iter_type;
    typedef basic_string<Ch> string_type;
    explicit money_put(size_t r = 0);
    // put value "v" into buffer position "b":
    Out put(Out b, bool intl, ios_base& s, Ch fill, long double v) const;
    Out put(Out b, bool intl, ios_base& s, Ch fill, const string_type& v) const;
    static locale::id id; // facet identifier object (§D.2, §D.3, §D.3.1)
protected:
    ~money_put();
    // virtual ‘‘do_’’ functions for public functions (see §D.4.1)
};
The b, s, fill, and v arguments are used as for num_put’s put() functions (§D.4.2.2). The intl
argument indicates whether a standard four-character ‘‘international’’ currency symbol or a
‘‘local’’ symbol is used (§D.4.3.1).
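As a minimal runnable sketch of put() in action, the code below installs a small local-format moneypunct and formats an amount held in cents. The names Paren_punct and format_cents are hypothetical. The sketch sets showbase because money_put writes the currency symbol only when that flag is set.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Local (non-international) format: "$" symbol, parentheses for negative amounts.
class Paren_punct : public std::moneypunct<char, false> {
protected:
    char_type do_decimal_point() const { return '.'; }
    std::string do_grouping() const { return ""; }          // no grouping
    string_type do_curr_symbol() const { return "$"; }
    string_type do_positive_sign() const { return ""; }
    string_type do_negative_sign() const { return "()"; }
    int do_frac_digits() const { return 2; }                // amounts are held in cents
    pattern do_pos_format() const { pattern p = { { sign, symbol, value, none } }; return p; }
    pattern do_neg_format() const { pattern p = { { sign, symbol, value, none } }; return p; }
};

// Format an amount given in cents according to Paren_punct.
std::string format_cents(long cents)
{
    std::ostringstream os;
    os.imbue(std::locale(std::locale::classic(), new Paren_punct));
    os << std::showbase;                                    // ask for the currency symbol
    const std::money_put<char>& mp = std::use_facet< std::money_put<char> >(os.getloc());
    mp.put(std::ostreambuf_iterator<char>(os), false, os, os.fill(),
           static_cast<long double>(cents));                // false: use the local symbol
    return os.str();
}
```

Here format_cents(-12345) produces ($123.45): the first character of the sign string goes where sign appears in the pattern and the rest goes last.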
Given money_put, we can define an output operator for Money (§D.4.3):
ostream& operator<<(ostream& s, Money m)
{
    ostream::sentry guard(s); // see §21.3.8
    if (!guard) return s;
    try {
        const money_put<char>& f = use_facet< money_put<char> >(s.getloc());
        if (m==static_cast<long double>(m)) { // m can be represented as a long double
            if (f.put(s,true,s,s.fill(),m).failed()) s.setstate(ios_base::badbit);
        }
        else {
            ostringstream v;
            v << static_cast<long int>(m); // convert to string representation; the cast avoids calling this operator recursively
            if (f.put(s,true,s,s.fill(),v.str()).failed()) s.setstate(ios_base::badbit);
        }
    }
    catch (...) {
        handle_ioexception(s); // see §D.4.2.2
    }
    return s;
}
If a long double doesn’t have sufficient precision to represent the monetary value exactly, I convert
the value to its string representation and output that using the put() that takes a string.
D.4.3.3 Money Input
The money_get facet reads monetary amounts according to the format specified by moneypunct.
Specifically, money_get provides get() functions that extract a suitably formatted character representation from the stream buffer of a stream:
template <class Ch, class In = istreambuf_iterator<Ch> >
class std::money_get : public locale::facet {
public:
    typedef Ch char_type;
    typedef In iter_type;
    typedef basic_string<Ch> string_type;
    explicit money_get(size_t r = 0);
    // read [b:e) into v, using formatting rules from s, reporting errors by setting r:
    In get(In b, In e, bool intl, ios_base& s, ios_base::iostate& r, long double& v) const;
    In get(In b, In e, bool intl, ios_base& s, ios_base::iostate& r, string_type& v) const;
    static locale::id id; // facet identifier object (§D.2, §D.3, §D.3.1)
protected:
    ~money_get();
    // virtual ‘‘do_’’ functions for public functions (see §D.4.1)
};
The b, e, s, r, and v arguments are used as for num_get’s get() functions (§D.4.2.3). The intl
argument indicates whether a standard four-character ‘‘international’’ currency symbol or a
‘‘local’’ symbol is used (§D.4.3.1).
A well-defined pair of money_get and money_put facets will provide output in a form that can
be read back in without errors or loss of information. For example:
int main()
{
    Money m(0); // Money has no default constructor
    while (cin>>m) cout << m << "\n";
}
The output of this simple program should be acceptable as its input. Furthermore, the output produced by a second run given the output from a first run should be identical to its input.
A plausible input operator for Money would be:
istream& operator>>(istream& s, Money& m)
{
    istream::sentry guard(s); // see §21.3.8
    if (guard) try {
        ios_base::iostate state = 0; // good
        istreambuf_iterator<char> eos;
        string str;
        use_facet< money_get<char> >(s.getloc()).get(s,eos,true,s,state,str); // s also supplies the ios_base argument
        if (state==0 || state==ios_base::eofbit) { // set value only if get() succeeded
            long int i = strtol(str.c_str(),0,0); // for strtol(), see §20.4.1
            if (errno==ERANGE)
                state |= ios_base::failbit;
            else
                m = i; // set value only if conversion to long int succeeded
            s.setstate(state);
        }
    }
    catch (...) {
        handle_ioexception(s); // see §D.4.2.2
    }
    return s;
}
I use the get() that reads into a string because reading into a double and then converting to a long
int could lead to loss of precision.
D.4.4 Date and Time Input and Output
Unfortunately, the C++ standard library does not provide a proper date type. However, from the C
standard library, it inherits low-level facilities for dealing with dates and time intervals. These C
facilities are the basis for C++’s facilities for dealing with time in a system-independent manner.
The following sections demonstrate how the presentation of date and time-of-day information
can be made locale sensitive. In addition, they provide an example of how a user-defined type
(Date) can fit into the framework provided by iostream (Chapter 21) and locale (§D.2). The
implementation of Date shows techniques that are useful for dealing with time if you don’t have a
Date type available.
D.4.4.1 Clocks and Timers
At the lowest level, most systems have a fine-grained timer. The standard library provides a function clock() that returns an implementation-defined arithmetic type clock_t. The result of
clock() can be calibrated by using the CLOCKS_PER_SEC macro. If you don’t have access to a
reliable timing utility, you might measure a loop like this:
iinntt m
maaiinn(iinntt aarrggcc, cchhaarr* aarrggvv[])
{
iinntt n = aattooii(aarrggvv[11]);
// §6.1.7
// §20.4.1
cclloocckk__tt tt11 = cclloocckk();
iiff (tt11 == cclloocckk__tt(-11)) {
// clock_t(-1) means "clock() didn’t work"
cceerrrr << "ssoorrrryy, nnoo cclloocckk\\nn";
eexxiitt(11);
}
ffoorr (iinntt i = 00; ii<nn; ii++) ddoo__ssoom
meetthhiinngg(); // timing loop
cclloocckk__tt tt22 = cclloocckk();
iiff (tt22 == cclloocckk__tt(-11)) {
cceerrrr << "ssoorrrryy, cclloocckk oovveerrfflloow
w\\nn";
eexxiitt(22);
}
ccoouutt << "ddoo__ssoom
meetthhiinngg() " << n << " ttiim
meess ttooookk "
<< ddoouubbllee(tt22-tt11)/C
CL
LO
OC
CK
KSS__P
PE
ER
R__SSE
EC
C << " sseeccoonnddss"
<< " (m
meeaassuurreem
meenntt ggrraannuullaarriittyy: " << C
CL
LO
OC
CK
KSS__P
PE
ER
R__SSE
EC
C << " ooff a sseeccoonndd)\\nn";
}
The explicit conversion double(t2-t1) before dividing is necessary because clock_t might be an
integer. Exactly when the clock() starts running is implementation defined; clock() is meant to
measure time intervals within a single run of a program. For values t1 and t2 returned by clock(),
double(t2-t1)/CLOCKS_PER_SEC is the system’s best approximation of the time in seconds
between the two calls.
If clock() isn’t provided for a processor or if a time interval was too long to measure, clock()
returns clock_t(-1).
The clock() function is meant to measure intervals from a fraction of a second to a few seconds.
For example, if clock_t is a 32-bit signed int and CLOCKS_PER_SEC is 1,000,000, we can
use clock() to measure from 0 to just over 2,000 seconds (about half an hour) in microseconds.
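The arithmetic behind these limits can be checked with a small sketch. The function names seconds_between and max_interval are hypothetical, and the 32-bit width and tick rate are simply the example values from the text, not properties of any particular system.

```cpp
#include <ctime>

// Seconds between two clock() readings; the conversion to double
// avoids integer division when clock_t is an integer type.
double seconds_between(std::clock_t t1, std::clock_t t2)
{
    return double(t2 - t1) / CLOCKS_PER_SEC;
}

// Longest interval (in seconds) that a signed tick counter of the
// given width can hold at the given tick rate.
double max_interval(int bits, double ticks_per_second)
{
    double max_ticks = 1;
    for (int i = 1; i < bits; ++i) max_ticks *= 2;    // 2^(bits-1)
    return max_ticks / ticks_per_second;
}
```

For a 32-bit signed counter ticking a million times a second, max_interval(32,1e6) is 2^31/10^6, that is, about 2,147 seconds.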
Please note that getting meaningful measurements of a program can be tricky. Other programs
running on a machine may severely affect the time used by a run, cache and pipelining effects are
difficult to predict, and algorithms may have surprising dependencies on data. If you try to time
something, make several runs and reject the results as flawed if the run times vary significantly.
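One possible way to apply that advice is to collect the times of several runs and accept them only if they agree. The sketch below is an assumption about how one might do that; the name runs_consistent and the idea of a relative tolerance measured against the fastest run are illustrative choices, not part of the standard library.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical check: do all run times lie within `tolerance`
// (a fraction, e.g. 0.05 for 5%) of the fastest run?
bool runs_consistent(const std::vector<double>& seconds, double tolerance)
{
    if (seconds.empty()) return false;
    double lo = *std::min_element(seconds.begin(), seconds.end());
    double hi = *std::max_element(seconds.begin(), seconds.end());
    if (lo <= 0) return false;    // too short to measure meaningfully
    return (hi - lo) / lo <= tolerance;
}
```

A caller would time the same work several times with clock(), store each double(t2-t1)/CLOCKS_PER_SEC, and discard the whole series if runs_consistent() returns false.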
To cope with longer time intervals and with calendar time, the standard library provides time_t
for representing a point in time and a structure tm for separating a point in time into its
conventional parts:

    typedef implementation_defined time_t;    // implementation-defined arithmetic type (§4.1.1)
                                              // capable of representing a period of time,
                                              // often, a 32-bit integer

    struct tm {
        int tm_sec;      // second of minute [0,61]; 60 and 61 to represent leap seconds
        int tm_min;      // minute of hour [0,59]
        int tm_hour;     // hour of day [0,23]
        int tm_mday;     // day of month [1,31]
        int tm_mon;      // month of year [0,11]; 0 means January (note: not [1:12])
        int tm_year;     // year since 1900; 0 means year 1900, and 102 means 2002
        int tm_wday;     // days since Sunday [0,6]; 0 means Sunday
        int tm_yday;     // days since January 1 [0,365]; 0 means January 1
        int tm_isdst;    // daylight saving time flag: >0 in effect, 0 not in effect, <0 unknown
    };

Note that the standard guarantees only that tm has the int members mentioned here. The standard
does not guarantee that the members appear in this order or that there are no other fields.
The time_t and tm types and the basic facilities for using them are presented in <ctime> and
<time.h>. For example:

    clock_t clock();    // number of clock ticks since the start of the program

    time_t time(time_t* pt);                  // current calendar time
    double difftime(time_t t2, time_t t1);    // t2–t1 in seconds

    tm* localtime(const time_t* pt);    // local time for the *pt
    tm* gmtime(const time_t* pt);       // Greenwich Mean Time (GMT) tm for *pt, or 0
                                        // (officially called Coordinated Universal Time, UTC)

    time_t mktime(tm* ptm);    // time_t for *ptm, or time_t(-1)

    char* asctime(const tm* ptm);    // C-style string representation for *ptm
                                     // for example, "Sun Sep 16 01:03:52 1973\n"
    char* ctime(const time_t* t) { return asctime(localtime(t)); }

Beware: both localtime() and gmtime() return a tm* to a statically allocated object; a subsequent
call of that function will change the value of that object. Either use such a return value
immediately, or copy the tm into storage that you control. Similarly, asctime() returns a pointer
to a statically allocated character array.
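Copying the result into storage you control takes only a few lines. The following is a minimal sketch; the wrapper name utc_parts is hypothetical, and the comment about the base year assumes the common 1970 base described below.

```cpp
#include <ctime>

// Hypothetical wrapper: copy the result of gmtime() into caller-owned storage.
// Returns false if gmtime() could not convert t.
bool utc_parts(std::time_t t, std::tm& out)
{
    std::tm* p = std::gmtime(&t);    // p points to a shared, statically allocated tm
    if (p == 0) return false;
    out = *p;                        // copy before a later call overwrites *p
    return true;
}
```

Two results obtained this way can be compared safely; two raw gmtime() pointers would typically refer to the same object.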
A tm can represent dates in a range of at least tens of thousands of years (about [-32000,32000]
for a minimally sized int). However, time_t is most often a (signed) 32-bit long int. Counting
seconds, this makes time_t capable of representing a range just over 68 years on each side of a base
year. This base year is most commonly 1970, with the exact base time being 0:00 of January 1
GMT (UTC). If time_t is a 32-bit signed integer, we’ll run out of ‘‘time’’ in 2038 unless we
upgrade time_t to a larger integer type, as is already done on some systems.
The time_t mechanism is meant primarily for representing ‘‘near current time.’’ Thus, we
should not expect time_t to be able to represent dates outside the [1902,2038] range. Worse, not all
implementations of the functions dealing with time handle negative values in the same way. For
portability, a value that needs to be represented as both a tm and a time_t should be in the
[1970,2038] range. People who want to represent dates outside the 1970 to 2038 time frame must
devise some additional mechanism to do so.
One consequence of this is that mktime() can fail. If the argument for mktime() cannot be
represented as a time_t, the error indicator time_t(-1) is returned.
If we have a long-running program, we might time it like this:

    int main(int argc, char* argv[])    // §6.1.7
    {
        time_t t1 = time(0);
        do_a_lot(argc,argv);
        time_t t2 = time(0);
        double d = difftime(t2,t1);
        cout << "do_a_lot() took " << d << " seconds\n";
    }

If the argument to time() is not 0, the resulting time is also assigned to the time_t pointed to. If
the calendar time is not available (say, on a specialized processor), the value time_t(-1) is
returned. We could cautiously try to find today’s date like this:

    int main()
    {
        time_t t;

        if (time(&t) == time_t(-1)) {    // time_t(-1) means ‘‘time() didn’t work’’
            cerr << "Bad time\n";
            exit(1);
        }

        tm* gt = gmtime(&t);
        cout << gt->tm_mon+1 << '/' << gt->tm_mday << '/' << 1900+gt->tm_year << endl;
    }

D.4.4.2 A Date Class
As mentioned in §10.3, it is unlikely that a single Date type can serve all purposes. The uses of
date information dictate a variety of representations, and calendar information before the 19th
century is very dependent on historical vagaries. However, as an example, we could define a Date
type along the lines from §10.3, using time_t as the implementation:

    class Date {
    public:
        enum Month { jan=1, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec };

        class Bad_date {};

        Date(int dd, Month mm, int yy);
        Date();

        friend ostream& operator<<(ostream& s, const Date& d);

        // ...
    private:
        time_t d;    // standard date and time representation
    };

    Date::Date(int dd, Month mm, int yy)
    {
        tm x = { 0 };
        if (dd<1 || 31<dd) throw Bad_date();    // oversimplified: see §10.3.1
        x.tm_mday = dd;
        if (mm<jan || dec<mm) throw Bad_date();
        x.tm_mon = mm-1;        // tm_mon is zero based
        x.tm_year = yy-1900;    // tm_year is 1900 based
        d = mktime(&x);
    }

    Date::Date()
    {
        d = time(0);    // default Date: today
        if (d == time_t(-1)) throw Bad_date();
    }

The task here is to define locale-sensitive implementations for Date << and >>.
D.4.4.3 Date and Time Output
Like num_put (§D.4.2), time_put provides put() functions for writing to buffers through iterators:

    template <class Ch, class Out = ostreambuf_iterator<Ch> >
    class std::time_put : public locale::facet {
    public:
        typedef Ch char_type;
        typedef Out iter_type;

        explicit time_put(size_t r = 0);

        // put t into s’s stream buffer through b, using format fmt:
        Out put(Out b, ios_base& s, Ch fill, const tm* t,
            const Ch* fmt_b, const Ch* fmt_e) const;

        Out put(Out b, ios_base& s, Ch fill, const tm* t, char fmt, char mod = 0) const
            { return do_put(b,s,fill,t,fmt,mod); }

        static locale::id id;    // facet identifier object (§D.2, §D.3, §D.3.1)
    protected:
        ~time_put();

        virtual Out do_put(Out, ios_base&, Ch, const tm*, char, char) const;
    };

A call put(b,s,fill,t,fmt_b,fmt_e) places the date information from t into s’s stream buffer
through b. The fill character is used where needed for padding. The output format is specified by a
printf()-like format string [fmt_b,fmt_e). The printf-like (§21.8) format is used to produce an
actual output and may contain the following special-purpose format specifiers:

    %a    abbreviated weekday name (e.g., Sat)
    %A    full weekday name (e.g., Saturday)
    %b    abbreviated month name (e.g., Feb)
    %B    full month name (e.g., February)
    %c    date and time (e.g., Sat Feb 06 21:46:05 1999)
    %d    day of month [01,31] (e.g., 06)
    %H    24-hour clock hour [00,23] (e.g., 21)
    %I    12-hour clock hour [01,12] (e.g., 09)
    %j    day of year [001,366] (e.g., 037)
    %m    month of year [01,12] (e.g., 02)
    %M    minute of hour [00,59] (e.g., 48)
    %p    a.m./p.m. indicator for 12-hour clock (e.g., PM)
    %S    second of minute [00,61] (e.g., 40)
    %U    week of year [00,53] starting with Sunday (e.g., 05); the first Sunday starts week 1
    %w    day of week [0,6]; 0 means Sunday (e.g., 6)
    %W    week of year [00,53] starting with Monday (e.g., 05); the first Monday starts week 1
    %x    date (e.g., 02/06/99)
    %X    time (e.g., 21:48:40)
    %y    year without century [00,99] (e.g., 99)
    %Y    year (e.g., 1999)
    %Z    time zone indicator (e.g., EST) if the time zone is known

This long list of very specialized formatting rules could be used as an argument for the use of
extensible I/O systems. However, as with most specialized notations, it is adequate for its task and
often even convenient.
In addition to these formatting directives, most implementations support ‘‘modifiers,’’ such as
an integer specifying a field width (§21.8), %10X. Modifiers for the time-and-date formats are not
part of the C++ standard, but some platform standards, such as POSIX, require them.
Consequently, modifiers can be difficult to avoid even if their use isn’t perfectly portable.
The sprintf-like (§21.8) function strftime() from <ctime> or <time.h> produces output using
the time and date format directives:

    size_t strftime(char* s, size_t max, const char* format, const tm* tmp);

This function places a maximum of max characters, produced from *tmp according to the
format, into *s. For example:

    int main()
    {
        const int max = 20;    // sloppy: hope strftime() will never produce more than 20 characters
        char buf[max];

        time_t t = time(0);
        strftime(buf,max,"%A\n",localtime(&t));
        cout << buf;
    }

On a Wednesday, this will print Wednesday in the default classic() locale (§D.2.3) and onsdag in
a Danish locale.
Characters that are not part of a format specifier, such as the newline in the example, are simply
copied into the first argument (s).
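A less sloppy use of strftime() pays attention to its return value, which is the number of characters written, or 0 if the result did not fit. The wrapper below is a sketch only; the name format_tm and the 128-character buffer are arbitrary choices for the illustration.

```cpp
#include <cstddef>
#include <ctime>
#include <string>

// Hypothetical wrapper: format t with strftime() and return the result.
// An empty string means that the result did not fit in the buffer
// (or that the format legitimately produced no characters).
std::string format_tm(const char* fmt, const std::tm& t)
{
    char buf[128];
    std::size_t n = std::strftime(buf, sizeof buf, fmt, &t);
    return std::string(buf, n);
}
```

With the purely numeric directives, the result does not depend on the locale; for example, a tm holding February 6, 1999 formatted with "%Y/%m/%d" gives 1999/02/06.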
When put() identifies a format character f (and optional modifier character m), it calls the
virtual do_put() to do the actual formatting: do_put(b,s,fill,t,f,m).
A call put(b,s,fill,t,f,m) is a simplified form of put(), where a format character (f) and a
modifier character (m) are explicitly provided. Thus,

    const char fmt[] = "%10X";
    put(b,s,fill,t,fmt,fmt+sizeof(fmt)-1);    // -1: don’t include the terminating 0

can be abbreviated to

    put(b,s,fill,t,'X',10);

If a format contains multibyte characters, it must both begin and end in the default state (§D.4.6).
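The range form of put() can be tried out without a Date type by writing a tm into a string stream. This is a minimal sketch; the function name put_tm is hypothetical, and the classic() locale is used as the default only to keep the result predictable.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: format t through the time_put facet of loc.
std::string put_tm(const std::tm& t, const std::string& fmt,
                   const std::locale& loc = std::locale::classic())
{
    std::ostringstream os;
    os.imbue(loc);
    const std::time_put<char>& tp = std::use_facet< std::time_put<char> >(loc);
    // write through an iterator into os's stream buffer, using the format [begin,end)
    tp.put(std::ostreambuf_iterator<char>(os), os, os.fill(), &t,
           fmt.data(), fmt.data() + fmt.size());
    return os.str();
}
```

Passing a different locale as the third argument changes the names produced by %A and %B, provided that locale is available on the system.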
We can use put() to implement a locale-sensitive output operator for Date:

    ostream& operator<<(ostream& s, const Date& d)
    {
        ostream::sentry guard(s);    // see §21.3.8
        if (!guard) return s;

        tm* tmp = localtime(&d.d);
        try {
            if (use_facet< time_put<char> >(s.getloc()).put(s,s,s.fill(),tmp,'x').failed())
                s.setstate(ios_base::failbit);
        }
        catch (...) {
            handle_ioexception(s);    // see §D.4.2.2
        }
        return s;
    }

Since there is no standard Date type, there is no default layout for date I/O. Here, I specified the
%x format by passing the character 'x' as the format character. Because the %x format is the
default for get_date() (§D.4.4.4), that is probably as close to a standard as one can get. See
§D.4.4.5 for an example of how to use alternative formats.
A _byname version (§D.4, §D.4.1) of time_put is also provided:

    template <class Ch, class Out = ostreambuf_iterator<Ch> >
    class std::time_put_byname : public time_put<Ch,Out> { /* ... */ };

D.4.4.4 Date and Time Input
As ever, input is trickier than output. When we write code to output a value, we often have a
choice among different formats. In addition, when we write input code, we must deal with errors
and sometimes the possibility of several alternative formats.
The time_get facet implements input of time and date. The idea is that time_get of a locale can
read the times and dates produced by the locale’s time_put. However, there are no standard date
and time classes, so a programmer can use a locale to produce output according to a variety of
formats. For example, the following representations could all be produced by using a single output
statement, using time_put (§D.4.4.5) from different locales:

    January 15th 1999
    Thursday 15th January 1999
    15 Jan 1999AD
    Thurs 15/1/99

The C++ standard encourages implementers of time_get to accept dates and time formats as
specified by POSIX and other standards. The problem is that it is difficult to standardize the intent to
read dates and times in whatever format is conventional in a given culture. It is wise to experiment
to see what a given locale provides (§D.6[8]). If a format isn’t accepted, a programmer can provide
a suitable alternative time_get facet.
The standard time input facet, time_get, is derived from time_base:

    struct std::time_base {
        enum dateorder {
            no_order,    // no order, possibly more elements (such as day of week)
            dmy,         // day before month before year
            mdy,         // month before day before year
            ymd,         // year before month before day
            ydm          // year before day before month
        };
    };

An implementer can use this enumeration to simplify the parsing of date formats.
Like num_get, time_get accesses its buffer through a pair of input iterators:

    template <class Ch, class In = istreambuf_iterator<Ch> >
    class time_get : public locale::facet, public time_base {
    public:
        typedef Ch char_type;
        typedef In iter_type;

        explicit time_get(size_t r = 0);

        dateorder date_order() const { return do_date_order(); }

        // read [b,e) into d, using formatting rules from s, reporting errors by setting r:
        In get_time(In b, In e, ios_base& s, ios_base::iostate& r, tm* d) const;
        In get_date(In b, In e, ios_base& s, ios_base::iostate& r, tm* d) const;
        In get_year(In b, In e, ios_base& s, ios_base::iostate& r, tm* d) const;
        In get_weekday(In b, In e, ios_base& s, ios_base::iostate& r, tm* d) const;
        In get_monthname(In b, In e, ios_base& s, ios_base::iostate& r, tm* d) const;

        static locale::id id;    // facet identifier object (§D.2, §D.3, §D.3.1)
    protected:
        ~time_get();

        // virtual ‘‘do_’’ functions for public functions (see §D.4.1)
    };

The get_time() function calls do_get_time(). The default get_time() reads time as produced by
the locale’s time_put::put(), using the %X format (§D.4.4). Similarly, the get_date() function
calls do_get_date(). The default get_date() reads a date as produced by the locale’s
time_put::put(), using the %x format (§D.4.4).
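To see what get_date() accepts, it can be called directly on a string stream. The sketch below is only an illustration: the name parse_date is hypothetical, the classic() locale is imbued explicitly, and the assumption that this locale reads dates as month/day/year follows the %x example shown in the table of format specifiers.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: parse text as a date using the classic() locale.
// Returns false (leaving out unchanged) if get_date() reports a failure.
bool parse_date(const std::string& text, std::tm& out)
{
    std::istringstream is(text);
    is.imbue(std::locale::classic());
    std::ios_base::iostate err = std::ios_base::goodbit;
    std::tm x = std::tm();
    const std::time_get<char>& tg = std::use_facet< std::time_get<char> >(is.getloc());
    tg.get_date(std::istreambuf_iterator<char>(is), std::istreambuf_iterator<char>(),
                is, err, &x);
    if (err & std::ios_base::failbit) return false;    // eofbit alone is not an error
    out = x;
    return true;
}
```

Only failbit is treated as an error here because reaching the end of the string after a complete date sets eofbit as well.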
Thus, the simplest input operator for Dates is something like this:

    istream& operator>>(istream& s, Date& d)
    {
        istream::sentry guard(s);    // see §21.3.8
        if (!guard) return s;

        ios_base::iostate res = 0;
        tm x = { 0 };
        istreambuf_iterator<char,char_traits<char> > end;
        try {
            use_facet< time_get<char> >(s.getloc()).get_date(s,end,s,res,&x);
            if (res==0 || res==ios_base::eofbit)
                d = Date(x.tm_mday,Date::Month(x.tm_mon+1),x.tm_year+1900);
            else
                s.setstate(res);
        }
        catch (...) {
            handle_ioexception(s);    // see §D.4.2.2
        }
        return s;
    }

The call get_date(s,end,s,res,&x) relies on two implicit conversions from istream: As the first
argument, s is used to construct an istreambuf_iterator. As third argument, s is converted to the
istream base class ios_base.
This input operator will work correctly for dates in the range that can be represented by time_t.
A trivial test case would be:

    int main()
    try {
        Date today;
        cout << today << endl;    // write using %x format

        Date d(12, Date::may, 1998);
        cout << d << endl;

        Date dd;
        while (cin >> dd) cout << dd << endl;    // read dates produced by %x format
    }
    catch (Date::Bad_date) {
        cout << "exit: bad date caught\n";
    }

A _byname version (§D.4, §D.4.1) of time_get is also provided:

    template <class Ch, class In = istreambuf_iterator<Ch> >
    class std::time_get_byname : public time_get<Ch,In> { /* ... */ };

D.4.4.5 A More Flexible Date Class
If you tried to use the Date class from §D.4.4.2 with the I/O from §D.4.4.3 and §D.4.4.4, you’d
soon find it restrictive:
[1] It can handle only dates that can be represented by a time_t; that typically means in the
[1970,2038] range.
[2] It accepts dates only in the standard format – whatever that might be.
[3] Its reporting of input errors is unacceptable.
[4] It supports only streams of char – not streams of arbitrary character types.
A more interesting and more useful input operator would accept a wider range of dates, recognize a
few common formats, and reliably report errors in a useful form. To do this, we must depart from
the time_t representation:

    class Date {
    public:
        enum Month { jan=1, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec };

        struct Bad_date {
            const char* why;
            Bad_date(const char* p) : why(p) { }
        };

        Date(int dd, Month mm, int yy, int day_of_week = 0);
        Date();

        void make_tm(tm* t) const;     // place tm representation of Date in *t
        time_t make_time_t() const;    // return time_t representation of Date

        int year() const { return y; }
        Month month() const { return m; }
        int day() const { return d; }

        // ...
    private:
        char d;
        Month m;
        int y;
    };

For simplicity, I reverted to the (d,m,y) representation (§10.2).
The constructor might be defined like this:

    Date::Date(int dd, Month mm, int yy, int day_of_week)
        :d(dd), m(mm), y(yy)
    {
        if (d==0 && m==Month(0) && y==0) return;    // Date(0,0,0) is the "null date"

        if (mm<jan || dec<mm) throw Bad_date("bad month");

        if (dd<1 || 31<dd)    // oversimplified; see §10.3.1
            throw Bad_date("bad day of month");

        if (day_of_week && day_in_week(yy,mm,dd)!=day_of_week)
            throw Bad_date("bad day of week");
    }

    Date::Date() :d(0), m(Month(0)), y(0) { }    // a "null date"

The day_in_week() calculation is nontrivial and immaterial to the locale mechanisms, so I have
left it out. If you need one, your system will have one somewhere.
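For readers without such a function at hand, one well-known possibility is Sakamoto’s day-of-week method, sketched below. It is offered as an assumption about what day_in_week() could look like, not as the version omitted above; it handles Gregorian dates with positive years only and, like tm_wday, uses 0 for Sunday.

```cpp
// One possible day_in_week(): Sakamoto's method for the Gregorian calendar.
// y is the full year (e.g., 1999), m is in [1,12], d is in [1,31].
// Returns 0 for Sunday, 1 for Monday, ..., 6 for Saturday.
int day_in_week(int y, int m, int d)
{
    static const int offset[] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };
    if (m < 3) y -= 1;    // count January and February as part of the previous year
    return (y + y/4 - y/100 + y/400 + offset[m-1] + d) % 7;
}
```

Because the Date constructor treats a day_of_week of 0 as ‘‘not specified,’’ a Sunday computed this way would simply go unchecked.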
Comparison operations are always useful for types such as Date:

    bool operator==(const Date& x, const Date& y)
    {
        return x.year()==y.year() && x.month()==y.month() && x.day()==y.day();
    }

    bool operator!=(const Date& x, const Date& y)
    {
        return !(x==y);
    }

Having departed from the standard tm and time_t formats, we need conversion functions to
cooperate with software that expects those types:

    void Date::make_tm(tm* p) const    // put date into *p
    {
        tm x = { 0 };
        *p = x;
        p->tm_year = y-1900;
        p->tm_mday = d;
        p->tm_mon = m-1;
    }

    time_t Date::make_time_t() const
    {
        if (y<1970 || 2038<y)    // oversimplified
            throw Bad_date("date out of range for time_t");
        tm x;
        make_tm(&x);
        return mktime(&x);
    }

D.4.4.6 Specifying a Date Format
C++ doesn’t define a standard output format for dates (%x is as close as we get; §D.4.4.3).
However, even if a standard format existed, we would probably want to be able to use alternatives. This
could be done by providing a ‘‘default format’’ and a way of changing it. For example:

    class Date_format {
        static const char fmt[];    // default format
        const char* curr;           // current format
        const char* curr_end;
    public:
        Date_format() :curr(fmt), curr_end(fmt+strlen(fmt)) { }

        const char* begin() const { return curr; }
        const char* end() const { return curr_end; }

        void set(const char* p, const char* q) { curr=p; curr_end=q; }
        void set(const char* p) { curr=p; curr_end=curr+strlen(p); }

        static const char* default_fmt() { return fmt; }
    };

    const char Date_format::fmt[] = "%A, %B %d, %Y";    // e.g., Friday, February 05, 1999

    Date_format date_fmt;

To be able to use that strftime() format (§D.4.4.3), I have refrained from parameterizing the
Date_format class on the character type used. This implies that this solution allows only date
notations for which the format can be expressed as a char[]. I also used a global format object
(date_fmt) to provide a default Date format. Since the value of date_fmt can be changed, this
provides a crude way of controlling Date formatting, similar to the way global() (§D.2.3) can be
used to control formatting.
A more general solution is to add Date_in and Date_out facets to control reading and writing
from a stream. That approach is presented in §D.4.4.7.
Given Date_format, Date::operator<<() can be written like this:

    template<class Ch, class Tr>
    basic_ostream<Ch,Tr>& operator<<(basic_ostream<Ch,Tr>& s, const Date& d)
    // write according to user-specified format
    {
        typename basic_ostream<Ch,Tr>::sentry guard(s);    // see §21.3.8
        if (!guard) return s;

        tm t;
        d.make_tm(&t);
        try {
            const time_put<Ch>& f = use_facet< time_put<Ch> >(s.getloc());
            if (f.put(s,s,s.fill(),&t,date_fmt.begin(),date_fmt.end()).failed())
                s.setstate(ios_base::failbit);
        }
        catch (...) {
            handle_ioexception(s);    // see §D.4.2.2
        }
        return s;
    }

I could have used has_facet to verify that s’s locale had a time_put<Ch> facet. However, here it
seemed simpler to handle that problem by catching any exception thrown by use_facet.
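For comparison, the has_facet alternative would look roughly like this. The function name can_put_time is hypothetical; has_facet itself is the standard function template from <locale>.

```cpp
#include <locale>

// Hypothetical check: can loc supply a time_put facet for character type Ch?
template<class Ch>
bool can_put_time(const std::locale& loc)
{
    return std::has_facet< std::time_put<Ch> >(loc);
}
```

An output operator could call can_put_time<Ch>(s.getloc()) and set failbit when it returns false, instead of relying on the exception from use_facet. For the standard char and wchar_t facets the check always succeeds, so it matters mainly for other character types.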
Here is a simple test program that controls the output format through date_fmt:

    int main()
    try {
        Date dd;
        while (cin >> dd && dd != Date()) cout << dd << endl;    // write using default date_fmt

        date_fmt.set("%Y/%m/%d");

        while (cin >> dd && dd != Date()) cout << dd << endl;    // write using "%Y/%m/%d"
    }
    catch (Date::Bad_date e) {
        cout << "bad date caught: " << e.why << endl;
    }

D.4.4.7 A Date Input Facet
As ever, input is a bit more difficult than output. However, because the interface to low-level input
is fixed by get_date() and because the operator>>() defined for Date in §D.4.4.4 didn’t directly
access the representation of a Date, we could use that operator>>() unchanged. Here is a
templatized version to match the operator<<() from §D.4.4.6:

    template<class Ch, class Tr>
    basic_istream<Ch,Tr>& operator>>(basic_istream<Ch,Tr>& s, Date& d)
    {
        typename basic_istream<Ch,Tr>::sentry guard(s);
        if (guard) try {
            ios_base::iostate res = 0;
            tm x = { 0 };
            istreambuf_iterator<Ch,Tr> end;

            use_facet< time_get<Ch> >(s.getloc()).get_date(s,end,s,res,&x);

            if (res==0 || res==ios_base::eofbit)
                d = Date(x.tm_mday,Date::Month(x.tm_mon+1),x.tm_year+1900,x.tm_wday);
            else
                s.setstate(res);
        }
        catch (...) {
            handle_ioexception(s);    // see §D.4.2.2
        }
        return s;
    }

This Date input operator calls get_date() from the istream’s time_get facet (§D.4.4.4).
Therefore, we can provide a different and more flexible form of input by defining a new facet derived
from time_get:

    template<class Ch, class In = istreambuf_iterator<Ch> >
    class Date_in : public std::time_get<Ch,In> {
    public:
        Date_in(size_t r = 0) : std::time_get<Ch,In>(r) { }
    protected:
        In do_get_date(In b, In e, ios_base& s, ios_base::iostate& r, tm* tmp) const;
    private:
        enum Vtype { novalue, unknown, dayofweek, month };
        In getval(In b, In e, ios_base& s, ios_base::iostate& r, int* v, Vtype* res) const;
    };

Using getval(), do_get_date() needs to read a year, a month, a day of the month, and optionally a day of the week and compose the result into a tm.
The names of the months and the names of the days of the week are locale specific. Consequently, we can’t mention them directly in our input function. Instead, we recognize months and days by calling the functions that time_get provides for that: get_monthname() and get_weekday() (§D.4.4.4).
The year, the day of the month, and possibly the month are represented as integers. Unfortunately, a number does not indicate whether it denotes a day, a month, or a year. For example, 7 could denote July, day 7 of a month, or even the year 2007. The real purpose of time_get’s date_order() is to resolve such ambiguities.
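As a small illustration, a program can ask a locale’s time_get facet which order it prefers. This is only a sketch: the helper names order_name and preferred_order are hypothetical, and the value reported for a given locale is implementation-defined (no_order is a legitimate answer).

```cpp
#include <locale>
#include <string>

// Hypothetical helper: turn a dateorder value into readable text.
std::string order_name(std::time_base::dateorder d)
{
    switch (d) {
    case std::time_base::dmy: return "dmy";
    case std::time_base::mdy: return "mdy";
    case std::time_base::ymd: return "ymd";
    case std::time_base::ydm: return "ydm";
    default:                  return "no_order";
    }
}

// Hypothetical helper: ask the time_get<char> facet of loc for its preferred order.
std::string preferred_order(const std::locale& loc)
{
    return order_name(std::use_facet< std::time_get<char> >(loc).date_order());
}
```

A date reader can consult this value only after it has classified the values it read, which is the approach taken by Date_in.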
The strategy of Date_in is to read values, classify them, and then use date_order() to see whether (or how) the values entered make sense. The private getval() function does the actual reading from the stream buffer and the initial classification:
    template<class Ch, class In>
    In Date_in<Ch,In>::getval(In b, In e,
            ios_base& s, ios_base::iostate& r, int* v, Vtype* res) const
        // read part of Date: number, day_of_week, or month. Skip whitespace and punctuation.
    {
        const ctype<Ch>& ct = use_facet< ctype<Ch> >(s.getloc());    // ctype is defined in §D.4.5
        Ch c;
        *res = novalue;    // no value found

        for (;;) {    // skip whitespace and punctuation
            if (b == e) return e;
            c = *b;
            if (!(ct.is(ctype_base::space,c) || ct.is(ctype_base::punct,c))) break;
            ++b;
        }

        if (ct.is(ctype_base::digit,c)) {
            int i = 0;    // read integer without regard for numpunct
            do {    // turn digit from arbitrary character set into decimal value:
                static char const digits[] = "0123456789";
                i = i*10 + find(digits,digits+10,ct.narrow(c,' '))-digits;
                if (++b == e) break;    // never dereference the end iterator
                c = *b;
            } while (ct.is(ctype_base::digit,c));
            *v = i;
            *res = unknown;    // an integer, but we don't know what it represents
            return b;
        }

        if (ct.is(ctype_base::alpha,c)) {    // look for name of month or day of week
            basic_string<Ch> str;
            while (ct.is(ctype_base::alpha,c)) {    // read characters into string
                str += c;
                if (++b == e) break;
                c = *b;
            }

            tm t;
            basic_stringstream<Ch> ss(str);
            typedef istreambuf_iterator<Ch> SI;    // iterator type for ss' buffer
            this->get_monthname(ss.rdbuf(),SI(),s,r,&t);    // read from in-memory stream buffer
            if ((r&(ios_base::badbit|ios_base::failbit))==0) {
                r &= ~ios_base::eofbit;    // end of the in-memory buffer is not end of input
                *v = t.tm_mon;
                *res = month;
                return b;
            }

            r = ios_base::goodbit;    // clear state before trying to read a second time
            ss.rdbuf()->pubseekpos(0,ios_base::in);    // reread the buffer from its start
            this->get_weekday(ss.rdbuf(),SI(),s,r,&t);    // read from in-memory stream buffer
            if ((r&(ios_base::badbit|ios_base::failbit))==0) {
                r &= ~ios_base::eofbit;
                *v = t.tm_wday;
                *res = dayofweek;
                return b;
            }
        }
        r |= ios_base::failbit;
        return b;
    }
The tricky part here is to distinguish months from weekdays. We read through input iterators, so we cannot read [b,e) twice, looking first for a month and then for a day. On the other hand, we cannot look at one character at a time and decide, because only get_monthname() and get_weekday() know which character sequences make up the names of the months and the names of the days of the week in a given locale. The solution I chose was to read strings of alphabetic characters into a string, make a stringstream from that string, and then repeatedly read from that stream’s streambuf.
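The idea of buffering a word and offering it first to get_monthname() and then to get_weekday() can be tried in isolation. The following is a minimal sketch, not the Date_in code: the names Name_kind and classify_name are hypothetical, it is restricted to char, and it sidesteps rereading by building a fresh istringstream for each attempt.

```cpp
#include <ctime>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

enum Name_kind { not_a_name, month_name, weekday_name };    // hypothetical classification

// Classify word as a month or weekday name according to loc.
// On success, *v receives tm_mon (0-11) or tm_wday (0-6, Sunday is 0).
Name_kind classify_name(const std::string& word, const std::locale& loc, int* v)
{
    typedef std::istreambuf_iterator<char> SI;
    const std::time_get<char>& tg = std::use_facet< std::time_get<char> >(loc);
    const std::ios_base::iostate bad = std::ios_base::badbit | std::ios_base::failbit;

    {   // first attempt: is it the name of a month?
        std::istringstream ss(word);
        ss.imbue(loc);
        std::ios_base::iostate r = std::ios_base::goodbit;
        std::tm t = std::tm();
        tg.get_monthname(SI(ss.rdbuf()), SI(), ss, r, &t);
        if ((r & bad) == 0) {
            *v = t.tm_mon;
            return month_name;
        }
    }
    {   // second attempt, on a fresh buffer: is it the name of a weekday?
        std::istringstream ss(word);
        ss.imbue(loc);
        std::ios_base::iostate r = std::ios_base::goodbit;
        std::tm t = std::tm();
        tg.get_weekday(SI(ss.rdbuf()), SI(), ss, r, &t);
        if ((r & bad) == 0) {
            *v = t.tm_wday;
            return weekday_name;
        }
    }
    return not_a_name;
}
```

Only badbit and failbit are tested; eofbit is expected here because the whole in-memory buffer is usually consumed.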
The error recording uses the state bits, such as ios_base::badbit, directly. This is necessary because the more convenient functions for manipulating stream state, such as clear() and setstate(), are defined in basic_ios rather than in its base ios_base (§21.3.3). If necessary, the >> operator then uses the error results reported by get_date() to reset the state of the input stream.
Given getval(), we can read values first and then try to see whether they make sense later. The date_order() can be crucial:
    template<class Ch, class In>
    In Date_in<Ch,In>::do_get_date(In b, In e, ios_base& s, ios_base::iostate& r, tm* tmp) const
        // optional day of week followed by ymd, dmy, mdy, or ydm
    {
        int val[3];                     // for day, month, and year values in some order
        Vtype res[3] = { novalue };     // for value classifications

        for (int i=0; b!=e && i<3; ++i) {    // read day, month, and year
            b = getval(b,e,s,r,&val[i],&res[i]);
            if (r) return b;                 // oops: error
            if (res[i]==novalue) {           // couldn't complete date
                r |= ios_base::badbit;
                return b;
            }
            if (res[i]==dayofweek) {
                tmp->tm_wday = val[i];
                --i;                         // oops: not a day, month, or year
            }
        }

        time_base::dateorder order = this->date_order();    // now try to make sense of the values read
        if (res[0] == month) {               // mdy or error
            // ...
        }
        else if (res[1] == month) {          // dmy or ymd or error
            tmp->tm_mon = val[1];
            switch (order) {
            case time_base::dmy:
                tmp->tm_mday = val[0];
                tmp->tm_year = val[2];
                break;
            case time_base::ymd:
                tmp->tm_year = val[0];
                tmp->tm_mday = val[2];
                break;
            default:
                r |= ios_base::badbit;
                return b;
            }
        }
        else if (res[2] == month) {          // ydm or error
            // ...
        }
        else {                               // rely on dateorder or error
            // ...
        }

        tmp->tm_year -= 1900;                // adjust base year to suit tm convention
        return b;
    }
I have omitted bits of code that do not add to the understanding of locales, dates, or the handling of input. Writing better and more general date input functions is left as an exercise (§D.6[9-10]).
Here is a simple test program:
    int main()
    try {
        cin.imbue(locale(locale(),new Date_in<char>));    // read Dates using Date_in
        Date dd;
        while (cin >> dd && dd != Date()) cout << dd << endl;
    }
    catch (Date::Bad_date e) {
        cout << "bad date caught: " << e.why << endl;
    }
Note that do_get_date() will accept meaningless dates, such as
    Thursday October 7, 1998
and
    1999/Feb/31
The checks for consistency of the year, month, day, and optional day of the week are done in Date’s constructor. It is the Date class’ job to know what constitutes a correct date, and it is not necessary for Date_in to share that knowledge.
It would be possible to have getval() or do_get_date() guess about the meaning of numeric values. For example,
    12 May 1922
is clearly not the day 1922 of year 12. That is, we could ‘‘guess’’ that a numeric value that couldn’t be a day of the specified month must be a year. Such ‘‘guessing’’ can be useful in a specific constrained context. However, it is not a good idea in more general contexts. For example,
    12 May 15
could be a date in the year 12, 15, 1912, 1915, 2012, or 2015. Sometimes, a better approach is to augment the notation with clues that disambiguate years and days. For example, 1st and 15th are clearly days of a month. Similarly, 751BC and 1453AD are explicitly identified as years.
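A reader that accepts such clues could classify each numeric token before consulting date_order(). The sketch below is only an illustration under simplifying assumptions: the names Clue and classify_clue are hypothetical, and the suffixes are hard-wired English text, which a real facet would have to obtain from the locale.

```cpp
#include <cctype>
#include <string>

enum Clue { plain_number, day_of_month, year_bc, year_ad, not_numeric };    // hypothetical

// Classify a token such as "12", "1st", "15th", "751BC", or "1453AD".
// *value is set whenever the token starts with digits.
Clue classify_clue(const std::string& tok, int* value)
{
    std::string::size_type i = 0;
    int n = 0;
    while (i < tok.size() && std::isdigit(static_cast<unsigned char>(tok[i]))) {
        n = n*10 + (tok[i]-'0');
        ++i;
    }
    if (i == 0) return not_numeric;    // no leading digits

    *value = n;
    const std::string suffix = tok.substr(i);
    if (suffix.empty()) return plain_number;    // still ambiguous: day, month, or year
    if (suffix=="st" || suffix=="nd" || suffix=="rd" || suffix=="th") return day_of_month;
    if (suffix=="BC") return year_bc;
    if (suffix=="AD") return year_ad;
    return not_numeric;    // digits followed by an unknown suffix
}
```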
D.4.5 Character Classification
When reading characters from input, it is often necessary to classify them to make sense of what is being read. For example, to read a number, an input routine needs to know which characters are digits. Similarly, §6.1.2 showed a use of standard character classification functions for parsing input.
Naturally, classification of characters depends on the alphabet used. Consequently, a facet ctype is provided to represent character classification in a locale.
The character classes are described by an enumeration called mask:
    class std::ctype_base {
    public:
        enum mask {                 // the actual values are implementation defined
            space = 1,              // whitespace (in "C" locale: ' ', '\n', '\t', ...)
            print = 1<<1,           // printing characters
            cntrl = 1<<2,           // control characters
            upper = 1<<3,           // uppercase characters
            lower = 1<<4,           // lowercase characters
            alpha = 1<<5,           // alphabetic characters
            digit = 1<<6,           // decimal digits
            punct = 1<<7,           // punctuation characters
            xdigit = 1<<8,          // hexadecimal digits
            alnum = alpha|digit,    // alphanumeric characters
            graph = alnum|punct
        };
    };
This mask doesn’t depend on a particular character type. Consequently, this enumeration is placed in a (non-template) base class.
Clearly, mask reflects the traditional C and C++ classification (§20.4.1). However, for different character sets, different character values fall into different classes. For example, for the ASCII character set, the integer value 125 represents the character '}', which is a punctuation character (punct). However, in the Danish national character set, 125 represents the vowel 'å', which in a Danish locale must be classified as an alpha.
The classification is called a ‘‘mask’’ because the traditional efficient implementation of character classification for small character sets is a table in which each entry holds bits representing the classification. For example:
    table['a'] == lower|alpha|xdigit
    table['1'] == digit
    table[' '] == space
Given that implementation, table[c]&m is nonzero if the character c is an m and 0 otherwise.
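To make the table technique concrete, here is a minimal sketch for a small, fixed character set. Everything in namespace sketch is hypothetical (it is not how any particular library implements ctype), and only a few of the classes are filled in.

```cpp
#include <climits>

namespace sketch {    // hypothetical names, not the standard library's

    enum mask { space = 1, upper = 1<<3, lower = 1<<4, alpha = 1<<5, digit = 1<<6, xdigit = 1<<8 };

    class Table {
        unsigned short t[UCHAR_MAX+1];    // one set of classification bits per character value

        void add(const char* chars, unsigned short m)    // give every character in chars the bits m
        {
            for (; *chars; ++chars) t[static_cast<unsigned char>(*chars)] |= m;
        }
    public:
        Table()
        {
            for (int i = 0; i <= UCHAR_MAX; ++i) t[i] = 0;
            add(" \t\n\v\f\r", space);
            add("abcdefghijklmnopqrstuvwxyz", lower|alpha);
            add("ABCDEFGHIJKLMNOPQRSTUVWXYZ", upper|alpha);
            add("0123456789", digit|xdigit);
            add("abcdefABCDEF", xdigit);
        }

        // table[c]&m: true if c belongs to at least one of the classes in m
        bool is(unsigned short m, char c) const
        {
            return (t[static_cast<unsigned char>(c)] & m) != 0;
        }
    };
}
```

A lookup costs one index operation and one bitwise and, whatever the number of classes tested.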
The ctype facet is defined like this:
    template <class Ch>
    class std::ctype : public locale::facet, public ctype_base {
    public:
        typedef Ch char_type;
        explicit ctype(size_t r = 0);

        bool is(mask m, Ch c) const;    // is "c" an "m"?

        // place classification for each Ch in [b:e) into v:
        const Ch* is(const Ch* b, const Ch* e, mask* v) const;

        const Ch* scan_is(mask m, const Ch* b, const Ch* e) const;     // find an m
        const Ch* scan_not(mask m, const Ch* b, const Ch* e) const;    // find a non-m

        Ch toupper(Ch c) const;
        const Ch* toupper(Ch* b, const Ch* e) const;    // convert [b:e)
        Ch tolower(Ch c) const;
        const Ch* tolower(Ch* b, const Ch* e) const;

        Ch widen(char c) const;
        const char* widen(const char* b, const char* e, Ch* b2) const;
        char narrow(Ch c, char def) const;
        const Ch* narrow(const Ch* b, const Ch* e, char def, char* b2) const;

        static locale::id id;    // facet identifier object (§D.2, §D.3, §D.3.1)
    protected:
        ~ctype();
        // virtual ‘‘do_’’ functions for public functions (see §D.4.1)
    };
A call is(m,c) tests whether the character c belongs to the classification m. For example:
    int count_spaces(const string& s, const locale& loc)
    {
        const ctype<char>& ct = use_facet< ctype<char> >(loc);
        int i = 0;
        for(string::const_iterator p = s.begin(); p != s.end(); ++p)
            if (ct.is(ctype_base::space,*p)) ++i;    // whitespace as defined by ct
        return i;
    }
Note that it is also possible to use is() to check whether a character belongs to one of a number of classifications. For example:
    ct.is(ctype_base::space|ctype_base::punct,c);    // is c whitespace or punctuation in ct?
A call is(b,e,v) determines the classification of each character in [b,e) and places it in the corresponding position in the array v.
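The three-argument is() can classify a whole buffer in one call. A minimal sketch follows; the function name classify is hypothetical and the example is restricted to char.

```cpp
#include <locale>
#include <string>
#include <vector>

// Return one mask per character of s, as classified by loc's ctype<char> facet.
std::vector<std::ctype_base::mask> classify(const std::string& s, const std::locale& loc)
{
    std::vector<std::ctype_base::mask> v(s.size());
    if (!s.empty()) {
        const std::ctype<char>& ct = std::use_facet< std::ctype<char> >(loc);
        ct.is(s.data(), s.data()+s.size(), &v[0]);    // fill v[i] with the classes of s[i]
    }
    return v;
}
```

Each element of the result can then be tested with &, for example (v[i] & ctype_base::digit) != 0.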
A call scan_is(m,b,e) returns a pointer to the first character in [b,e) that is an m. If no character is classified as an m, e is returned. As ever for standard facets, the public member function is implemented by a call to its ‘‘do_’’ virtual function. A simple implementation might be:
    template <class Ch>
    const Ch* std::ctype<Ch>::do_scan_is(mask m, const Ch* b, const Ch* e) const
    {
        while (b!=e && !is(m,*b)) ++b;
        return b;
    }
A call scan_not(m,b,e) returns a pointer to the first character in [b,e) that is not an m. If all characters are classified as m, e is returned.
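Used together, scan_not() and scan_is() delimit a token. The following sketch extracts the first whitespace-delimited word of a string; the function name first_word is hypothetical.

```cpp
#include <locale>
#include <string>

// Return the first whitespace-delimited word of s, using loc's notion of whitespace.
std::string first_word(const std::string& s, const std::locale& loc)
{
    const std::ctype<char>& ct = std::use_facet< std::ctype<char> >(loc);
    const char* b = s.data();
    const char* e = b + s.size();
    const char* first = ct.scan_not(std::ctype_base::space, b, e);      // skip leading whitespace
    const char* last = ct.scan_is(std::ctype_base::space, first, e);    // find the end of the word
    return std::string(first, last);
}
```

If s holds only whitespace, both calls return e and the result is the empty string.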
A call toupper(c) returns the uppercase version of c if such a version exists in the character set used and c itself otherwise.
A call toupper(b,e) converts each character in the range [b,e) to uppercase and returns e. A simple implementation might be:
    template <class Ch>
    const Ch* std::ctype<Ch>::do_toupper(Ch* b, const Ch* e) const
    {
        for (; b!=e; ++b) *b = toupper(*b);
        return e;
    }
The tolower() functions are similar to toupper() except that they convert to lowercase.
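From the user’s side, the range version converts a buffer in place. A minimal sketch, with the hypothetical function name to_upper_copy:

```cpp
#include <locale>
#include <string>

// Return a copy of s with every character converted to uppercase as defined by loc.
std::string to_upper_copy(std::string s, const std::locale& loc)
{
    if (!s.empty()) {
        const std::ctype<char>& ct = std::use_facet< std::ctype<char> >(loc);
        ct.toupper(&s[0], &s[0] + s.size());    // converts [b,e) in place
    }
    return s;
}
```

Characters without an uppercase version, such as digits and punctuation, are left unchanged.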
A call widen(c) transforms the character c into its corresponding Ch value. If Ch’s character set provides several characters corresponding to c, the standard specifies that ‘‘the simplest reasonable transformation’’ be used. For example,
    wcout << use_facet< ctype<wchar_t> >(wcout.getloc()).widen('e');
will output a reasonable equivalent to the character e in wcout’s locale.
Translation between unrelated character representations, such as ASCII and EBCDIC, can also be done by using widen(). For example, assume that an ebcdic locale exists:
    char EBCDIC_e = use_facet< ctype<char> >(ebcdic).widen('e');
A call widen(b,e,v) takes each character in the range [b,e) and places a widened version in the corresponding position in the array v.
A call narrow(ch,def) produces a char value corresponding to the character ch from the Ch type. Again, ‘‘the simplest reasonable transformation’’ is to be used. If no such corresponding char exists, def is returned.
A call narrow(b,e,def,v) takes each character in the range [b,e) and places a narrowed version in the corresponding position in the array v.
The general idea is that narrow() converts from a larger character set to a smaller one and that widen() performs the inverse operation. For a character c from the smaller character set, we expect:
    c == narrow(widen(c),0)    // not guaranteed
This is true provided that the character represented by c has only one representation in ‘‘the smaller character set.’’ However, that is not guaranteed. If the characters represented by a char are not a subset of those represented by the larger character set (Ch), we should expect anomalies and potential problems with code treating characters generically.
Similarly, for a character ch from the larger character set, we might expect:
    widen(narrow(ch,def)) == ch || widen(narrow(ch,def)) == widen(def)    // not guaranteed
However, even though this is often the case, it cannot be guaranteed for a character that is represented by several values in the larger character set but only once in the smaller character set. For example, a digit, such as 7, often has several separate representations in a large character set. The reason for that is typically that a large character set has several conventional character sets as subsets and that the characters from the smaller sets are replicated for ease of conversion.
For every character in the basic source character set (§C.3.3), it is guaranteed that
    widen(narrow(ch_lit,0)) == ch_lit
For example:
    widen(narrow('x',0)) == 'x'
The narrow() and widen() functions respect character classifications wherever possible. For example, if is(alpha,c), then is(alpha,narrow(c,'a')) and is(alpha,widen(c)) wherever alpha is a valid mask for the locale used.
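The range versions of widen() and narrow() make whole-string conversions straightforward. The sketch below uses the ctype<wchar_t> facet; the function names widen_string and narrow_string are hypothetical.

```cpp
#include <locale>
#include <string>

// Widen every char of s using loc's ctype<wchar_t> facet.
std::wstring widen_string(const std::string& s, const std::locale& loc)
{
    std::wstring w(s.size(), L'\0');
    if (!s.empty())
        std::use_facet< std::ctype<wchar_t> >(loc).widen(s.data(), s.data()+s.size(), &w[0]);
    return w;
}

// Narrow every wchar_t of w; characters without a char equivalent become def.
std::string narrow_string(const std::wstring& w, char def, const std::locale& loc)
{
    std::string s(w.size(), '\0');
    if (!w.empty())
        std::use_facet< std::ctype<wchar_t> >(loc).narrow(w.data(), w.data()+w.size(), def, &s[0]);
    return s;
}
```

For text made up of characters from the basic source character set, the two functions are inverses of each other; for other characters, the caveats above apply.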
A major reason for using a ctype facet in general and for using narrow() and widen() functions in particular is to be able to write code that does I/O and string manipulation for any character set; that is, to make such code generic with respect to character sets. This implies that iostream implementations depend critically on these facilities. By relying on <iostream> and <string>, a user can avoid most direct uses of the ctype facet.
A _byname version (§D.4, §D.4.1) of ctype is provided:
    template <class Ch> class std::ctype_byname : public ctype<Ch> { /* ... */ };
D.4.5.1 Convenience Interfaces
The most common use of the ctype facet is to inquire whether a character belongs to a given classification. Consequently, a set of functions is provided for that:
    template <class Ch> bool isspace(Ch c, const locale& loc);
    template <class Ch> bool isprint(Ch c, const locale& loc);
    template <class Ch> bool iscntrl(Ch c, const locale& loc);
    template <class Ch> bool isupper(Ch c, const locale& loc);
    template <class Ch> bool islower(Ch c, const locale& loc);
    template <class Ch> bool isalpha(Ch c, const locale& loc);
    template <class Ch> bool isdigit(Ch c, const locale& loc);
    template <class Ch> bool ispunct(Ch c, const locale& loc);
    template <class Ch> bool isxdigit(Ch c, const locale& loc);
    template <class Ch> bool isalnum(Ch c, const locale& loc);
    template <class Ch> bool isgraph(Ch c, const locale& loc);
These functions are trivially implemented by using use_facet. For example:
    template <class Ch>
    inline bool isspace(Ch c, const locale& loc)
    {
        return use_facet< ctype<Ch> >(loc).is(ctype_base::space,c);
    }
The one-argument versions of these functions, presented in §20.4.2, are simply these functions for the current C global locale (not the global C++ locale, locale()). Except for the rare cases in which the C global locale and the C++ global locale differ (§D.2.3), we can think of a one-argument version as the two-argument version applied to locale(). For example:
    inline int isspace(int i)
    {
        return isspace(i,locale());    // almost
    }
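The two-argument functions keep locale-sensitive code short. As a minimal sketch, here is a check for identifier-like strings; the function name is_identifier and its rules (a letter or underscore followed by letters, digits, or underscores) are assumptions made for the example.

```cpp
#include <locale>
#include <string>

// True if s is a letter or '_' followed by letters, digits, or '_',
// where "letter" and "digit" are defined by loc.
bool is_identifier(const std::string& s, const std::locale& loc)
{
    if (s.empty()) return false;
    if (!(std::isalpha(s[0], loc) || s[0]=='_')) return false;
    for (std::string::size_type i = 1; i < s.size(); ++i)
        if (!(std::isalnum(s[i], loc) || s[i]=='_')) return false;
    return true;
}
```

Each call looks up the ctype facet again; code that classifies many characters may prefer to fetch the facet once with use_facet.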
D.4.6 Character Code Conversion
Sometimes, the representation of characters stored in a file differs from the desired representation of those same characters in main memory. For example, Japanese characters are often stored in files in which indicators (‘‘shifts’’) tell to which of the four common character sets (kanji, katakana, hiragana, and romaji) a given sequence of characters belongs. This is a bit unwieldy because the meaning of each byte depends on its ‘‘shift state,’’ but it can save memory because only a kanji requires more than one byte for its representation. In main memory, these characters are easier to manipulate when represented in a multi-byte character set where every character has the same size. Such characters (for example, Unicode characters) are typically placed in wide characters (wchar_t; §4.3). Consequently, the codecvt facet provides a mechanism for converting characters from one representation to another as they are read or written. For example:
    Disk representation:            JIS
                                     |
                                     |    I/O conversions controlled by codecvt
                                     |
    Main memory representation:     Unicode
This code-conversion mechanism is general enough to provide arbitrary conversions of character representations. It allows us to write a program to use a suitable internal character representation (stored in char, wchar_t, or whatever) and to then accept a variety of input character stream representations by adjusting the locale used by iostreams. The alternative would be to modify the program itself or to convert input and output files from/to a variety of formats.
The codecvt facet provides conversion between different character sets when a character is moved between a stream buffer and external storage:
    class std::codecvt_base {
    public:
        enum result { ok, partial, error, noconv };    // result indicators
    };

    template <class I, class E, class State>
    class std::codecvt : public locale::facet, public codecvt_base {
    public:
        typedef I intern_type;
        typedef E extern_type;
        typedef State state_type;

        explicit codecvt(size_t r = 0);

        result in(State&, const E* from, const E* from_end, const E*& from_next,    // read
                  I* to, I* to_end, I*& to_next) const;
        result out(State&, const I* from, const I* from_end, const I*& from_next,    // write
                   E* to, E* to_end, E*& to_next) const;
        result unshift(State&, E* to, E* to_end, E*& to_next) const;    // end character sequence

        int encoding() const throw();          // characterize basic encoding properties
        bool always_noconv() const throw();    // can we do I/O without code translation?

        int length(const State&, const E* from, const E* from_end, size_t max) const;
        int max_length() const throw();        // maximum possible length()

        static locale::id id;    // facet identifier object (§D.2, §D.3, §D.3.1)
    protected:
        ~codecvt();
        // virtual ‘‘do_’’ functions for public functions (see §D.4.1)
    };
A codecvt facet is used by basic_filebuf (§21.5) to read or write characters. A basic_filebuf obtains this facet from the stream’s locale (§21.7.1).
The State template argument is the type used to hold the shift state of the stream being converted. State can also be used to identify different conversions by specifying a specialization. The latter is useful because characters of a variety of character encodings (character sets) can be stored in objects of the same type. For example:
    class JISstate { /* .. */ };
    p = new codecvt<wchar_t,char,mbstate_t>;    // standard char to wide char
    q = new codecvt<wchar_t,char,JISstate>;     // JIS to wide char
Without the different State arguments, there would be no way for the facet to know which encoding to assume for the stream of chars. The mbstate_t type from <cwchar> or <wchar.h> identifies the system’s standard conversion between char and wchar_t.
A new codecvt can also be created as a derived class and identified by name. For example:
    class JIScvt : public codecvt<wchar_t,char,mbstate_t> { /* ... */ };
A call in(s,from,from_end,from_next,to,to_end,to_next) reads each character in the range [from,from_end) and tries to convert it. If a character is converted, in() writes its converted form to the corresponding position in the [to,to_end) range; if not, in() stops at that point. Upon return, in() stores the position one-beyond-the-last character read in from_next and the position one-beyond-the-last character written in to_next. The result value returned by in() indicates how much work was done:
    ok:         all characters in the [from,from_end) range were converted
    partial:    not all characters in the [from,from_end) range were converted
    error:      in() encountered a character it couldn’t convert
    noconv:     no conversion was needed
Note that a partial conversion is not necessarily an error. Possibly more characters have to be read before a multibyte character is complete and can be written, or maybe the output buffer has to be emptied to make room for more characters.
The s argument of type State indicates the state of the input character sequence at the start of the call of in(). This is significant when the external character representation uses shift states. Note that s is a (non-const) reference argument: At the end of the call, s holds the shift state of the input sequence. This allows a programmer to deal with partial conversions and to convert a long sequence using several calls to in().
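The way from_next, to_next, and the state cooperate across several calls of in() can be sketched as follows. This is an illustration only: the function name to_wide is hypothetical, the output buffer is deliberately tiny so that partial results occur, and the treatment of a partial result that makes no progress (reporting truncated input) is an assumption about what a caller would want.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Convert the external (char) sequence s to wide characters using loc's
// codecvt<wchar_t,char,mbstate_t> facet, a few characters at a time.
std::wstring to_wide(const std::string& s, const std::locale& loc)
{
    typedef std::codecvt<wchar_t,char,std::mbstate_t> Cvt;
    const Cvt& cvt = std::use_facet<Cvt>(loc);

    std::mbstate_t state = std::mbstate_t();    // initial (unshifted) state
    const char* from = s.data();
    const char* from_end = from + s.size();
    std::wstring result;

    while (from != from_end) {
        wchar_t buf[4];    // small on purpose: forces several calls of in()
        const char* from_next = from;
        wchar_t* to_next = buf;
        Cvt::result r = cvt.in(state, from, from_end, from_next, buf, buf+4, to_next);

        if (r == Cvt::error) throw std::runtime_error("to_wide: cannot convert input");
        if (r == Cvt::noconv) {    // no conversion needed: copy the characters as they are
            for (; from != from_end; ++from)
                result += static_cast<wchar_t>(static_cast<unsigned char>(*from));
            break;
        }
        result.append(buf, to_next);    // keep what was converted (ok or partial)
        if (from_next == from && to_next == buf)    // partial, but nothing was consumed or produced
            throw std::runtime_error("to_wide: incomplete character at end of input");
        from = from_next;    // continue where in() stopped; state carries any shift state
    }
    return result;
}
```

Because state is passed by reference and reused, a sequence with shift states can be converted piece by piece without losing track of the current shift.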
A call out(s,from,from_end,from_next,to,to_end,to_next) converts [from,from_end) from the internal to the external representation in the same way that in() converts from the external to the internal representation.
A character stream must start and end in a ‘‘neutral’’ (unshifted) state. Typically, that state is State(). A call unshift(s,to,to_end,to_next) looks at s and places characters in [to,to_end) as needed to bring a sequence of characters back to that unshifted state. The result of unshift() and its use of to_next are just like those of out().
A call length(s,from,from_end,max) returns the number of characters that in() could convert from [from,from_end).
A call encoding() returns
    -1    if the encoding of the external character set uses state (for example, uses shift and unshift character sequences)
     0    if the encoding uses a varying number of bytes to represent individual characters (for example, a character representation might use a bit in a byte to indicate whether one or two bytes are used to represent that character)
     n    if every character of the external character representation is n bytes
A call always_noconv() returns true if no conversion is required between the internal and the external character sets and false otherwise. Clearly, always_noconv()==true opens the possibility for the implementation to provide the maximally efficient implementation that simply doesn’t invoke the conversion functions.
A call max_length() returns the maximum value that length() can return for a valid set of arguments.
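These properties can be inspected for any codecvt facet installed in a locale. A minimal sketch follows; the struct Encoding_info and the function describe are hypothetical names. For codecvt<char,char,mbstate_t>, the standard specifies a one-byte encoding that never needs conversion; the values for other specializations depend on the implementation and the locale.

```cpp
#include <cwchar>
#include <locale>

struct Encoding_info {    // hypothetical summary of a codecvt facet
    int encoding;         // -1: state dependent, 0: varying width, n: n bytes per character
    bool always_noconv;   // true if in() and out() never need to convert
    int max_length;       // largest value length() can return
};

// Collect the basic encoding properties of loc's codecvt<I,E,mbstate_t> facet.
template<class I, class E>
Encoding_info describe(const std::locale& loc)
{
    typedef std::codecvt<I,E,std::mbstate_t> Cvt;
    const Cvt& cvt = std::use_facet<Cvt>(loc);
    Encoding_info info = { cvt.encoding(), cvt.always_noconv(), cvt.max_length() };
    return info;
}
```

For example, describe<char,char>(locale::classic()) reports the degenerate conversion, whereas describe<wchar_t,char>(loc) characterizes the multibyte encoding of loc.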
The simplest code conversion that I can think of is one that converts input to uppercase. Thus,
this is about as simple as a ccooddeeccvvtt can be and still perform a service:
ccllaassss C
Cvvtt__ttoo__uuppppeerr : ppuubblliicc ccooddeeccvvtt<cchhaarr,cchhaarr,m
mbbssttaattee__tt> {
// convert to uppercase
eexxpplliicciitt C
Cvvtt__ttoo__uuppppeerr(ssiizzee__tt r = 00) : ccooddeeccvvtt(rr) { }
pprrootteecctteedd:
// read external representation write internal representation:
rreessuulltt ddoo__iinn(SSttaattee& ss, ccoonnsstt cchhaarr* ffrroom
m, ccoonnsstt cchhaarr* ffrroom
m__eenndd, ccoonnsstt cchhaarr*& ffrroom
m__nneexxtt,
cchhaarr* ttoo, cchhaarr* ttoo__eenndd, cchhaarr*& ttoo__nneexxtt) ccoonnsstt;
// read internal representation write external representation:
rreessuulltt ddoo__oouutt(SSttaattee& ss, ccoonnsstt cchhaarr* ffrroom
m, ccoonnsstt cchhaarr* ffrroom
m__eenndd, ccoonnsstt cchhaarr*& ffrroom
m__nneexxtt,
cchhaarr* ttoo, cchhaarr* ttoo__eenndd, cchhaarr*& ttoo__nneexxtt) ccoonnsstt
{
rreettuurrnn ccooddeeccvvtt<cchhaarr,cchhaarr,m
mbbssttaattee__tt>::ddoo__oouutt
(ss,ffrroom
m,ffrroom
m__eenndd,ffrroom
m__nneexxtt,ttoo,ttoo__eenndd,ttoo__nneexxtt);
}
The C++ Programming Language, Special Edition by Bjarne Stroustrup. Copyright 2000 by AT&T.
Published by Addison Wesley Inc. ISBN 0-201-70073-5. All rights reserved.
928
Locales
Appendix D
rreessuulltt ddoo__uunnsshhiifftt(SSttaattee&, E
E* ttoo, E
E* ttoo__eenndd, E
E*& ttoo__nneexxtt) ccoonnsstt { rreettuurrnn ookk; }
iinntt ddoo__eennccooddiinngg() ccoonnsstt tthhrroow
w() { rreettuurrnn 11; }
bbooooll ddoo__aallw
waayyss__nnooccoonnvv() ccoonnsstt tthhrroow
w() { rreettuurrnn ffaallssee; }
iinntt ddoo__lleennggtthh(ccoonnsstt SSttaattee&, ccoonnsstt E
E* ffrroom
m, ccoonnsstt E
E* ffrroom
m__eenndd, ssiizzee__tt m
maaxx) ccoonnsstt;
iinntt ddoo__m
maaxx__lleennggtthh() ccoonnsstt tthhrroow
w();
// maximum possible length()
};
codecvt<char,char,mbstate_t>::result
Cvt_to_upper::do_in(State& s, const char* from, const char* from_end,
    const char*& from_next, char* to, char* to_end, char*& to_next) const
{
    // ... §D.6[16] ...
}
int main()    // trivial test
{
    locale ulocale(locale(), new Cvt_to_upper);
    cin.imbue(ulocale);
    char ch;
    while (cin>>ch) cout << ch;
}
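The body of do_in() is left as exercise §D.6[16]. As a rough sketch only, and not the book's solution, the following self-contained variant assumes a C++11 or later compiler and a one-to-one conversion done with std::toupper() from <cctype>. The helper to_upper_via_codecvt() is a hypothetical name introduced here so that the facet can be exercised directly through in() rather than through cin.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Sketch of a codecvt that uppercases characters as they are read in.
class Cvt_to_upper : public std::codecvt<char,char,std::mbstate_t> {
public:
    explicit Cvt_to_upper(std::size_t r = 0)
        : std::codecvt<char,char,std::mbstate_t>(r) { }
protected:
    // read external representation, write internal representation
    result do_in(state_type&, const char* from, const char* from_end,
                 const char*& from_next,
                 char* to, char* to_end, char*& to_next) const
    {
        from_next = from;
        to_next = to;
        while (from_next!=from_end && to_next!=to_end) {
            unsigned char c = static_cast<unsigned char>(*from_next++);
            *to_next++ = static_cast<char>(std::toupper(c));
        }
        return from_next==from_end ? ok : partial;
    }
    bool do_always_noconv() const noexcept { return false; }
};

// Hypothetical helper: push a string through the facet's in() function.
std::string to_upper_via_codecvt(const std::string& s)
{
    typedef std::codecvt<char,char,std::mbstate_t> Cvt;
    std::locale loc(std::locale::classic(), new Cvt_to_upper);
    const Cvt& cvt = std::use_facet<Cvt>(loc);

    std::string out(s.size(), '\0');
    if (s.empty()) return out;
    std::mbstate_t state = std::mbstate_t();
    const char* from_next = 0;
    char* to_next = 0;
    Cvt::result r = cvt.in(state, s.data(), s.data()+s.size(), from_next,
                           &out[0], &out[0]+out.size(), to_next);
    assert(r==Cvt::ok);    // the output buffer is as large as the input
    return out;
}
```

Because the output buffer has the same size as the input, the conversion completes in a single call; a stream buffer would instead call in() repeatedly and handle a partial result.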
A _byname version (§D.4, §D.4.1) of codecvt is provided:
template <class I, class E, class State>
class std::codecvt_byname : public codecvt<I,E,State> { /* ... */ };
D.4.7 Messages
Naturally, most end users prefer to use their native language to interact with a program. However,
we cannot provide a standard mechanism for expressing locale-specific general interactions.
Instead, the library provides a simple mechanism for keeping a locale-specific set of strings from
which a programmer can compose simple messages. In essence, messages implements a trivial
read-only database:
class std::messages_base {
public:
    typedef int catalog;    // catalog identifier type
};
template <class Ch>
class std::messages : public locale::facet, public messages_base {
public:
    typedef Ch char_type;
    typedef basic_string<Ch> string_type;
    explicit messages(size_t r = 0);
    catalog open(const basic_string<char>& fn, const locale&) const;
    string_type get(catalog c, int set, int msgid, const string_type& d) const;
    void close(catalog c) const;
    static locale::id id;    // facet identifier object (§D.2, §D.3, §D.3.1)
protected:
    ~messages();
    // virtual ‘‘do_’’ functions for public functions (see §D.4.1)
};
A call open(s,loc) opens a ‘‘catalog’’ of messages called s for the locale loc. A catalog is a set
of strings organized in an implementation-specific way and accessed through the
messages::get() function. A negative value is returned if no catalog named s can be opened. A
catalog must be opened before the first use of get().
A call close(cat) closes the catalog identified by cat and frees all resources associated with
that catalog.
A call get(cat,set,id,"foo") looks for a message identified by (set,id) in the catalog cat.
If a string is found, get() returns that string; otherwise, get() returns the default string (here,
string("foo")).
Here is an example of a messages facet for an implementation in which a message catalog is a
vector of sets of ‘‘messages’’ and a ‘‘message’’ is a string:
struct Set {
    vector<string> msgs;
};
struct Cat {
    vector<Set> sets;
};
class My_messages : public messages<char> {
    vector<Cat>& catalogs;
public:
    explicit My_messages(size_t = 0) :catalogs(*new vector<Cat>) { }
    catalog do_open(const string& s, const locale& loc) const;    // open catalog s
    string do_get(catalog c, int s, int m, const string&) const;    // get message (s,m) in c
    void do_close(catalog cat) const
    {
        // erase only if cat identifies an existing catalog
        if (cat<catalogs.size()) catalogs.erase(catalogs.begin()+cat);
    }
    ~My_messages() { delete &catalogs; }
};
All messages’ member functions are const, so the catalog data structure (the vector<Cat>) is stored
outside the facet.
A message is selected by specifying a catalog, a set within that catalog, and a message string
within that set. A string is supplied as an argument, to be used as a default result in case no message is found in the catalog:
string My_messages::do_get(catalog cat, int set, int msg, const string& def) const
{
    if (catalogs.size()<=cat) return def;
    Cat& c = catalogs[cat];
    if (c.sets.size()<=set) return def;
    Set& s = c.sets[set];
    if (s.msgs.size()<=msg) return def;
    return s.msgs[msg];
}
Opening a catalog involves reading a textual representation from disk into a Cat structure. Here, I
chose a representation that is trivial to read. A set is delimited by <<< and >>>, and each message
is a line of text:
messages<char>::catalog My_messages::do_open(const string& n, const locale& loc) const
{
    string nn = n + locale().name();
    ifstream f(nn.c_str());
    if (!f) return -1;
    catalogs.push_back(Cat());    // make in-core catalog
    Cat& c = catalogs.back();
    string s;
    while (f>>s && s=="<<<") {    // read Set
        c.sets.push_back(Set());
        Set& ss = c.sets.back();
        while (getline(f,s) && s != ">>>") ss.msgs.push_back(s);    // read message
    }
    return catalogs.size()-1;
}
Here is a trivial use:
int main()
{
    if (!has_facet< My_messages >(locale())) {
        cerr << "no messages facet found in " << locale().name() << '\n';
        exit(1);
    }
    const messages<char>& m = use_facet< My_messages >(locale());
    extern string message_directory;    // where I keep my messages
    int cat = m.open(message_directory,locale());
    if (cat<0) {
        cerr << "no catalog found\n";
        exit(1);
    }
    cout << m.get(cat,0,0,"Missed again!") << endl;
    cout << m.get(cat,1,2,"Missed again!") << endl;
    cout << m.get(cat,1,3,"Missed again!") << endl;
    cout << m.get(cat,3,0,"Missed again!") << endl;
}
If the catalog is
<<<
hello
goodbye
>>>
<<<
yes
no
maybe
>>>
this program prints
hello
maybe
Missed again!
Missed again!
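To experiment with such a facet without a message directory on disk, one possible variant keeps everything in memory. The sketch below is an illustration under stated assumptions rather than the facet shown above: Mem_messages and lookup() are hypothetical names, do_open() treats its string argument as the catalog text itself instead of a file name, and the parser reads whole lines throughout so that the line holding <<< is not mistaken for a message.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory variant of My_messages: do_open() parses the catalog
// text passed to it instead of using its argument as a file name.
class Mem_messages : public std::messages<char> {
    typedef std::vector<std::string> Set;    // a set is a vector of messages
    typedef std::vector<Set> Cat;            // a catalog is a vector of sets
    std::vector<Cat>& catalogs;              // kept outside the facet: members are const
public:
    explicit Mem_messages(std::size_t refs = 0)
        : std::messages<char>(refs), catalogs(*new std::vector<Cat>) { }
    ~Mem_messages() { delete &catalogs; }
protected:
    catalog do_open(const std::string& text, const std::locale&) const
    {
        std::istringstream f(text);
        Cat c;
        std::string s;
        while (std::getline(f,s) && s=="<<<") {                       // read Set
            Set ss;
            while (std::getline(f,s) && s!=">>>") ss.push_back(s);    // read message
            c.push_back(ss);
        }
        if (c.empty()) return -1;            // nothing usable found
        catalogs.push_back(c);
        return static_cast<catalog>(catalogs.size()-1);
    }
    std::string do_get(catalog cat, int set, int msg, const std::string& def) const
    {
        if (cat<0 || set<0 || msg<0) return def;
        if (catalogs.size()<=static_cast<std::size_t>(cat)) return def;
        const Cat& c = catalogs[static_cast<std::size_t>(cat)];
        if (c.size()<=static_cast<std::size_t>(set)) return def;
        const Set& s = c[static_cast<std::size_t>(set)];
        if (s.size()<=static_cast<std::size_t>(msg)) return def;
        return s[static_cast<std::size_t>(msg)];
    }
    void do_close(catalog) const { }    // identifiers stay valid; the destructor frees storage
};

// Hypothetical helper: look up one message through the public interface.
std::string lookup(const std::locale& loc, const std::string& text, int set, int msg)
{
    const std::messages<char>& m = std::use_facet< std::messages<char> >(loc);
    std::messages<char>::catalog cat = m.open(text,loc);
    if (cat<0) return "no catalog found";
    std::string r = m.get(cat,set,msg,"Missed again!");
    m.close(cat);
    return r;
}
```

The facet is installed with locale(locale::classic(), new Mem_messages); since it inherits messages<char>::id, use_facet< messages<char> >() on that locale finds it.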
D.4.7.1 Using Messages from Other Facets
In addition to being a repository for locale-dependent strings used to communicate with users, messages can be used to hold strings for other facets. For example, the Season_io facet (§D.3.2) could
have been written like this:
class Season_io : public locale::facet {
    const messages<char>& m;    // message directory
    int cat;    // message catalog
public:
    class Missing_messages { };
    Season_io(int i = 0)
        : locale::facet(i),
          m(use_facet<Season_messages>(locale())),
          cat(m.open(message_directory,locale()))
        { if (cat<0) throw Missing_messages(); }
    ~Season_io() { }    // to make it possible to destroy Season_io objects (§D.3)
    const string& to_str(Season x) const;    // string representation of x
    bool from_str(const string& s, Season& x) const;    // place Season corresponding to s in x
    static locale::id id;    // facet identifier object (§D.2, §D.3, §D.3.1)
};
locale::id Season_io::id;    // define the identifier object
const string& Season_io::to_str(Season x) const
{
    return m.get(cat,x,"no-such-season");    // m is a reference, so use . rather than ->
}
bool Season_io::from_str(const string& s, Season& x) const
{
    for (int i = Season::spring; i<=Season::winter; i++)
        if (m.get(cat,i,"no-such-season") == s) {
            x = Season(i);
            return true;
        }
    return false;
}
This messages-based solution differs from the original solution (§D.3.2) in that the implementer of
a set of Season strings for a new locale needs to be able to add them to a messages directory. This
is easy for someone adding a new locale to an execution environment. However, since messages
provides only a read-only interface, adding a new set of season names may be beyond the scope of
an application programmer.
A _byname version (§D.4, §D.4.1) of messages is provided:
template <class Ch>
class std::messages_byname : public messages<Ch> { /* ... */ };
D.5 Advice
[1] Expect that every nontrivial program or system that interacts directly with people will be used
in several different countries; §D.1.
[2] Don’t assume that everyone uses the same character set as you do; §D.4.1.
[3] Prefer using locales to writing ad hoc code for culture-sensitive I/O; §D.1.
[4] Avoid embedding locale name strings in program text; §D.2.1.
[5] Minimize the use of global format information; §D.2.3, §D.4.4.7.
[6] Prefer locale-sensitive string comparisons and sorts; §D.2.4, §D.4.1.
[7] Make facets immutable; §D.2.2, §D.3.
[8] Keep changes of locale to a few places in a program; §D.2.3.
[9] Let locale handle the lifetime of facets; §D.3.
[10] When writing locale-sensitive I/O functions, remember to handle exceptions from user-supplied (overriding) functions; §D.4.2.2.
[11] Use a simple Money type to hold monetary values; §D.4.3.
[12] Use simple user-defined types to hold values that require locale-sensitive I/O (rather than casting to and from values of built-in types); §D.4.3.
[13] Don’t believe timing figures until you have a good idea of all factors involved; §D.4.4.1.
[14] Be aware of the limitations of time_t; §D.4.4.1, §D.4.4.5.
[15] Use a date-input routine that accepts a range of input formats; §D.4.4.5.
[16] Prefer the character classification functions in which the locale is explicit; §D.4.5, §D.4.5.1.
D.6 Exercises
1. (∗2.5) Define a Season_io (§D.3.2) for a language other than American English.
2. (∗2) Define a Season_io (§D.3.2) class that takes a set of name strings as a constructor argument so that Season names for different locales can be represented as objects of this class.
3. (∗3) Write a collate<char>::compare() that gives dictionary order. Preferably, do this for a
language, such as German or French, that has more letters in its alphabet than English does.
4. (∗2) Write a program that reads and writes bools as numbers, as English words, and as words in
another language of your choice.
5. (∗2.5) Define a Time type for representing time of day. Define a Date_and_time type by using
Time and a Date type. Discuss the pros and cons of this approach compared to the Date from
(§D.4.4). Implement locale-sensitive I/O for Time and Date_and_time.
6. (∗2.5) Design and implement a postal code (zip code) facet. Implement it for at least two countries with dissimilar conventions for writing addresses. For example: NJ 07932 and CB21QA.
7. (∗2.5) Design and implement a phone number facet. Implement it for at least two countries
with dissimilar conventions for writing phone numbers. For example, (973) 360-8000 and
1223 343000.
8. (∗2.5) Experiment to find out what input and output formats your implementation uses for date
information.
9. (∗2.5) Define a get_time() that ‘‘guesses’’ about the meaning of ambiguous dates, such as
12/5/1995, but still rejects all or almost all mistakes. Be precise about what ‘‘guesses’’ are
accepted, and discuss the likelihood of a mistake.
10. (∗2) Define a get_time() that accepts a greater variety of input formats than the one in
§D.4.4.5.
11. (∗2) Make a list of the locales supported on your system.
12. (∗2.5) Figure out where named locales are stored on your system. If you have access to the part
of the system where locales are stored, make a new named locale. Be very careful not to break
existing locales.
13. (∗2) Compare the two Season_io implementations (§D.3.2 and §D.4.7.1).
14. (∗2) Write and test a Date_out facet that writes Dates using a format supplied as a constructor
argument. Discuss the pros and cons of this approach compared to the global date format provided by date_fmt (§D.4.4.6).
15. (∗2.5) Implement I/O of Roman numerals (such as XI and MDCLII).
16. (∗2.5) Implement and test Cvt_to_upper (§D.4.6).
17. (∗2.5) Use clock() to determine average cost of (1) a function call, (2) a virtual function call,
(3) reading a char, (4) reading a 1-digit int, (5) reading a 5-digit int, (6) reading a 5-digit double, (7) a 1-character string, (8) a 5-character string, and (9) a 40-character string.
18. (∗6.5) Learn another natural language.
Appendix E
Standard-Library Exception Safety
Everything will work just as you expect it to,
unless your expectations are incorrect.
– Hyman Rosen
Exception safety – exception-safe implementation techniques – representing resources
– assignment – push_back() – constructors and invariants – standard container
guarantees – insertion and removal of elements – guarantees and tradeoffs – swap()
– initialization and iterators – references to elements – predicates – strings, streams,
algorithms, valarray, and complex – the C standard library – implications for library
users – advice – exercises.
E.1 Introduction
Standard-library functions often invoke operations that a user supplies as function or template arguments. Naturally, some of these user-supplied operations will occasionally throw exceptions.
Other functions, such as allocator functions, can also throw exceptions. Consider:
void f(vector<X>& v, const X& g)
{
    v[2] = g;    // X’s assignment might throw an exception
    v.push_back(g);    // vector<X>’s allocator might throw an exception
    sort(v.begin(),v.end());    // X’s less-than operation might throw an exception
    vector<X> u = v;    // X’s copy constructor might throw an exception
    // ...
    // u destroyed here: we must ensure that X’s destructor can work correctly
}
What happens if the assignment throws an exception while trying to copy g? Will v be left with an
invalid element? What happens if the constructor that v.push_back() uses to copy g throws
std::bad_alloc? Has the number of elements changed? Has an invalid element been added to the
container? What happens if X’s less-than operator throws an exception during the sort? Have the
elements been partially sorted? Could an element have been removed from the container by the
sorting algorithm and not put back?
Finding the complete list of possible exceptions in this example is left as an exercise (§E.8[1]).
Explaining how this example is well behaved for every well-defined type X – even an X that throws
exceptions – is part of the aim of this appendix. Naturally, a major part of this explanation involves
giving meaning and effective terminology to the notions of ‘‘well behaved’’ and ‘‘well defined’’ in
the context of exceptions.
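One of these questions, whether a failed push_back() changes the number of elements, can be probed with a small experiment. The following is only a sketch: the element type X, its static counter copies_left, the exception type Copy_failed, and the function push_back_is_all_or_nothing() are all hypothetical names made up for the illustration, and the copy constructor is rigged to throw on demand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Copy_failed { };    // hypothetical exception type

// Hypothetical element type whose copy constructor can be made to throw.
struct X {
    static int copies_left;    // copies allowed before the next one throws
    int value;
    explicit X(int v) : value(v) { }
    X(const X& a) : value(a.value)
    {
        if (copies_left==0) throw Copy_failed();
        --copies_left;
    }
    X& operator=(const X& a) { value = a.value; return *this; }
};
int X::copies_left = 1000;

// Returns true if push_back() threw and left v exactly as it was.
bool push_back_is_all_or_nothing(std::vector<X>& v, const X& g, int copies_allowed)
{
    std::vector<int> before;    // remember the old element values
    for (std::size_t i = 0; i<v.size(); ++i) before.push_back(v[i].value);

    X::copies_left = copies_allowed;
    bool threw = false;
    try {
        v.push_back(g);
    }
    catch (Copy_failed&) {
        threw = true;
    }
    X::copies_left = 1000;    // let later copies succeed again

    if (!threw) return false;    // the experiment needs a failing copy
    if (v.size()!=before.size()) return false;
    for (std::size_t i = 0; i<v.size(); ++i)
        if (v[i].value!=before[i]) return false;
    return true;
}
```

Calling the function once with copies_allowed equal to 0, and once on a full vector with copies_allowed equal to 1 so that the failure happens during reallocation, exercises both paths through push_back().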
The purpose of this appendix is to
[1] identify how a user can design types that meet the standard library’s requirements,
[2] state the guarantees offered by the standard library,
[3] state the standard-library requirements on user-supplied code,
[4] demonstrate effective techniques for crafting exception-safe and efficient containers, and
[5] present a few general rules for exception-safe programming.
The discussion of exception safety necessarily focuses on worst-case behavior. That is, where
could an exception cause the most problems? How does the standard library protect itself and its
users from potential problems? And, how can users help prevent problems? Please don’t let this
discussion of exception-handling techniques distract from the central fact that throwing an exception is the best method for reporting an error (§14.1, §14.9). The discussion of concepts, techniques, and standard-library guarantees is organized like this:
§E.2 discusses the notion of exception safety.
§E.3 presents techniques for implementing efficient exception-safe containers and operations.
§E.4 outlines the guarantees offered for standard-library containers and their operations.
§E.5 summarizes exception-safety issues for the non-container parts of the standard library.
§E.6 reviews exception safety from the point of view of a standard-library user.
As ever, the standard library provides examples of the kinds of concerns that must be addressed in
demanding applications. The techniques used to provide exception safety for the standard library
can be applied to a wide range of problems.
E.2 Exception Safety
An operation on an object is said to be exception safe if that operation leaves the object in a valid
state when the operation is terminated by throwing an exception. This valid state could be an error
state requiring cleanup, but it must be well defined so that reasonable error-handling code can be
written for the object. For example, an exception handler might destroy the object, repair the
object, repeat a variant of the operation, just carry on, etc.
In other words, the object will have an invariant (§24.3.7.1), its constructors will establish that
invariant, all further operations maintain that invariant even if an exception is thrown, and its
destructor will do final cleanup. An operation should take care that the invariant is maintained
before throwing an exception, so that the object is in a valid state. However, it is quite possible for
that valid state to be one that doesn’t suit the application. For example, a string may have been left
as the empty string or a container may have been left unsorted. Thus, ‘‘repair’’ means giving an
object a value that is more appropriate/desirable for the application than the one it was left with
after an operation failed. In the context of the standard library, the most interesting objects are containers.
Here, we consider under which conditions operations on standard-library containers can be considered exception safe. There can be only two conceptually really simple strategies:
[1] ‘‘No guarantees:’’ If an exception is thrown, any container being manipulated is possibly
corrupted.
[2] ‘‘Strong guarantee:’’ If an exception is thrown, any container being manipulated remains in
the state in which it was before the standard-library operation started.
Unfortunately, both answers are too simple for real use. Alternative [1] is unacceptable because it
implies that after an exception is thrown from a container operation, the container cannot be
accessed; it can’t even be destroyed without fear of run-time errors. Alternative [2] is unacceptable
because it imposes the cost of roll-back semantics on every individual standard-library operation.
To resolve this dilemma, the C++ standard library provides a set of exception-safety guarantees
that share the burden of producing correct programs between implementers of the standard library
and users of the standard library:
[3a] ‘‘Basic guarantee for all operations:’’ The basic invariants of the standard library are
maintained, and no resources, such as memory, are leaked.
[3b] ‘‘Strong guarantee for key operations:’’ In addition to providing the basic guarantee, either
the operation succeeds, or has no effects. This guarantee is provided for key library operations, such as push_back(), single-element insert() on a list, and uninitialized_copy()
(§E.3.1, §E.4.1).
[3c] ‘‘Nothrow guarantee for some operations:’’ In addition to providing the basic guarantee,
some operations are guaranteed not to throw an exception. This guarantee is provided for a
few simple operations, such as swap() and pop_back() (§E.4.1).
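The division of labor among [3a], [3b], and [3c] can be illustrated from the user's side. When an operation offers only the basic guarantee, such as inserting several elements into a vector, a user who needs all-or-nothing behavior can pay for it explicitly: do the work on a copy and commit with the non-throwing swap(). The sketch below assumes nothing beyond the standard containers; append_all_or_nothing(), Item, and Copy_failed are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <vector>

struct Copy_failed { };    // hypothetical exception type

// Hypothetical element type whose copy constructor can be made to throw.
struct Item {
    static int copies_left;    // copies allowed before the next one throws
    int value;
    explicit Item(int v) : value(v) { }
    Item(const Item& a) : value(a.value)
    {
        if (copies_left==0) throw Copy_failed();
        --copies_left;
    }
    Item& operator=(const Item& a) { value = a.value; return *this; }
};
int Item::copies_left = 1000;

// Append [first,last) to c with the strong guarantee: everything that might
// throw is done on a copy, and the result is committed with swap().
template<class C, class In>
void append_all_or_nothing(C& c, In first, In last)
{
    C tmp(c);                            // may throw; c is untouched
    tmp.insert(tmp.end(),first,last);    // may throw; only tmp is affected
    assert(tmp.size()>=c.size());
    c.swap(tmp);                         // commit
}
```

The price is one extra copy of the whole container, which is exactly the roll-back cost that the library declines to impose on every operation.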
Both the basic guarantee and the strong guarantee are provided on the condition that user-supplied
operations (such as assignments and swap() functions) do not leave container elements in invalid
states, that user-supplied operations do not leak resources, and that destructors do not throw exceptions. For example, consider these ‘‘handle-like’’ (§25.7) classes:
template<class T> class Safe {
    T* p;    // p points to a T allocated using new
public:
    Safe() :p(new T) { }
    ~Safe() { delete p; }
    Safe& operator=(const Safe& a) { *p = *a.p; return *this; }
    // ...
};
template<class T> class Unsafe {    // sloppy and dangerous code
    T* p;    // p points to a T
public:
    Unsafe(T* pp) :p(pp) { }
    ~Unsafe() { if (!p->destructible()) throw E(); delete p; }
    Unsafe& operator=(const Unsafe& a)
    {
        p->~T();    // destroy old value (§10.4.11)
        new(p) T(a.p);    // construct copy of a.p in *p (§10.4.11)
        return *this;
    }
    // ...
};
void f(vector< Safe<Some_type> >&vg, vector< Unsafe<Some_type> >&vb)
{
    vg.at(1) = Safe<Some_type>();
    vb.at(1) = Unsafe<Some_type>(new Some_type);
    // ...
}
In this example, construction of a Safe succeeds only if a T is successfully constructed. The construction of a T can fail because allocation might fail (and throw std::bad_alloc) and because T’s
constructor might throw an exception. However, in every successfully constructed Safe, p will
point to a successfully constructed T; if a constructor fails, no T object (or Safe object) is created.
Similarly, T’s assignment operator may throw an exception, causing Safe’s assignment operator to
implicitly re-throw that exception. However, that is no problem as long as T’s assignment operator
always leaves its operands in a good state. Therefore, Safe is well behaved, and consequently every
standard-library operation on a Safe will have a reasonable and well-defined result.
On the other hand, Unsafe() is carelessly written (or rather, it is carefully written to demonstrate undesirable behavior). The construction of an Unsafe will not fail. Instead, the operations
on Unsafe, such as assignment and destruction, are left to deal with a variety of potential problems.
The assignment operator may fail by throwing an exception from T’s copy constructor. This would
leave a T in an undefined state because the old value of *p was destroyed and no new value
replaced it. In general, the results of that are unpredictable. Unsafe’s destructor contains an ill-conceived attempt to protect against undesirable destruction. However, throwing an exception during exception handling will cause a call of terminate() (§14.7), and the standard library requires
that a destructor return normally after destroying an object. The standard library does not – and
cannot – make any guarantees when a user supplies objects this badly behaved.
From the point of view of exception handling, Safe and Unsafe differ in that Safe uses its constructor to establish an invariant (§24.3.7.1) that allows its operations to be implemented simply
and safely. If that invariant cannot be established, an exception is thrown before an invalid object
is constructed. Unsafe, on the other hand, muddles along without a meaningful invariant, and the
individual operations throw exceptions without an overall error-handling strategy. Naturally, this
results in violations of the standard library’s (reasonable) assumptions about the behavior of types.
For example, Unsafe can leave invalid elements in a container after throwing an exception from
T::operator=() and may throw an exception from its destructor.
Note that the standard-library guarantees relative to ill-behaved user-supplied operations are
analogous to the language guarantees relative to violations of the basic type system. If a basic
operation is not used according to its specification, the resulting behavior is undefined. For
example, if you throw an exception from a destructor for a vector element, you have no more reason to hope for a reasonable result than if you dereference a pointer initialized to a random number:
class Bomb {
public:
    // ...
    ~Bomb() { throw Trouble(); };
};
vector<Bomb> b(10);    // leads to undefined behavior
void f()
{
    int* p = reinterpret_cast<int*>(rand());    // leads to undefined behavior
    *p = 7;
}
Stated positively: If you obey the basic rules of the language and the standard library, the library
will behave well even when you throw exceptions.
In addition to achieving pure exception safety, we usually prefer to avoid resource leaks. That
is, an operation that throws an exception should not only leave its operands in well-defined states
but also ensure that every resource that it acquired is (eventually) released. For example, at the
point where an exception is thrown, all memory allocated must be either deallocated or owned by
some object, which in turn must ensure that the memory is properly deallocated.
The standard-library guarantees the absence of resource leaks provided that user-supplied operations called by the library also avoid resource leaks. Consider:
void leak(bool abort)
{
    vector<int> v(10);    // no leak
    vector<int>* p = new vector<int>(10);    // potential memory leak
    auto_ptr< vector<int> > q(new vector<int>(10));    // no leak (§14.4.2)
    if (abort) throw Up();
    // ...
    delete p;
}
Upon throwing the exception, the vector called v and the vector held by q will be correctly
destroyed so that their resources are released. The vector pointed to by p is not guarded against
exceptions and will not be destroyed. To make this piece of code safe, we must either explicitly
delete p before throwing the exception or make sure it is owned by an object – such as an auto_ptr
(§14.4.2) – that will properly destroy it if an exception is thrown.
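With compilers newer than this text, the same ownership idea is usually expressed with std::unique_ptr, because std::auto_ptr was deprecated in C++11 and removed in C++17. The sketch below swaps in unique_ptr for that reason and assumes a C++11 or later compiler. Tracked, no_leak(), and live_after() are hypothetical names; Tracked merely counts live objects so that the absence of a leak can be observed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Up { };    // exception type, standing in for the one thrown by leak()

// Hypothetical element type that counts live objects so leaks can be observed.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

void no_leak(bool abort)
{
    std::vector<Tracked> v(10);                      // no leak: destroyed on exit
    std::unique_ptr< std::vector<Tracked> > q(
        new std::vector<Tracked>(10));               // no leak: owned by q
    assert(Tracked::live==20);
    if (abort) throw Up();
    // ...
}

// Returns the number of Tracked objects still alive after no_leak() exits.
int live_after(bool abort)
{
    try {
        no_leak(abort);
    }
    catch (Up&) {
        // the exception is expected when abort is true
    }
    return Tracked::live;
}
```

Both exits from no_leak(), the normal return and the throw, leave the count of live objects at zero, because every vector is owned by an object with a destructor.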
Note that the language rules for partial construction and destruction ensure that exceptions
thrown while constructing sub-objects and members will be handled correctly without special attention from standard-library code (§14.4.1). This rule is an essential underpinning for all techniques
dealing with exceptions.
Also, remember that memory isn’t the only kind of resource that can leak. Opened files, locks,
network connections, and threads are examples of system resources that a function may have to
release or hand over to an object before throwing an exception.
E.3 Exception-Safe Implementation Techniques
As usual, the standard library provides examples of problems that occur in many other contexts and
of solutions that apply widely. The basic tools available for writing exception-safe code are
[1] the try-block (§8.3.1), and
[2] the support for the ‘‘resource acquisition is initialization’’ technique (§14.4).
The general principles to follow are to
[3] never let go of a piece of information before we can store its replacement, and
[4] always leave objects in valid states when throwing or re-throwing an exception.
That way, we can always back out of an error situation. The practical difficulty in following these
principles is that innocent-looking operations (such as <, =, and sort()) might throw exceptions.
Knowing what to look for in an application takes experience.
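Principle [3] can be made concrete with an assignment operator for a handle-like class. In the sketch below, the replacement value is constructed before the old one is released, so an exception from the copy leaves the target unchanged. Handle, Fragile, and failed_assignment_keeps_old_value() are hypothetical names used only for this illustration, and destructors are assumed not to throw.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical value type whose copy constructor throws on request.
struct Fragile {
    static bool fail;    // when true, the next copy throws
    int n;
    explicit Fragile(int nn) : n(nn) { }
    Fragile(const Fragile& a) : n(a.n)
    {
        if (fail) throw std::runtime_error("Fragile: copy failed");
    }
};
bool Fragile::fail = false;

template<class T> class Handle {
    T* p;    // invariant: p points to a T allocated using new
public:
    explicit Handle(const T& val) : p(new T(val)) { }
    Handle(const Handle& a) : p(new T(*a.p)) { }
    ~Handle() { delete p; }

    Handle& operator=(const Handle& a)
    {
        T* tmp = new T(*a.p);    // acquire the replacement first; may throw
        delete p;                // let go of the old value only now
        p = tmp;
        return *this;
    }
    const T& get() const { return *p; }
};

// Returns true if a failed assignment left the target holding its old value.
bool failed_assignment_keeps_old_value()
{
    Handle<Fragile> a(Fragile(1));
    Handle<Fragile> b(Fragile(2));
    Fragile::fail = true;
    bool threw = false;
    try {
        a = b;
    }
    catch (std::runtime_error&) {
        threw = true;
    }
    Fragile::fail = false;
    return threw && a.get().n==1 && b.get().n==2;
}
```

The ordering also makes self-assignment harmless, since the copy is taken before the old value is deleted.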
When you are writing a library, the ideal is to aim at the strong exception-safety guarantee
(§E.2) and always to provide the basic guarantee. When writing a specific program, there may be
less concern for exception safety. For example, if I write a simple data analysis program for my
own use, I’m usually quite willing to have the program terminate in the unlikely event of virtual
memory exhaustion. However, correctness and basic exception safety are closely related.
The techniques for providing basic exception safety, such as defining and checking invariants
(§24.3.7.1), are similar to the techniques that are useful to get a program small and correct. It follows that the overhead of providing basic exception safety (the basic guarantee; §E.2) – or even the
strong guarantee – can be minimal or even insignificant; see §E.8[17].
Here, I will consider an implementation of the standard container vector (§16.3) to see what it
takes to achieve that ideal and where we might prefer to settle for more conditional safety.
E.3.1 A Simple Vector
A typical implementation of vveeccttoorr (§16.3) will consist of a handle holding pointers to the first element, one-past-the-last element, and one-past-the-last allocated space (§17.1.3) (or the equivalent
information represented as a pointer plus offsets):
[Figure: a vector handle holding the pointers first, space, and last. first points to the start of the elements, space to the end of the elements (the start of the extra space), and last to the end of the extra space.]
Here is a declaration of vector simplified to present only what is needed to discuss exception safety
and avoidance of resource leaks:
    template<class T, class A = allocator<T> >
    class vector {
    private:
        T* v;        // start of allocation
        T* space;    // end of element sequence, start of space allocated for possible expansion
        T* last;     // end of allocated space
        A alloc;     // allocator
The C++ Programming Language, Special Edition by Bjarne Stroustrup. Copyright 2000 by AT&T.
Published by Addison Wesley Inc. ISBN 0-201-70073-5. All rights reserved.
    public:
        explicit vector(size_type n, const T& val = T(), const A& = A());
        vector(const vector& a);               // copy constructor
        vector& operator=(const vector& a);    // copy assignment
        ~vector();
        size_type size() const { return space-v; }
        size_type capacity() const { return last-v; }
        void push_back(const T&);
        // ...
    };
Consider first a naive implementation of a constructor:
    template<class T, class A>
    vector<T,A>::vector(size_type n, const T& val, const A& a)    // warning: naive implementation
        :alloc(a)                     // copy the allocator
    {
        v = alloc.allocate(n);        // get memory for elements (§19.4.1)
        space = last = v+n;
        for (T* p = v; p!=last; ++p) alloc.construct(p,val);    // construct copy of val in *p (§19.4.1)
    }
There are three sources of exceptions here:
[1] allocate() throws an exception indicating that no memory is available;
[2] the allocator’s copy constructor throws an exception;
[3] the copy constructor for the element type T throws an exception because it can’t copy val.
In all cases, no object is created, so vector’s destructor is not called (§14.4.1).
When allocate() fails, the throw will exit before any resources are acquired, so all is well.
When T’s copy constructor fails, we have acquired some memory that must be freed to avoid
memory leaks. A more difficult problem is that the copy constructor for T might throw an exception after correctly constructing a few elements but before constructing them all.
To handle this problem, we could keep track of which elements have been constructed and
destroy those (and only those) in case of an error:
    template<class T, class A>
    vector<T,A>::vector(size_type n, const T& val, const A& a)    // elaborate implementation
        :alloc(a)                     // copy the allocator
    {
        v = alloc.allocate(n);        // get memory for elements
        iterator p;
        try {
            iterator end = v+n;
            for (p=v; p!=end; ++p) alloc.construct(p,val);    // construct element (§19.4.1)
            last = space = p;
        }
        catch (...) {
            for (iterator q = v; q!=p; ++q) alloc.destroy(q);    // destroy constructed elements
            alloc.deallocate(v,n);    // free memory
            throw;                    // re-throw
        }
    }
The overhead here is the overhead of the try-block. In a good C++ implementation, this overhead is
negligible compared to the cost of allocating memory and initializing elements. For implementations where entering a try-block incurs a cost, it may be worthwhile to add a test if(n) before the
try and handle the empty vector case separately.
The main part of this constructor is an exception-safe implementation of uninitialized_fill():
    template<class For, class T>
    void uninitialized_fill(For beg, For end, const T& x)
    {
        For p;
        try {
            for (p=beg; p!=end; ++p)
                new(static_cast<void*>(&*p)) T(x);    // construct copy of x in *p (§10.4.11)
        }
        catch (...) {    // destroy constructed elements and rethrow:
            for (For q = beg; q!=p; ++q) (&*q)->~T();    // (§10.4.11)
            throw;
        }
    }
The curious construct &*p takes care of iterators that are not pointers. In that case, we need to take
the address of the element obtained by dereference to get a pointer. The explicit cast to void*
ensures that the standard library placement function is used (§19.4.5), and not some user-defined
operator new() for T*s. This code is operating at a rather low level where writing truly general
code can be difficult.
Fortunately, we don’t have to reimplement uninitialized_fill(), because the standard library
provides the desired strong guarantee for it (§E.2). It is often essential to have initialization operations that either complete successfully, having initialized every element, or fail leaving no constructed elements behind. Consequently, the standard-library algorithms uninitialized_fill(),
uninitialized_fill_n(), and uninitialized_copy() (§19.4.4) are guaranteed to have this strong
exception-safety property (§E.4.4).
Note that the uninitialized_fill() algorithm does not protect against exceptions thrown by element destructors or iterator operations (§E.4.4). Doing so would be prohibitively expensive (see
§E.8[16-17]).
The uninitialized_fill() algorithm can be applied to many kinds of sequences. Consequently,
it takes a forward iterator (§19.2.1) and cannot guarantee to destroy elements in the reverse order of
their construction.
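The all-or-nothing behavior can be observed with a small experiment. The following is a minimal sketch, not part of the standard library: the element type Flaky and the function live_after_fill() are hypothetical names introduced here for illustration. Flaky counts its live objects and throws from its copy constructor once a preset budget of copies is used up.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <stdexcept>

// Hypothetical element type (illustration only): counts live objects and
// throws from its copy constructor once the copy budget is exhausted.
struct Flaky {
    static int live;      // objects currently constructed
    static int budget;    // copies allowed before the next copy throws
    Flaky() { ++live; }
    Flaky(const Flaky&)
    {
        if (budget == 0) throw std::runtime_error("copy failed");
        --budget;
        ++live;
    }
    ~Flaky() { --live; }
};
int Flaky::live = 0;
int Flaky::budget = 0;

// Try to fill n raw slots with copies of a prototype, allowing only
// copies_allowed successful copies. Returns the number of Flaky objects
// that exist in the raw memory immediately after the attempt.
int live_after_fill(std::size_t n, int copies_allowed)
{
    Flaky proto;                          // one live object outside the raw memory
    Flaky::budget = copies_allowed;
    Flaky* raw = static_cast<Flaky*>(::operator new(n * sizeof(Flaky)));
    int in_raw = 0;
    try {
        std::uninitialized_fill(raw, raw + n, proto);
        in_raw = Flaky::live - 1;         // exclude proto
        for (Flaky* p = raw; p != raw + n; ++p) p->~Flaky();    // normal cleanup
    }
    catch (const std::runtime_error&) {
        in_raw = Flaky::live - 1;         // the algorithm has already destroyed its partial work
    }
    ::operator delete(raw);
    return in_raw;
}
```

With a budget of five copies, live_after_fill(5,5) reports five constructed elements. With a budget of three, the fourth copy throws and the function reports zero, because the three elements constructed before the failure have been destroyed by the algorithm.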
Using uninitialized_fill(), we can write:
    template<class T, class A>
    vector<T,A>::vector(size_type n, const T& val, const A& a)    // messy implementation
        :alloc(a)                     // copy the allocator
    {
        v = alloc.allocate(n);        // get memory for elements
        try {
            uninitialized_fill(v,v+n,val);    // copy elements
            space = last = v+n;
        }
        catch (...) {
            alloc.deallocate(v,n);    // free memory
            throw;                    // re-throw
        }
    }
However, I wouldn’t call that pretty code. The next section will demonstrate how it can be made
much simpler.
Note that the constructor re-throws a caught exception. The intent is to make vector transparent
to exceptions so that the user can determine the exact cause of a problem. All standard-library containers have this property. Exception transparency is often the best policy for templates and other
‘‘thin’’ layers of software. This is in contrast to major parts of a system (‘‘modules’’) that generally need to take responsibility for all exceptions thrown. That is, the implementer of such a module must be able to list every exception that the module can throw. Achieving this may involve
grouping exceptions (§14.2), mapping exceptions from lower-level routines into the module’s own
exceptions (§14.6.3), or exception specification (§14.6).
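The contrast between a transparent ‘‘thin’’ layer and a module boundary can be sketched as follows. This is an illustration under assumptions: Config_error, nth(), lookup_setting(), and outcome() are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical module-level exception (illustration only).
struct Config_error : std::runtime_error {
    explicit Config_error(const std::string& s) : std::runtime_error(s) {}
};

// "Thin" layer: transparent to exceptions. Whatever at() throws
// (std::out_of_range) reaches the caller unchanged.
template<class C>
typename C::value_type nth(const C& c, typename C::size_type i)
{
    return c.at(i);
}

// Module boundary: maps a lower-level exception into the module's own,
// so that the module's users need to know about Config_error only.
int lookup_setting(const std::vector<int>& settings, std::size_t i)
{
    try {
        return nth(settings, i);
    }
    catch (const std::out_of_range&) {
        throw Config_error("no such setting");
    }
}

// Report what a lookup produced: 0 = value, 1 = Config_error, 2 = anything else.
int outcome(const std::vector<int>& settings, std::size_t i)
{
    try {
        lookup_setting(settings, i);
        return 0;
    }
    catch (const Config_error&) { return 1; }
    catch (...) { return 2; }
}
```

A caller of lookup_setting() never sees std::out_of_range. A caller of nth() does, and can therefore determine the exact cause of the problem.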
E.3.2 Representing Memory Explicitly
Experience revealed that writing correct exception-safe code using explicit try-blocks is more difficult than most people expect. In fact, it is unnecessarily difficult because there is an alternative:
The ‘‘resource acquisition is initialization’’ technique (§14.4) can be used to reduce the amount of
code needing to be written and to make the code more stylized. In this case, the key resource
required by the vector is memory to hold its elements. By providing an auxiliary class to represent
the notion of memory used by a vector, we can simplify the code and decrease the chance of accidentally forgetting to release it:
    template<class T, class A = allocator<T> >
    struct vector_base {
        A alloc;     // allocator
        T* v;        // start of allocation
        T* space;    // end of element sequence, start of space allocated for possible expansion
        T* last;     // end of allocated space

        vector_base(const A& a, typename A::size_type n)
            : alloc(a), v(alloc.allocate(n)), space(v+n), last(v+n) { }    // allocate through the copy: a is const
        ~vector_base() { alloc.deallocate(v,last-v); }
    };
As long as v and last are correct, vector_base can be destroyed. Class vector_base deals with
memory for a type T, not objects of type T. Consequently, a user of vector_base must destroy all
constructed objects in a vector_base before the vector_base itself is destroyed.
Naturally, vector_base itself is written so that if an exception is thrown (by the allocator’s copy
constructor or allocate() function) no vector_base object is created and no memory is leaked.
We want to be able to swap() vector_bases. However, the default swap() doesn’t suit our
needs because it copies and destroys a temporary. Because vector_base is a special-purpose class
that wasn’t given fool-proof copy semantics, that destruction would lead to undesirable side effects. Consequently, we provide a specialization:
    template<class T> void swap(vector_base<T>& a, vector_base<T>& b)
    {
        swap(a.alloc,b.alloc); swap(a.v,b.v); swap(a.space,b.space); swap(a.last,b.last);
    }
Given vector_base, vector can be defined like this:
    template<class T, class A = allocator<T> >
    class vector : private vector_base<T,A> {
        void destroy_elements() { for (T* p = v; p!=space; ++p) p->~T(); }    // §10.4.11
    public:
        explicit vector(size_type n, const T& val = T(), const A& = A());
        vector(const vector& a);               // copy constructor
        vector& operator=(const vector& a);    // copy assignment
        ~vector() { destroy_elements(); }
        size_type size() const { return space-v; }
        size_type capacity() const { return last-v; }
        void push_back(const T&);
        // ...
    };
The vector destructor explicitly invokes the T destructor for every element. This implies that if an
element destructor throws an exception, the vector destruction fails. This can be a disaster if it happens during stack unwinding caused by an exception and terminate() is called (§14.7). In the case
of normal destruction, throwing an exception from a destructor typically leads to resource leaks and
unpredictable behavior of code relying on reasonable behavior of objects. There is no really good
way to protect against exceptions thrown from destructors, so the library makes no guarantees if an
element destructor throws (§E.4).
Now the constructor can be simply defined:
    template<class T, class A>
    vector<T,A>::vector(size_type n, const T& val, const A& a)
        :vector_base<T,A>(a,n)              // allocate space for n elements
    {
        uninitialized_fill(v,v+n,val);      // copy elements
    }
The copy constructor differs by using uninitialized_copy() instead of uninitialized_fill():
    template<class T, class A>
    vector<T,A>::vector(const vector<T,A>& a)
        :vector_base<T,A>(a.alloc,a.size())
    {
        uninitialized_copy(a.begin(),a.end(),v);
    }
Note that this style of constructor relies on the fundamental language rule that when an exception is
thrown from a constructor, sub-objects (such as bases) that have already been completely constructed will be properly destroyed (§14.4.1). The uninitialized_fill() algorithm and its cousins
(§E.4.4) provide the equivalent guarantee for partially constructed sequences.
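The language rule can be demonstrated in isolation. The sketch below uses hypothetical classes, Storage in the role of vector_base and Widget in the role of vector, with counters added only so that the effect is observable. The names are assumptions made for this example.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource handle in the role of vector_base: it acquires in
// its constructor and releases in its destructor. The counters exist only
// to make the behavior observable.
struct Storage {
    static int acquired;
    static int released;
    Storage() { ++acquired; }
    ~Storage() { ++released; }
};
int Storage::acquired = 0;
int Storage::released = 0;

// Hypothetical class in the role of vector: its base is completely
// constructed before the constructor body runs.
class Widget : private Storage {
public:
    explicit Widget(bool fail)
    {
        if (fail) throw std::runtime_error("element construction failed");
    }
};

// Returns true if a failed Widget construction released exactly the one
// Storage it had acquired.
bool base_released_after_failure()
{
    const int before = Storage::released;
    try {
        Widget w(true);    // throws from the constructor body
    }
    catch (const std::runtime_error&) {
        // No Widget ever existed, so ~Widget() did not run; ~Storage() did.
    }
    return Storage::released == before + 1
        && Storage::acquired == Storage::released;
}
```

No try-block appears in Widget's constructor. The release of the resource is done by the destructor of the completely constructed base, exactly as the memory of a vector is released by the vector_base destructor.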
E.3.3 Assignment
As usual, assignment differs from construction in that an old value must be taken care of. Consider
a straightforward implementation:
    template<class T, class A>
    vector<T,A>& vector<T,A>::operator=(const vector& a)    // offers the strong guarantee (§E.2)
    {
        vector_base<T,A> b(alloc,a.size());            // get memory
        uninitialized_copy(a.begin(),a.end(),b.v);     // copy elements
        destroy_elements();
        alloc.deallocate(v,last-v);                    // free old memory
        vector_base::operator=(b);                     // install new representation
        b.v = 0;                                       // prevent deallocation
        return *this;
    }
This assignment is safe, but it repeats a lot of code from constructors and destructors. To avoid
this, we could write:
    template<class T, class A>
    vector<T,A>& vector<T,A>::operator=(const vector& a)    // offers the strong guarantee (§E.2)
    {
        vector temp(a);                          // copy a
        swap< vector_base<T,A> >(*this,temp);    // swap representations
        return *this;
    }
The old elements are destroyed by temp’s destructor, and the memory used to hold them is deallocated by temp’s vector_base’s destructor.
The performance of the two versions ought to be equivalent. Essentially, they are just two different ways of specifying the same set of operations. However, the second implementation is
shorter and doesn’t replicate code from related vector functions, so writing the assignment that way
ought to be less error prone and lead to simpler maintenance.
Note the absence of the traditional test for self-assignment (§10.4.4). These assignment implementations work by first constructing a copy and then swapping representations. This obviously
handles self-assignment correctly. I decided that the efficiency gained from the test in the rare case
of self-assignment was more than offset by its cost in the common case where a different vector is
assigned.
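The copy-then-swap technique is not specific to vector. As a minimal sketch, here it is applied to a small hypothetical class, Int_array, which is an assumption made for illustration and not a class from this book or the standard library:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical class (illustration only) that owns an array of int.
class Int_array {
    std::size_t n;
    int* p;
public:
    explicit Int_array(std::size_t sz, int val = 0) : n(sz), p(new int[sz])
    {
        std::fill(p, p + n, val);
    }
    Int_array(const Int_array& a) : n(a.n), p(new int[a.n])
    {
        std::copy(a.p, a.p + n, p);
    }
    ~Int_array() { delete[] p; }

    void swap(Int_array& a)    // cannot throw: only built-in types are swapped
    {
        std::swap(n, a.n);
        std::swap(p, a.p);
    }

    Int_array& operator=(const Int_array& a)    // note: no test for self-assignment
    {
        Int_array temp(a);     // may throw; *this is still untouched
        swap(temp);            // commit by swapping representations
        return *this;          // temp's destructor releases the old array
    }

    std::size_t size() const { return n; }
    int operator[](std::size_t i) const { return p[i]; }
};
```

Assigning an Int_array to itself makes a copy, swaps it in, and destroys the original representation. That costs an unnecessary allocation, but it yields the right value.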
In either case, two potentially significant optimizations are missing:
[1] If the capacity of the vector assigned to is large enough to hold the assigned vector, we don’t
need to allocate new memory.
[2] An element assignment may be more efficient than an element destruction followed by an
element construction.
Implementing these optimizations, we get:
    template<class T, class A>
    vector<T,A>& vector<T,A>::operator=(const vector& a)    // optimized, basic guarantee (§E.2)
    {
        if (capacity() < a.size()) {    // allocate new vector representation:
            vector temp(a);                          // copy a
            swap< vector_base<T,A> >(*this,temp);    // swap representations
            return *this;
        }

        if (this == &a) return *this;    // protect against self assignment (§10.4.4)

        // assign to old elements:
        size_type sz = size();
        size_type asz = a.size();
        alloc = a.get_allocator();       // copy the allocator
        if (asz<=sz) {
            copy(a.begin(),a.begin()+asz,v);
            for (T* p = v+asz; p!=space; ++p) p->~T();    // destroy surplus elements (§10.4.11)
        }
        else {
            copy(a.begin(),a.begin()+sz,v);
            uninitialized_copy(a.begin()+sz,a.end(),space);    // construct extra elements
        }
        space = v+asz;
        return *this;
    }
These optimizations are not free. The copy() algorithm (§18.6.1) does not offer the strong
exception-safety guarantee. It does not guarantee that it will leave its target unchanged if an exception is thrown during copying. Thus, if T::operator=() throws an exception during copy(), the
vector being assigned to need not be a copy of the vector being assigned, and it need not be
unchanged. For example, the first five elements might be copies of elements of the assigned vector
and the rest unchanged. It is also plausible that an element – the element that was being copied
when T::operator=() threw an exception – ends up with a value that is neither the old value nor a
copy of the corresponding element in the vector being assigned. However, if T::operator=()
leaves its operands in a valid state if it throws an exception, the vector is still in a valid state – even
if it wasn’t the state we would have preferred.
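A partially assigned target is easy to provoke. In the following sketch, Item and values_after_copy() are hypothetical names introduced for this example. Item's assignment throws once a preset budget of assignments is used up, and the function reports what the target holds afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical element type (illustration only) whose assignment can fail.
struct Item {
    static int budget;    // assignments allowed before the next one throws
    int value;
    explicit Item(int v = 0) : value(v) {}
    Item(const Item& x) : value(x.value) {}
    Item& operator=(const Item& x)
    {
        if (budget == 0) throw std::runtime_error("assignment failed");
        --budget;
        value = x.value;
        return *this;
    }
};
int Item::budget = 0;

// Copy four 1s over four 0s, allowing only `allowed` element assignments.
// Returns the values held by the target, whether or not copy() completed.
std::vector<int> values_after_copy(int allowed)
{
    std::vector<Item> src(4, Item(1));    // 1 1 1 1
    std::vector<Item> dst(4, Item(0));    // 0 0 0 0
    Item::budget = allowed;
    try {
        std::copy(src.begin(), src.end(), dst.begin());
    }
    catch (const std::runtime_error&) {
        // dst is valid, but it is neither its old value nor a copy of src
    }
    std::vector<int> result;
    for (std::size_t i = 0; i != dst.size(); ++i) result.push_back(dst[i].value);
    return result;
}
```

With a budget of two assignments, the target ends up holding 1 1 0 0: every element is valid, so the basic guarantee is met, but the sequence as a whole is in neither the old state nor the intended one.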
Here, I have copied the allocator using an assignment. It is actually not required that every allocator support assignment (§19.4.3); see also §E.8[9].
The standard-library vector assignment offers the weaker exception-safety property of this last
implementation – and its potential performance advantages. That is, vector assignment provides
the basic guarantee, so it meets most people’s idea of exception safety. However, it does not provide the strong guarantee (§E.2). If you need an assignment that leaves the vector unchanged if an
exception is thrown, you must either use a library implementation that provides the strong guarantee or provide your own assignment operation. For example:
    template<class T, class A>
    void safe_assign(vector<T,A>& a, const vector<T,A>& b)    // "obvious" a = b
    {
        vector<T,A> temp(a.get_allocator());
        temp.reserve(b.size());
        for (typename vector<T,A>::const_iterator p = b.begin(); p!=b.end(); ++p)    // b is const
            temp.push_back(*p);
        swap(a,temp);
    }
If there is insufficient memory for temp to be created with room for b.size() elements,
std::bad_alloc is thrown before any changes are made to a. Similarly, if push_back() fails for
any reason, a will remain untouched because we apply push_back() to temp rather than to a. In
that case, any elements of temp created by push_back() will be destroyed before the exception
that caused the failure is re-thrown.
Swap does not copy vector elements. It simply swaps the data members of a vector; that is, it
swaps vector_bases (§E.3.2). Consequently, it does not throw exceptions even if operations on the
elements might (§E.4.3). Consequently, safe_assign() does not do spurious copies of elements
and is reasonably efficient.
As is often the case, there are alternatives to the obvious implementation. We can let the library
perform the copy into the temporary for us:
    template<class T, class A>
    void safe_assign(vector<T,A>& a, const vector<T,A>& b)    // simple a = b
    {
        vector<T,A> temp(b);    // copy the elements of b into a temporary
        swap(a,temp);
    }
Indeed, we could simply use call-by-value (§7.2):
    template<class T, class A>
    void safe_assign(vector<T,A>& a, vector<T,A> b)    // simple a = b (note: b is passed by value)
    {
        swap(a,b);
    }
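To see that the target really is left untouched, safe_assign() can be tried on std::vector with an element type that refuses to be copied on demand. This is a sketch under assumptions: Fragile and target_survives_failure() are hypothetical names, and the member swap() of std::vector is used in place of the non-member call.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical element type (illustration only): copying throws while
// Fragile::fail is set.
struct Fragile {
    static bool fail;
    int value;
    explicit Fragile(int v) : value(v) {}
    Fragile(const Fragile& x) : value(x.value)
    {
        if (fail) throw std::runtime_error("copy failed");
    }
    Fragile& operator=(const Fragile& x) { value = x.value; return *this; }
};
bool Fragile::fail = false;

template<class T, class A>
void safe_assign(std::vector<T,A>& a, const std::vector<T,A>& b)
{
    std::vector<T,A> temp(b);    // all element copying happens here and may throw
    a.swap(temp);                // swaps representations; copies no elements
}

// Returns true if a failed safe_assign() threw and left its target unchanged.
bool target_survives_failure()
{
    std::vector<Fragile> a(2, Fragile(1));
    std::vector<Fragile> b(3, Fragile(2));
    Fragile::fail = true;
    bool threw = false;
    try {
        safe_assign(a, b);
    }
    catch (const std::runtime_error&) {
        threw = true;
    }
    Fragile::fail = false;
    return threw && a.size() == 2 && a[0].value == 1 && a[1].value == 1;
}
```

The exception is thrown while temp is being built, so the swap is never reached and a keeps both its size and its element values.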
E.3.4 push_back()
From an exception-safety point of view, push_back() is similar to the assignment in that we must
take care that the vector remains unchanged if we fail to add a new element:
    template<class T, class A>
    void vector<T,A>::push_back(const T& x)
    {
        if (space == last) {    // no more free space; relocate:
            vector_base<T,A> b(alloc,size()?2*size():2);    // double the allocation
            uninitialized_copy(v,space,b.v);
            b.space = b.v+size();    // the constructor set b.space to b.last; point it past the copied elements
            new(b.space) T(x);       // place a copy of x in *b.space (§10.4.11)
            ++b.space;
            destroy_elements();
            swap<vector_base<T,A> >(b,*this);    // swap representations
            return;
        }
        new(space) T(x);    // place a copy of x in *space (§10.4.11)
        ++space;
    }
Naturally, the copy constructor used to initialize *space might throw an exception. If that happens,
the value of the vector remains unchanged, with space left unincremented. In that case, the vector
elements are not reallocated so that iterators referring to them are not invalidated. Thus, this implementation implements the strong guarantee that an exception thrown by an allocator or even a
user-supplied copy constructor leaves the vector unchanged. The standard library offers that guarantee for push_back() (§E.4.1).
Note the absence of a try-block (except for the one hidden in uninitialized_copy()). The
update was done by carefully ordering the operations so that if an exception is thrown, the vector
remains unchanged.
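The guarantee can be checked against the standard-library vector. The sketch below is an illustration under assumptions: Tracked and push_back_is_all_or_nothing() are hypothetical names. Tracked's copy constructor throws once a budget of copies is exhausted, and the vector is first filled to capacity so that the next push_back() must relocate its elements.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical element type (illustration only): copying throws once the
// budget of copies is used up.
struct Tracked {
    static int budget;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& x) : value(x.value)
    {
        if (budget == 0) throw std::runtime_error("copy failed");
        --budget;
    }
    Tracked& operator=(const Tracked& x) { value = x.value; return *this; }
};
int Tracked::budget = 0;

// Returns true if a push_back() that fails during relocation threw and
// left the vector exactly as it was.
bool push_back_is_all_or_nothing()
{
    std::vector<Tracked> v;
    Tracked::budget = 1000;
    v.reserve(4);
    int next = 0;
    while (v.size() < v.capacity()) v.push_back(Tracked(next++));    // fill to capacity

    const std::size_t old_size = v.size();
    const Tracked* old_data = &v[0];
    Tracked::budget = 1;    // relocation needs more copies than this
    bool threw = false;
    try {
        v.push_back(Tracked(-1));
    }
    catch (const std::runtime_error&) {
        threw = true;
    }
    Tracked::budget = 1000;

    bool unchanged = v.size() == old_size && &v[0] == old_data;
    for (std::size_t i = 0; unchanged && i != v.size(); ++i)
        unchanged = v[i].value == static_cast<int>(i);
    return threw && unchanged;
}
```

After the failed call, the size, the element values, and even the address of the first element are what they were before, so pointers and iterators into the vector remain usable.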
The approach of gaining exception safety through ordering and the ‘‘resource acquisition is
initialization’’ technique (§14.4) tends to be more elegant and more efficient than explicitly handling errors using try-blocks. More problems with exception safety arise from a programmer ordering code in unfortunate ways than from lack of specific exception-handling code. The basic rule of
ordering is not to destroy information before its replacement has been constructed and can be
assigned without the possibility of an exception.
Exceptions introduce possibilities for surprises in the form of unexpected control flows. For a
piece of code with a simple local control flow, such as the operator=(), safe_assign(), and
push_back() examples, the opportunities for surprises are limited. It is relatively simple to look at
such code and ask oneself ‘‘can this line of code throw an exception, and what happens if it does?’’
For large functions with complicated control structures, such as complicated conditional statements
and nested loops, this can be hard. Adding try-blocks increases this local control structure complexity and can therefore be a source of confusion and errors (§14.4). I conjecture that the effectiveness of the ordering approach and the ‘‘resource acquisition is initialization’’ approach compared to more extensive use of try-blocks stems from the simplification of the local control flow.
Simple, stylized code is easier to understand and easier to get right.
Note that the vector implementation is presented as an example of the problems that exceptions
can pose and of techniques for addressing those problems. The standard does not require an implementation to be exactly like the one presented here. What the standard does guarantee is the subject of §E.4.
E.3.5 Constructors and Invariants
From the point of view of exception safety, other vector operations are either equivalent to the ones
already examined (because they acquire and release resources in similar ways) or trivial (because
they don’t perform operations that require cleverness to maintain valid states). However, for most
classes, such ‘‘trivial’’ functions constitute the majority of code. The difficulty of writing such
functions depends critically on the environment that a constructor established for them to operate
in. Said differently, the complexity of ‘‘ordinary member functions’’ depends critically on choosing a good class invariant (§24.3.7.1). By examining the ‘‘trivial’’ vector functions, it is possible
to gain insight into the interesting question of what makes a good invariant for a class and how constructors should be written to establish such invariants.
Operations such as vector subscripting (§16.3.3) are easy to write because they can rely on the
invariant established by the constructors and maintained by all functions that acquire or release
resources. In particular, a subscript operator can rely on v referring to an array of elements:
    template<class T, class A>
    T& vector<T,A>::operator[](size_type i)
    {
        return v[i];
    }
It is important and fundamental to have constructors acquire resources and establish a simple
invariant. To see why, consider an alternative definition of vector_base:
    template<class T, class A = allocator<T> >    // clumsy use of constructor
    class vector_base {
    public:
        A alloc;     // allocator
        T* v;        // start of allocation
        T* space;    // end of element sequence, start of space allocated for possible expansion
        T* last;     // end of allocated space

        vector_base(const A& a, typename A::size_type n) : alloc(a), v(0), space(0), last(0)
        {
            v = alloc.allocate(n);
            space = last = v+n;
        }
        ~vector_base() { if (v) alloc.deallocate(v,last-v); }
    };
Here, I construct a vector_base in two stages: First, I establish a ‘‘safe state’’ where v, space, and
last are set to 0. Only after that has been done do I try to allocate memory. This is done out of
misplaced fear that if an exception happens during element allocation, a partially constructed object
could be left behind. This fear is misplaced because a partially constructed object cannot be ‘‘left
behind’’ and later accessed. The rules for static objects, automatic objects, member objects, and
elements of the standard-library containers prevent that. However, it could/can happen in pre-standard libraries that used/use placement new (§10.4.11) to construct objects in containers
designed without concern for exception safety. Old habits can be hard to break.
Note that this attempt to write safer code complicates the invariant for the class: It is no longer
guaranteed that v points to allocated memory. Now v might be 0. This has one immediate cost.
The standard-library requirements for allocators do not guarantee that we can safely deallocate a
pointer with the value 0 (§19.4.1). In this, allocators differ from delete (§6.2.6). Consequently, I
had to add a test in the destructor. Also, each element is first initialized and then assigned. The
cost of doing that extra work can be significant for element types for which assignment is nontrivial, such as string and list.
This two-stage construct is not an uncommon style. Sometimes, it is even made explicit by
having the constructor do only some ‘‘simple and safe’’ initialization to put the object into a
destructible state. The real construction is left to an init() function that the user must explicitly
call. For example:
    template<class T>    // archaic (pre-standard, pre-exception) style
    class vector_base {
    public:
        T* v;        // start of allocation
        T* space;    // end of element sequence, start of space allocated for possible expansion
        T* last;     // end of allocated space

        vector_base() : v(0), space(0), last(0) { }
        ~vector_base() { free(v); }

        bool init(size_t n)    // return true if initialization succeeded
        {
            if (v = (T*)malloc(sizeof(T)*n)) {
                uninitialized_fill(v,v+n,T());
                space = last = v+n;
                return true;
            }
            return false;
        }
    };
The perceived value of this style is
[1] The constructor can’t throw an exception, and the success of an initialization using init()
can be tested by ‘‘usual’’ (that is, non-exception) means.
[2] There exists a trivial valid state. In case of a serious problem, an operation can give an
object that state.
[3] The acquisition of resources is delayed until a fully initialized object is actually needed.
The following subsections examine these points and show why this two-stage construction technique doesn’t deliver its expected benefits. It can also be a source of problems.
E.3.5.1 Using init() Functions
The first point (using an init() function in preference to a constructor) is bogus. Using constructors and exception handling is a more general and systematic way of dealing with resource acquisition and initialization errors (§14.1, §14.4). This style is a relic of pre-exception C++.
Carefully written code using the two styles is roughly equivalent. Consider:
    int f1(int n)
    {
        vector<X> v;
        // ...
        if (v.init(n)) {
            // use v as vector of n elements
        }
        else {
            // handle_problem
        }
    }
and
    int f2(int n)
    try {
        vector<X> v(n);
        // ...
        // use v as vector of n elements
    }
    catch (...) {
        // handle problem
    }
However, having a separate init() function is an opportunity to
[1] forget to call init() (§10.2.3),
[2] forget to test on the success of init(),
[3] call init() more than once,
[4] forget that init() might throw an exception, and
[5] use the object before calling init().
The definition of vector<T>::init() illustrates [4].
In a good C++ implementation, f2() will be marginally faster than f1() because it avoids the
test in the common case.
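The difference in the burden placed on users can be sketched with two small classes. Both are hypothetical and exist only for this comparison: Two_stage_buffer follows the init() style, and Buffer acquires its memory in its constructor (here by delegating to std::vector, an assumption made to keep the example short).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Two-stage style (hypothetical class): the constructor only establishes
// a destructible state; init() does the real work.
class Two_stage_buffer {
    int* p;
    std::size_t n;
    Two_stage_buffer(const Two_stage_buffer&) = delete;
    Two_stage_buffer& operator=(const Two_stage_buffer&) = delete;
public:
    Two_stage_buffer() : p(0), n(0) {}
    ~Two_stage_buffer() { delete[] p; }

    bool init(std::size_t sz)    // return true if initialization succeeded
    {
        if (p) return false;     // guard against a second call
        p = new(std::nothrow) int[sz]();
        if (!p) return false;
        n = sz;
        return true;
    }
    bool usable() const { return p != 0; }    // every user has to ask
    std::size_t size() const { return n; }
};

// Constructor style (hypothetical class): an object exists only if the
// acquisition succeeded, so there is no "not yet initialized" state.
class Buffer {
    std::vector<int> elem;
public:
    explicit Buffer(std::size_t sz) : elem(sz) {}    // throws std::bad_alloc on failure
    std::size_t size() const { return elem.size(); }
};
```

Two_stage_buffer needs a guard against repeated init() calls and a usable() query that callers must remember to use. Buffer needs neither, because its invariant holds from the moment the constructor completes.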
E.3.5.2 Relying on a Default Valid State
The second point (having an easy-to-construct ‘‘default’’ valid state) is correct in general, but in the
case of vector, it is achieved at an unnecessary cost. It is now possible to have a vector_base with
v==0, so the vector implementation must protect against that possibility throughout. For example:
    template<class T>
    T& vector<T>::operator[](size_t i)
    {
        if (v) return v[i];
        // handle error
    }
Leaving the possibility of v==0 open makes the cost of non-range-checked subscripting equivalent
to range-checked access:
    template< class T>
    T& vector<T>::at(size_t i)
    {
        if (i<v.size()) return v[i];
        throw out_of_range("vector index");
    }
What fundamentally happened here was that I complicated the basic invariant for vector_base by
introducing the possibility of v==0. In consequence, the basic invariant for vector was similarly
complicated. The end result of this is that all code in vector and vector_base must be more complicated to cope. This is a source of potential errors, maintenance problems, and run-time overhead. Note that conditional statements can be surprisingly costly on modern machine architectures.
Where efficiency matters, it can be crucial to implement a key operation, such as vector subscripting, without conditional statements.
Interestingly, the original definition of vector_base already did have an easy-to-construct valid
state. No vector_base object could exist unless the initial allocation succeeded. Consequently, the
implementer of vector could write an ‘‘emergency exit’’ function like this:
    template< class T, class A>
    void vector<T,A>::emergency_exit()
    {
        space = v;    // set the size of *this to 0
        throw Total_failure();
    }
This is a bit drastic because it fails to call element destructors and to deallocate the space for elements held by the vector_base. That is, it fails to provide the basic guarantee (§E.2). If we are
willing to trust the values of v and space and the element destructors, we can avoid potential
resource leaks:
    template< class T, class A>
    void vector<T,A>::emergency_exit()
    {
        destroy_elements();    // clean up
        throw Total_failure();
    }
Please note that the standard vector is such a clean design that it minimizes the problems caused by
two-phase construction. The init() function is roughly equivalent to resize(), and in most places
the possibility of v==0 is already covered by size()==0 tests. The negative effects described for
two-phase construction become more marked when we consider application classes that acquire
significant resources, such as network connections and files. Such classes are rarely part of a
framework that guides their use and their implementation in the way the standard-library requirements guide the definition and use of vector. The problems also tend to increase as the mapping
between the application concepts and the resources required to implement them becomes more
complex. Few classes map as directly onto system resources as does vector.
The idea of having a ‘‘safe state’’ is in principle a good one. If we can’t put an object into a
valid state without fear of throwing an exception before completing that operation, we do indeed
have a problem. However, this ‘‘safe state’’ should be one that is a natural part of the semantics of
the class rather than an implementation artifact that complicates the class invariant.
E.3.5.3 Delaying Resource Acquisition
Like the second point (§E.3.5.2), the third (to delay acquisition until a resource is needed) misapplies a good idea in a way that imposes cost without yielding benefits. In many cases, notably in
containers such as vector, the best way of delaying resource acquisition is for the programmer to
delay the creation of objects until they are needed. Consider a naive use of vector:
    void f(int n)
    {
        vector<X> v(n);    // make n default objects of type X
        // ...
        v[3] = X(99);      // real ‘‘initialization’’ of v[3]
        // ...
    }
Constructing an X only to assign a new value to it later is wasteful – especially if an X assignment
is expensive. Therefore, two-phase construction of X can seem attractive. For example, the type X
may itself be a vector, so we might consider two-phase construction of vector to optimize creation
of empty vectors. However, creating default (empty) vectors is already efficient, so complicating
the implementation with a special case for the empty vector seems futile. More generally, the best
solution to spurious initialization is rarely to remove complicated initialization from the element
constructors. Instead, a user can create elements only when needed. For example:
    void f2(int n)
    {
        vector<X> v;           // make empty vector
        // ...
        v.push_back(X(99));    // construct element when needed
        // ...
    }
To sum up: the two-phase construction approach leads to more complicated invariants and typically
to less elegant, more error-prone, and harder-to-maintain code. Consequently, the language-supported ‘‘constructor approach’’ should be preferred to the ‘‘init()-function approach’’ whenever feasible. That is, resources should be acquired in constructors whenever delayed resource
acquisition isn’t mandated by the inherent semantics of a class.
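The cost difference between the two uses of vector can be made visible with a small instrumented sketch. The element type X below, its counters, and the function names eager() and delayed() are hypothetical and serve only as illustration; the counts observed may vary slightly between library implementations and language versions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type that counts constructions and assignments.
struct X {
    static int constructions;
    static int assignments;
    int value;
    X() : value(0) { ++constructions; }
    explicit X(int v) : value(v) { ++constructions; }
    X(const X& other) : value(other.value) { ++constructions; }
    X& operator=(const X& other)
    {
        value = other.value;
        ++assignments;
        return *this;
    }
};
int X::constructions = 0;
int X::assignments = 0;

// Eager style: make n default elements, then overwrite one (requires n > 3).
std::vector<X> eager(std::size_t n)
{
    std::vector<X> v(n);
    v[3] = X(99);
    return v;
}

// Delayed style: start empty and construct an element only when it is needed.
std::vector<X> delayed()
{
    std::vector<X> v;
    v.push_back(X(99));
    return v;
}
```

Calling eager(10) constructs at least eleven X objects and performs one assignment, whereas delayed() constructs only the temporary and its copy in the vector.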
E.4 Standard Container Guarantees
If a library operation itself throws an exception, it can – and does – make sure that the objects on
which it operates are left in a well-defined state. For example, at() throwing out_of_range for a
vector (§16.3.3) is not a problem with exception safety for the vector. The writer of at() has no
problem making sure that a vector is in a well-defined state before throwing. The problems – for
library implementers, for library users, and for people trying to understand code – come when a
user-supplied function throws an exception.
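As a minimal sketch of the first case, the helper below (a hypothetical name, not part of the library) checks that a vector is exactly as it was after at() has thrown out_of_range:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if v.at(i) threw out_of_range and v kept its size and elements.
// Returns false if the index was valid, in which case v[i] has been set to -1.
bool at_leaves_vector_intact(std::vector<int>& v, std::size_t i)
{
    const std::vector<int> before = v;   // snapshot for comparison
    try {
        v.at(i) = -1;
    }
    catch (const std::out_of_range&) {
        return v == before;
    }
    return false;
}
```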
The standard-library containers offer the basic guarantee (§E.2): The basic invariants of the
library are maintained, and no resources are leaked as long as user code behaves as required. That
is, user-supplied operations should not leave container elements in invalid states or throw exceptions from destructors. By ‘‘operations,’’ I mean operations used by the standard-library implementation, such as constructors, assignments, destructors, and operations on iterators (§E.4.4).
It is relatively easy for the programmer to ensure that such operations meet the library’s expectations. In fact, much naively written code conforms to the library’s requirements. The following
types clearly meet the standard library’s requirements for container element types:
[1] built-in types – including pointers,
[2] types without user-defined operations,
[3] classes with operations that neither throw exceptions nor leave operands in invalid states,
[4] classes with destructors that don’t throw exceptions and for which it is simple to verify that
operations used by the standard library (such as constructors, assignments, <, ==, and
swap()) don’t leave operands in invalid states.
In each case, we must also make sure that no resource is leaked. For example:
    void f(Circle* pc, Triangle* pt, vector<Shape*>& v2)
    {
        vector<Shape*> v(10);        // either create vector or throw bad_alloc
        v[3] = pc;                   // no exception thrown
        v.insert(v.begin()+4,pt);    // either insert pt or no effect on v
        v2.erase(v2.begin()+3);      // either erase v2[3] or no effect on v2
        v2 = v;                      // copy v or no effect on v2
        // ...
    }
When f() exits, v will be properly destroyed, and v2 will be in a valid state. This fragment does
not indicate who is responsible for deleting pc and pt. If f() is responsible, it can either catch
exceptions and do the required deletion, or assign the pointers to local auto_ptrs.
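A sketch of the second alternative follows. It assumes a C++11 or later compiler and therefore uses std::unique_ptr, which took over the role of auto_ptr in later standards. The Shape hierarchy, the live counter, and the fail parameter are hypothetical and exist only to make the cleanup observable.

```cpp
#include <memory>
#include <stdexcept>
#include <vector>

struct Shape {
    static int live;              // number of Shape objects currently alive
    Shape() { ++live; }
    virtual ~Shape() { --live; }
};
int Shape::live = 0;
struct Circle : Shape {};
struct Triangle : Shape {};

// f() takes over responsibility for deleting pc and pt.
void f(Circle* pc, Triangle* pt, bool fail)
{
    std::unique_ptr<Circle> owner_c(pc);      // deleted on every exit path
    std::unique_ptr<Triangle> owner_t(pt);

    std::vector<Shape*> v(10);                // may throw bad_alloc
    v[3] = pc;                                // v only refers to the objects
    v.insert(v.begin() + 4, pt);
    if (fail) throw std::runtime_error("simulated failure");
}
```

Whether f() returns normally or exits by an exception, both shapes are deleted exactly once by the local owners.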
The more interesting question is: When do the library operations offer the strong guarantee that
an operation either succeeds or has no effect on its operands? For example:
    void f(vector<X>& vx)
    {
        vx.insert(vx.begin()+4,X(7));    // add element
    }
In general, X’s operations and vector<X>’s allocator can throw an exception. What can we say
about the elements of vx when f() exits because of an exception? The basic guarantee ensures that
no resources have been leaked and that vx has a set of valid elements. However, exactly what elements? Is vx unchanged? Could a default X have been added? Could an element have been
removed because that was the only way for insert() to recover while maintaining the basic guarantee? Sometimes, it is not enough to know that a container is in a good state; we also want to know
exactly what state that is. After catching an exception, we typically want to know that the elements
are exactly those we intended, or we will have to start error recovery.
E.4.1 Insertion and Removal of Elements
Inserting an element into a container and removing one are obvious examples of operations that
might leave a container in an unpredictable state if an exception is thrown. The reason is that insertions and deletions invoke many operations that may throw exceptions:
[1] A new value is copied into a container.
[2] An element deleted from (erased from) a container must be destroyed.
[3] Sometimes, memory must be allocated to hold a new element.
[4] Sometimes, vector and deque elements must be copied to new locations.
[5] Associative containers call comparison functions for elements.
[6] Many insertions and deletions involve iterator operations.
Each of these cases can cause an exception to be thrown.
If a destructor throws an exception, no guarantees are made (§E.2). Making guarantees in this
case would be prohibitively expensive. However, the library can and does protect itself – and its
users – from exceptions thrown by other user-supplied operations.
When manipulating a linked data structure, such as a list or a map, elements can be added and
removed without affecting other elements in the container. This is not the case for a container
implemented using contiguous allocation of elements, such as a vector or a deque. There, elements
sometimes need to be moved to new locations.
In addition to the basic guarantee, the standard library offers the strong guarantee for a few
operations that insert or remove elements. Because containers implemented as linked data structures behave differently from containers with contiguous allocation of elements, the standard provides slightly different guarantees for different kinds of containers:
[1] Guarantees for vector (§16.3) and deque (§17.2.3):
– If an exception is thrown by a push_back() or a push_front(), that function has no
effect.
– Unless thrown by the copy constructor or the assignment operator of the element type, if
an exception is thrown by an insert(), that function has no effect.
– Unless thrown by the copy constructor or the assignment operator of the element type,
no erase() throws an exception.
– No pop_back() or pop_front() throws an exception.
[2] Guarantees for list (§17.2.2):
– If an exception is thrown by a push_back() or a push_front(), that function has no
effect.
– If an exception is thrown by an insert(), that function has no effect.
– No erase(), pop_back(), pop_front(), splice(), or reverse() throws an exception.
– Unless thrown by a predicate or a comparison function, the list member functions
remove(), remove_if(), unique(), sort(), and merge() do not throw exceptions.
[3] Guarantees for associative containers (§17.4):
– If an exception is thrown by an insert() while inserting a single element, that function
has no effect.
– No erase() throws an exception.
Note that where the strong guarantee is provided for an operation on a container, all iterators,
pointers to elements, and references to elements remain valid if an exception is thrown.
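The push_back() guarantee can be observed with an element type whose copy constructor is made to fail on demand. This is only a sketch: the type X, its fail_copies switch, and the helper try_push_back() are hypothetical names introduced for illustration.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical element type whose copy constructor can be made to fail.
struct X {
    static bool fail_copies;
    int value;
    explicit X(int v) : value(v) {}
    X(const X& other) : value(other.value)
    {
        if (fail_copies) throw std::runtime_error("X: cannot copy");
    }
    X& operator=(const X&) = default;
};
bool X::fail_copies = false;

// Tries to append x; reports whether the push_back() completed.
bool try_push_back(std::vector<X>& v, const X& x)
{
    try {
        v.push_back(x);
        return true;
    }
    catch (const std::runtime_error&) {
        return false;   // strong guarantee: v is as it was before the call
    }
}
```

After a failed call, the size of the vector and the values of its elements are the same as before the call, whether or not push_back() had to reallocate.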
These rules can be summarized in a table:
                        Container-Operation Guarantees
_____________________________________________________________________________________________________
                      vector            deque             list                    map
_____________________________________________________________________________________________________
clear()               nothrow (copy)    nothrow (copy)    nothrow                 nothrow
erase()               nothrow (copy)    nothrow (copy)    nothrow                 nothrow
1-element insert()    strong (copy)     strong (copy)     strong                  strong
N-element insert()    strong (copy)     strong (copy)     strong                  basic
merge()               -                 -                 nothrow (comparison)    -
push_back()           strong            strong            strong                  -
push_front()          -                 strong            strong                  -
pop_back()            nothrow           nothrow           nothrow                 -
pop_front()           -                 nothrow           nothrow                 -
remove()              -                 -                 nothrow (comparison)    -
remove_if()           -                 -                 nothrow (predicate)     -
reverse()             -                 -                 nothrow                 -
splice()              -                 -                 nothrow                 -
swap()                nothrow           nothrow           nothrow                 nothrow (copy-of-comparison)
unique()              -                 -                 nothrow (comparison)    -
_____________________________________________________________________________________________________
In this table:
    basic      means that the operation provides only the basic guarantee (§E.2)
    strong     means that the operation provides the strong guarantee (§E.2)
    nothrow    means that the operation does not throw an exception (§E.2)
    -          means that the operation is not provided as a member of this container
Where a guarantee requires that some user-supplied operations not throw exceptions, those
operations are indicated in parentheses after the guarantee. These requirements are precisely
stated in the text preceding the table.
The swap() functions differ from the other functions mentioned by not being members.
The guarantee for clear() is derived from that offered by erase() (§16.3.6). This table lists
guarantees offered in addition to the basic guarantee. Consequently, this table does not list operations, such as reverse() and unique() for vector, that are provided only as algorithms for all
sequences without additional guarantees.
The ‘‘almost container’’ basic_string (§17.5, §20.3) offers the basic guarantee for all operations (§E.5.1). The standard also guarantees that basic_string’s erase() and swap() don’t
throw, and offers the strong guarantee for basic_string’s insert() and push_back().
In addition to ensuring that a container is unchanged, an operation providing the strong
guarantee also leaves all iterators, pointers, and references valid. For example:
    void update(map<string,X>& m, map<string,X>::iterator current)
    {
        X x;
        string s;
        while (cin>>s>>x)
        try {
            current = m.insert(current,make_pair(s,x));
        }
        catch(...) {
            // here current still denotes the current element
        }
    }
E.4.2 Guarantees and Tradeoffs
The patchwork of additional guarantees reflects implementation realities. Programmers prefer
the strong guarantee with as few conditions as possible, but they also tend to insist that each
individual standard-library operation be optimally efficient. Both concerns are reasonable, but
for many operations, it is not possible to satisfy both simultaneously. To give a better idea of
the tradeoffs involved, I’ll examine ways of adding single and multiple elements to lists,
vectors, and maps.
Consider adding a single element to a list or a vector. As ever, push_back() provides the
simplest way of doing that:
    void f(list<X>& lst, vector<X>& vec, const X& x)
    {
        try {
            lst.push_back(x);    // add to list
        }
        catch (...) {
            // lst is unchanged
            return;
        }
        try {
            vec.push_back(x);    // add to vector
        }
        catch (...) {
            // vec is unchanged
            return;
        }
        // lst and vec each have a new element with the value x
    }
Providing the strong guarantee in these cases is simple and cheap. It is also very useful because
it provides a completely exception-safe way of adding elements. However, push_back() isn’t
defined for associative containers – a map has no back(). After all, the last element of an
associative container is defined by the order relation rather than by position.
The guarantees for insert() are a bit more complicated. The reason is that sometimes
insert() has to place an element in ‘‘the middle’’ of a container. This is no problem for a
linked data structure, such as list or map. However, if there is free reserved space in a vector,
the obvious implementation of vector<X>::insert() copies the elements after the insertion
point to make room. This is optimally efficient, but there is no simple way of restoring a vector
if X’s copy assignment or copy constructor throws an exception (see §E.8[10-11]). Consequently, vector provides a guarantee that is conditional upon element copy operations not
throwing exceptions. However, list and map don’t need such a condition; they can simply link
in new elements after doing any necessary copying.
As an example, assume that X’s copy assignment and copy constructor throw
X::cannot_copy if they cannot successfully create a copy:
    void f(list<X>& lst, vector<X>& vec, map<string,X>& m, const X& x, const string& s)
    {
        try {
            lst.insert(lst.begin(),x);    // add to list
        }
        catch (...) {
            // lst is unchanged
            return;
        }
        try {
            vec.insert(vec.begin(),x);    // add to vector
        }
        catch (X::cannot_copy) {
            // oops: vec may or may not have a new element
            return;
        }
        catch (...) {
            // vec is unchanged
            return;
        }
        try {
            m.insert(make_pair(s,x));     // add to map
        }
        catch (...) {
            // m is unchanged
            return;
        }
        // lst and vec each have a new element with the value x
        // m has an element with the value (s,x)
    }
If X::cannot_copy is caught, a new element may or may not have been inserted into vec. If a
new element was inserted, it will be an object in a valid state, but it is unspecified exactly what
the value is. It is possible that after X::cannot_copy, some element will have been ‘‘mysteriously’’ duplicated (see §E.8[11]). Alternatively, insert() may be implemented so that it deletes
some ‘‘trailing’’ elements to be certain that no invalid elements are left in a container.
Unfortunately, providing the strong guarantee for vector’s insert() without the caveat
about exceptions thrown by copy operations is not feasible. The cost of completely protecting
against an exception while moving elements in a vector would be significant compared to simply providing the basic guarantee in that case.
Element types with copy operations that can throw exceptions are not uncommon. Examples from the standard library itself are vector<string>, vector< vector<double> >, and
map<string,int>.
The list and vector containers provide the same guarantees for insert() of single and multiple elements. The reason is simply that for vector and list, the same implementation strategies
apply to both single-element and multiple-element insert(). However, map provides the
strong guarantee for single-element insert(), but only the basic guarantee for multiple-element
insert(). A single-element insert() for map that provides the strong guarantee is easily
implemented. However, the obvious strategy for implementing multiple-element insert() for a
map is to insert the new elements one after another, and it is not easy to provide the strong guarantee for that. The problem with this is that there is no simple way of backing out of previous
successful insertions if the insertion of an element fails.
If we want an insertion function that provides the strong guarantee that either every element
was successfully added or the operation had no effect, we can build it by constructing a new
container and then swap():
    template<class C, class Iter>
    void safe_insert(C& c, typename C::const_iterator i, Iter begin, Iter end)
    {
        const C& cc = c;                            // const view: begin() and end() have the same type as i
        C tmp(cc.begin(),i);                        // copy leading elements to temporary
        copy(begin,end,inserter(tmp,tmp.end()));    // copy new elements
        copy(i,cc.end(),inserter(tmp,tmp.end()));   // copy trailing elements
        swap(c,tmp);
    }
As ever, this code may misbehave if the element destructor throws an exception. However, if
an element copy operation throws an exception, the argument container is unchanged.
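To see that behavior, the function can be exercised with an element type that fails after a chosen number of copies. The following is a sketch under stated assumptions: the type X and its copies_until_failure counter are hypothetical, the iterator parameters are named first and last, and the function views the container through a const reference so that all iterators passed to the constructor and to copy() have the same type.

```cpp
#include <algorithm>
#include <iterator>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical element type: copying fails once the countdown reaches zero.
struct X {
    static int copies_until_failure;   // negative: never fail
    int value;
    explicit X(int v) : value(v) {}
    X(const X& other) : value(other.value)
    {
        if (copies_until_failure == 0) throw std::runtime_error("X: cannot copy");
        if (copies_until_failure > 0) --copies_until_failure;
    }
    X& operator=(const X&) = default;
};
int X::copies_until_failure = -1;

// Build the result in a temporary, then swap: c changes only if every copy succeeded.
template<class C, class Iter>
void safe_insert(C& c, typename C::const_iterator i, Iter first, Iter last)
{
    const C& cc = c;                                          // const view of c
    C tmp(cc.begin(), i);                                     // copy leading elements
    std::copy(first, last, std::inserter(tmp, tmp.end()));    // copy new elements
    std::copy(i, cc.end(), std::inserter(tmp, tmp.end()));    // copy trailing elements
    using std::swap;
    swap(c, tmp);                                             // exchanges handles only
}
```

If any copy throws, only tmp is affected, and tmp is destroyed as the exception leaves safe_insert(). The price is that every element is copied once, even those that an in-place insert() would not have touched.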
E.4.3 Swap
Like copy constructors and assignments, swap() operations are essential to many standard
algorithms and are often supplied by users. For example, sort() and stable_sort() typically
reorder elements, using swap(). Thus, if a swap() function throws an exception while
exchanging values from a container, the container could be left with unchanged elements or a
duplicate element rather than a pair of swapped elements.
Consider the obvious definition of the standard-library swap() function (§18.6.8):
    template<class T> void swap(T& a, T& b)
    {
        T tmp = a;
        a = b;
        b = tmp;
    }
Clearly, swap() doesn’t throw an exception unless the element type’s copy constructor or copy
assignment does.
With one minor exception for associative containers, standard container swap() functions
are guaranteed not to throw exceptions. Basically, containers are swapped by exchanging the
data structures that act as handles for the elements (§13.5, §17.1.3). Since the elements themselves are not moved, element constructors and assignments are not invoked, so they don’t get
an opportunity to throw an exception. In addition, the standard guarantees that no standard-library swap() function invalidates any references, pointers, or iterators referring to the elements of the containers being swapped. This leaves only one potential source of exceptions:
The comparison object in an associative container is copied as part of the handle. Thus, the only possible source of an exception from a swap() of standard containers is the copy constructor or assignment of
the container’s comparison object (§17.1.4.1). Fortunately, comparison objects usually have
trivial copy operations that do not have opportunities to throw exceptions.
A user-supplied swap() should be written to provide the same guarantees. This is relatively simple to do as long as one remembers to swap types represented as handles by swapping
their handles, rather than slowly and elaborately copying the information referred to by the handles (§13.5, §16.3.9, §17.1.3).
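A minimal sketch of such a swap() for a handle class might look like this. The class Buffer and its members are hypothetical, and the noexcept specifier assumes a C++11 or later compiler.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical handle class: owns a heap-allocated array through a pointer.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    Buffer(const Buffer& other) : data_(new int[other.size_]), size_(other.size_)
    {
        for (std::size_t i = 0; i != size_; ++i) data_[i] = other.data_[i];
    }
    Buffer& operator=(Buffer other)      // copy-and-swap assignment
    {
        swap(other);
        return *this;
    }
    ~Buffer() { delete[] data_; }

    // Exchange the handles only; no element is copied, so nothing can throw.
    void swap(Buffer& other) noexcept
    {
        std::swap(data_, other.data_);
        std::swap(size_, other.size_);
    }

    std::size_t size() const { return size_; }
    int& operator[](std::size_t i) { return data_[i]; }
    const int* data() const { return data_; }

private:
    int* data_;
    std::size_t size_;
};

// Non-member swap(), found by argument-dependent lookup.
inline void swap(Buffer& a, Buffer& b) noexcept { a.swap(b); }
```

Swapping two Buffers exchanges two pointers and two sizes, however many elements the buffers hold, and pointers to the elements remain valid.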
E.4.4 Initialization and Iterators
Allocation of memory for elements and the initialization of such memory are fundamental parts
of every container implementation (§E.3). Consequently, the standard algorithms for constructing objects in uninitialized memory – uninitialized_fill(), uninitialized_fill_n(), and
uninitialized_copy() (§19.4.4) – are guaranteed to leave no constructed objects behind if they
throw an exception. They provide the strong guarantee (§E.2). This sometimes involves
destroying elements, so the requirement that destructors not throw exceptions is essential to
these algorithms; see §E.8[14]. In addition, the iterators supplied as arguments to these algorithms are required to be well behaved. That is, they must be valid iterators, refer to valid
sequences, and iterator operations (such as ++ and != and *) on a valid iterator are not allowed
to throw exceptions.
Iterators are examples of objects that are copied freely by standard algorithms and operations on standard containers. Thus, copy constructors and copy assignments of iterators should
not throw exceptions. In particular, the standard guarantees that no copy constructor or assignment operator of an iterator returned from a standard container throws an exception. For example, an iterator returned by vector<T>::begin() can be copied without fear of exceptions.
Note that ++ and -- on an iterator can throw exceptions. For example, an
istreambuf_iterator (§19.2.6) could reasonably throw an exception to indicate an input error,
and a range-checked iterator could throw an exception to indicate an attempt to move outside its
valid range (§19.3). However, they cannot throw exceptions when moving an iterator from one
element of a sequence to another, without violating the definition of ++ and -- on an iterator.
Thus, uninitialized_fill(), uninitialized_fill_n(), and uninitialized_copy() assume that ++
and -- on their iterator arguments will not throw; if they do throw, either those ‘‘iterators’’
weren’t iterators according to the standard, or the ‘‘sequence’’ specified by them wasn’t a
sequence. Again, the standard containers do not protect the user from the user’s own undefined
behavior (§E.2).
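The guarantee offered by uninitialized_copy() can be observed by counting live objects. In this sketch, the type X, its counters, and the function copy_into_raw_memory() are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <stdexcept>

// Hypothetical element type that tracks live objects and can fail on copy.
struct X {
    static int live;
    static int copies_until_failure;   // negative: never fail
    X() { ++live; }
    X(const X&)
    {
        if (copies_until_failure == 0) throw std::runtime_error("X: cannot copy");
        if (copies_until_failure > 0) --copies_until_failure;
        ++live;
    }
    ~X() { --live; }
};
int X::live = 0;
int X::copies_until_failure = -1;

// Copies n objects from src into raw memory. Returns n on success; on failure,
// returns the number of X objects left alive in that memory (expected: none).
std::size_t copy_into_raw_memory(const X* src, std::size_t n)
{
    void* raw = ::operator new(n * sizeof(X));
    X* dest = static_cast<X*>(raw);
    const int live_before = X::live;
    std::size_t constructed = 0;
    try {
        std::uninitialized_copy(src, src + n, dest);
        constructed = n;
        for (std::size_t i = 0; i != n; ++i) dest[i].~X();   // normal cleanup
    }
    catch (const std::runtime_error&) {
        // uninitialized_copy() has already destroyed what it had constructed
        constructed = static_cast<std::size_t>(X::live - live_before);
    }
    ::operator delete(raw);
    return constructed;
}
```

Because no constructed objects are left behind after a failure, the caller needs only to release the raw memory.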
E.4.5 References to Elements
When a reference, a pointer, or an iterator to an element of a container is handed to some code,
that code can corrupt the container by corrupting the element. For example:
    void f(const X& x)
    {
        list<X> lst;
        lst.push_back(x);
        list<X>::iterator i = lst.begin();
        *i = x;    // copy x into list
        // ...
    }
If x is corrupted, list’s destructor may not be able to properly destroy lst. For example:
    struct X {
        int* p;
        X() { p = new int; }
        ~X() { delete p; }
        // ...
    };
    void malicious()
    {
        X x;
        x.p = reinterpret_cast<int*>(7);    // corrupt x
        f(x);                               // time bomb
    }
When the execution reaches the end of f(), the list<X> destructor is called, and that will in
turn invoke X’s destructor for the corrupted value. The effect of executing delete p when p
isn’t 0 and doesn’t point to an int allocated by new is undefined and could be an immediate crash. Alternatively, it
might leave the free store corrupted in a way that causes difficult-to-track problems much later
on in an apparently unrelated part of a program.
This possibility of corruption should not stop people from manipulating container elements
through references and iterators; it is often the simplest and most efficient way of doing things.
However, it is wise to take extra care with such references into containers. When the integrity
of a container is crucial, it might be worthwhile to offer safer alternatives to less experienced
users. For example, we might provide an operation that checks the validity of a new element
before copying it into an important container. Naturally, such checking can only be done with
knowledge of the application types.
In general, if an element of a container is corrupted, subsequent operations on the container
can fail in nasty ways. This is not particular to containers. Any object left in a bad state can
cause subsequent failure.
E.4.6 Predicates
Many standard algorithms and many operations on standard containers rely on predicates that
can be supplied by users. In particular, all associative containers depend on predicates for both
lookup and insertion.
A predicate used by a standard container operation may throw an exception. In that case,
every standard-library operation provides the basic guarantee, and some operations, such as
insert() of a single element, provide the strong guarantee (§E.4.1). If a predicate throws an
exception from an operation on a container, the resulting set of elements in the container may
not be exactly what the user wanted, but it will be a set of valid elements. For example, if ==
throws an exception when invoked from list::unique() (§17.2.2.3), the user cannot assume
that no duplicates are in the list. All the user can safely assume is that every element on the list
is valid (see §E.5.3).
Fortunately, predicates rarely do anything that might throw an exception. However, user-defined <, ==, and != predicates must be taken into account when considering exception safety.
The comparison object of an associative container is copied as part of a swap() (§E.4.3).
Consequently, it is a good idea to ensure that the copy operations of predicates that might be
used as comparison objects do not throw exceptions.
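The strong guarantee for a single-element insert() in the presence of a throwing predicate can be sketched as follows. The comparison object Fragile_less, its fail switch, the type name Table, and the helper try_insert() are hypothetical.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical comparison object that can be switched to throw.
struct Fragile_less {
    static bool fail;
    bool operator()(const std::string& a, const std::string& b) const
    {
        if (fail) throw std::runtime_error("comparison failed");
        return a < b;
    }
};
bool Fragile_less::fail = false;

typedef std::map<std::string, int, Fragile_less> Table;

// Tries to insert (key,value); reports whether insert() completed.
bool try_insert(Table& m, const std::string& key, int value)
{
    try {
        m.insert(std::make_pair(key, value));
        return true;
    }
    catch (const std::runtime_error&) {
        return false;   // single-element insert(): m is unchanged
    }
}
```

Note that copying a Fragile_less cannot throw, so swapping two such maps remains safe even though the comparison itself may fail.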
E.5 The Rest of the Standard Library
The crucial issue in exception safety is to maintain the consistency of objects; that is, we must
maintain the basic invariants for individual objects and the consistency of collections of objects.
In the context of the standard library, the objects for which it is the most difficult to provide
exception safety are the containers. From the point of view of exception safety, the rest of the
standard library is less interesting. However, note that from the perspective of exception safety,
a built-in array is a container that might be corrupted by an unsafe operation.
In general, standard-library functions throw only the exceptions that they are specified to
throw, plus any thrown by user-supplied operations that they may call. In addition, any function that (directly or indirectly) allocates memory can throw an exception to indicate memory
exhaustion (typically, std::bad_alloc).
E.5.1 Strings
The operations on strings can throw a variety of exceptions. However, basic_string manipulates its characters through the functions provided by char_traits (§20.2), and these functions
are not allowed to throw exceptions. That is, the char_traits supplied by the standard library
do not throw exceptions, and no guarantees are made if an operation of a user-defined
char_traits throws an exception. In particular, note that a type used as the element (character)
type for a basic_string is not allowed to have a user-defined copy constructor or a user-defined
copy assignment. This removes a significant potential source of exception throws.
A basic_string is very much like a standard container (§17.5, §20.3). In fact, its elements
constitute a sequence that can be accessed using basic_string<Ch,Tr,A>::iterators and
basic_string<Ch,Tr,A>::const_iterators. Consequently, a string implementation offers the
basic guarantee (§E.2), and the guarantees for erase(), insert(), push_back() and swap()
(§E.4.1) apply to basic_strings. For example, basic_string<Ch,Tr,A>::push_back()
offers the strong guarantee.
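As a small illustration of a string operation that throws, the sketch below asks insert() to work at a position past the end of the string. The function name after_failed_insert() is hypothetical. It shows only the range-check case, in which the string is left untouched; it does not exercise memory exhaustion.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Attempts an insert at an invalid position and returns the string as it
// stands after the resulting exception has been caught.
std::string after_failed_insert()
{
    std::string s = "abc";
    bool threw = false;
    try {
        s.insert(10, "xyz");      // position 10 > s.size(): std::out_of_range
    }
    catch (const std::out_of_range&) {
        threw = true;
    }
    assert(threw);                // the insert above cannot succeed
    return s;                     // unchanged by the failed operation
}
```

The returned string still holds its original three characters, so the caller can carry on using it.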
E.5.2 Streams
If required to do so, iostream functions throw exceptions to signal state changes (§21.3.6). The
semantics of this are well defined and pose no exception-safety problems. If a user-defined
operator<<() or operator>>() throws an exception, it may appear to the user as if the iostream library threw an exception. However, such an exception will not affect the stream state
(§21.3.3). Further operations on the stream may not find the expected data – because the previous operation threw an exception instead of completing normally – but the stream itself is
uncorrupted. As ever after an I/O problem, a clear() may be needed before doing further
reads/writes (§21.3.3, §21.3.5).
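A minimal sketch of a stream that has been asked to throw on failure, and of the recovery with clear(), might look like this. The function name read_after_failure() and the input text are assumptions made for the example; an istringstream stands in for any input stream.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Tries to read an int from non-numeric input, recovers with clear(),
// and returns the next word read from the same stream.
std::string read_after_failure()
{
    std::istringstream in("oops 42");
    in.exceptions(std::ios_base::failbit);   // ask the stream to throw on failure

    int n = 0;
    try {
        in >> n;                             // "oops" is not an int: failbit is set
    }
    catch (const std::ios_base::failure&) {
        in.clear();                          // reset the state before reading again
    }
    assert(in.good());

    std::string word;
    in >> word;                              // the stream itself is uncorrupted
    return word;
}
```

The failed read leaves the offending characters in the stream, so the word read after clear() is the text that could not be converted.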
Like basic_string, the iostreams rely on char_traits to manipulate characters (§20.2.1,
§E.5.1). Thus, an implementation can assume that operations on characters do not throw exceptions, and no guarantees are made if the user violates that assumption.
To allow for crucial optimizations, locales (§D.2) and facets (§D.3) are assumed not to
throw exceptions. If they do, a stream using them could be corrupted. However, the most
likely exception, a std::bad_cast from a use_facet (§D.3.1), can occur only in user-supplied
code outside the standard stream implementation. At worst, this will produce incomplete output
or cause a read to fail rather than corrupt the ostream (or istream) itself.
E.5.3 Algorithms
Aside from uninitialized_copy(), uninitialized_fill(), and uninitialized_fill_n() (§E.4.4),
the standard offers only the basic guarantee (§E.2) for algorithms. That is, provided that user-supplied objects are well behaved, the algorithms will maintain all standard-library invariants
and leak no resources. To avoid undefined behavior, user-supplied operations should always
leave their operands in valid states, and destructors should not throw exceptions.
The algorithms themselves do not throw exceptions. Instead, they report errors and failures
through their return values. For example, search algorithms generally return the end of a
sequence to indicate ‘‘not found’’ (§18.2). Thus, exceptions thrown from a standard algorithm
must originate from a user-supplied operation. That is, the exception must come from an operation on an element – such as a predicate (§18.4), an assignment, or a swap() – or from an allocator (§19.4).
If such an operation throws an exception, the algorithm terminates immediately, and it is up
to the functions that invoked the algorithm to handle the exception. For some algorithms, it is
possible for an exception to occur at a point where the container is not in a state that the user
would consider good. For example, some sorting algorithms temporarily copy elements into a
buffer and later put them back into the container. Such a sort() might copy elements out of the
container (planning to write them back in proper order later), overwrite them, and then throw an
exception. From a user’s point of view, the container was corrupted. However, all elements are
in a valid state, so recovery should be reasonably straightforward.
Note that the standard algorithms access sequences through iterators. That is, the standard
algorithms never operate on containers directly, only on elements in a container. The fact that a
standard algorithm never directly adds or removes elements from a container simplifies the
analysis of the impact of exceptions. Similarly, if a data structure is accessed only through iterators, pointers, and references to const (for example, through a const Rec*), it is usually trivial
to verify that an exception has no undesired effects.
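The behavior of an algorithm interrupted by a user-supplied operation can be sketched with sort() and a comparison that fails after a few calls. Flaky_less and size_after_interrupted_sort() are hypothetical names introduced for this example. The sketch checks only what the basic guarantee promises: the exception reaches the caller, and the sequence still holds the same number of valid elements. It makes no claim about their order or about which values survive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical comparison object that throws after a fixed number of calls.
struct Flaky_less {
    int* calls_left;   // shared counter, so copies of the object stay in step
    bool operator()(int a, int b) const
    {
        if ((*calls_left)-- <= 0) throw std::runtime_error("comparison failed");
        return a < b;
    }
};

// Sorts 20 elements with a comparison that fails on its sixth call and
// returns the size of the vector after the exception has been caught.
std::size_t size_after_interrupted_sort()
{
    std::vector<int> v;
    for (int i = 20; i > 0; --i) v.push_back(i);

    int budget = 5;
    Flaky_less cmp = { &budget };
    bool threw = false;
    try {
        std::sort(v.begin(), v.end(), cmp);
    }
    catch (const std::runtime_error&) {
        threw = true;   // the algorithm stopped at once; we handle the error here
    }
    assert(threw);      // sorting 20 elements needs more than five comparisons
    return v.size();    // sort() rearranges elements; it never adds or removes them
}
```

Because the algorithm works through iterators, the container's size cannot change, which is part of what makes the effect of the exception easy to reason about.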
E.5.4 Valarray and Complex
The numeric functions do not explicitly throw exceptions (Chapter 22). However, valarray
needs to allocate memory and thus might throw std::bad_alloc. Furthermore, valarray or
complex may be given an element type (scalar type) that throws exceptions. As ever, the standard library provides the basic guarantee (§E.2), but no specific guarantees are made about the
effects of a computation terminated by an exception.
Like basic_string (§E.5.1), valarray and complex are allowed to assume that their template
argument type does not have user-defined copy operations so that they can be bitwise copied.
Typically, these standard-library numeric types are optimized for speed, assuming that their element type (scalar type) does not throw exceptions.
E.5.5 The C Standard Library
A standard-library operation without an exception specification may throw exceptions in an
implementation-defined manner. However, functions from the standard C library do not throw
exceptions unless they take a function argument that does. After all, these functions are shared
with C, and C doesn’t have exceptions. An implementation may declare a standard C function
with an empty exception-specification, throw(), to help the compiler generate better code.
Functions such as qsort() and bsearch() (§18.11) take a pointer to function as argument.
They can therefore throw an exception if their arguments can. The basic guarantee (§E.2) covers these functions.
E.6 Implications for Library Users
One way to look at exception safety in the context of the standard library is that we have no
problems unless we create them for ourselves: The library will function correctly as long as
user-supplied operations meet the standard library’s basic requirements (§E.2). In particular, no
exception thrown by a standard container operation will cause memory leaks from containers or
leave a container in an invalid state. Thus, the problem for the library user becomes: How can I
define my types so that they don’t cause undefined behavior or leak resources?
The basic rules are:
[1] When updating an object, don’t destroy its old representation before a new representation is completely constructed and can replace the old one without risk of exceptions.
For example, see the implementations of vector::operator=(), safe_assign(), and
vector::push_back() in §E.3.
[2] Before throwing an exception, release every resource acquired that is not owned by
some (other) object.
[2a] The ‘‘resource acquisition is initialization’’ technique (§14.4) and the language rule
that partially constructed objects are destroyed to the extent that they were constructed (§14.4.1) can be most helpful here. For example, see leak() in §E.2.
[2b] The uninitialized_copy() algorithm and its cousins provide automatic release of
resources in case of failure to complete construction of a set of objects (§E.4.4).
[3] Before throwing an exception, make sure that every operand is in a valid state. That is,
leave each object in a state that allows it to be accessed and destroyed without causing
undefined behavior or an exception to be thrown from a destructor. For example, see
vector’s assignment in §E.3.2.
[3a] Note that constructors are special in that when an exception is thrown from a constructor, no object is left behind to be destroyed later. This implies that we don’t
have to establish an invariant and that we must be sure to release all resources
acquired during a failed construction before throwing an exception.
[3b] Note that destructors are special in that an exception thrown from a destructor
almost certainly leads to violation of invariants and/or calls to terminate().
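Rule [1] can be sketched as follows. This is written in the spirit of the safe_assign() of §E.3 but is not that code: Elem, its fail_copies switch, assign_with_rollback(), and failed_assignment_has_no_effect() are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical element type: copying throws while `fail_copies` is set.
struct Elem {
    int value;
    explicit Elem(int v) : value(v) {}
    Elem(const Elem& e) : value(e.value)
    {
        if (fail_copies) throw std::runtime_error("copy failed");
    }
    Elem& operator=(const Elem& e)
    {
        if (fail_copies) throw std::runtime_error("copy failed");
        value = e.value;
        return *this;
    }
    static bool fail_copies;   // test switch, for illustration only
};
bool Elem::fail_copies = false;

// Build the new representation first; replace the old one only by a
// swap(), which for vector does not throw.
void assign_with_rollback(std::vector<Elem>& dest, const std::vector<Elem>& src)
{
    std::vector<Elem> tmp(src);   // may throw; dest is untouched if it does
    dest.swap(tmp);               // commit
}                                 // tmp's destructor releases the old elements

// Returns true if a failed assignment left the destination exactly as it was.
bool failed_assignment_has_no_effect()
{
    std::vector<Elem> a(2, Elem(1));   // {1, 1}
    std::vector<Elem> b(3, Elem(7));   // {7, 7, 7}

    Elem::fail_copies = true;
    bool threw = false;
    try {
        assign_with_rollback(a, b);
    }
    catch (const std::runtime_error&) {
        threw = true;
    }
    Elem::fail_copies = false;
    assert(threw);

    return a.size() == 2 && a[0].value == 1 && a[1].value == 1;
}
```

The price of this approach is a temporary copy of the whole source, which is the usual trade-off between the strong guarantee and the cheaper basic guarantee.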
In practice, it can be surprisingly difficult to follow these rules. The primary reason is that
exceptions can be thrown from places where people don’t expect them. A good example is
std::bad_alloc. Every function that directly or indirectly uses new or an allocator to acquire
memory can throw bad_alloc. In some programs, we can solve this particular problem by not
running out of memory. However, for programs that are meant to run for a long time or to
accept arbitrary amounts of input, we must expect to handle various failures to acquire
resources. Thus, we must assume that every function is capable of throwing an exception until we
have proved otherwise.
One simple way to try to avoid surprises is to use containers of elements that do not throw
exceptions (such as containers of pointers and containers of simple concrete types) or linked
containers (such as list) that provide the strong guarantee (§E.4). Another, complementary,
approach is to rely primarily on operations, such as push_back(), that offer the strong guarantee that an operation either succeeds or has no effect (§E.2). However, these approaches are by
themselves insufficient to avoid resource leaks and can lead to an ad hoc, overly restrictive, and
pessimistic approach to error handling and recovery. For example, a vector<T*> is trivially
exception safe if operations on T don’t throw exceptions. However, unless the objects pointed
to are deleted somewhere, an exception from the vector will lead to a resource leak. Thus,
introducing a Handle class to deal with deallocation (§25.7) and using vector<Handle<T> >
rather than the plain vector<T*> will probably improve the resilience of the code.
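The effect of storing owning handles rather than plain pointers can be sketched as follows. Here std::unique_ptr, a facility of later standards (C++11), stands in for the Handle class of §25.7; that substitution is an assumption of this example, and Widget, build_then_fail(), and widgets_leaked() are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical resource type that counts its live instances.
struct Widget {
    static int live;
    Widget()  { ++live; }
    ~Widget() { --live; }
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;
};
int Widget::live = 0;

// Fills a vector of owning handles, then fails part-way through its work.
void build_then_fail()
{
    std::vector<std::unique_ptr<Widget> > v;
    for (int i = 0; i < 3; ++i)
        v.push_back(std::unique_ptr<Widget>(new Widget));
    assert(Widget::live == 3);
    throw std::runtime_error("failure after acquiring resources");
}   // stack unwinding destroys v, and each handle deletes its Widget

// Returns the number of Widgets still alive after the failure.
int widgets_leaked()
{
    try {
        build_then_fail();
    }
    catch (const std::runtime_error&) {
        // nothing to clean up by hand
    }
    return Widget::live;
}
```

With a vector<Widget*> in place of the vector of handles, the same exception would leave three Widgets allocated and unreachable.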
When writing new code, it is possible to take a more systematic approach and make sure
that every resource is represented by a class with an invariant that provides the basic guarantee
(§E.2). Given that, it becomes feasible to identify the critical objects in an application and provide roll-back semantics (that is, the strong guarantee – possibly under some specific conditions) for operations on such objects.
Most applications contain data structures and code that are not written with exception safety
in mind. Where necessary, such code can be fitted into an exception-safe framework by either
verifying that it doesn’t throw exceptions (as was the case for the C standard library; §E.5.5) or
through the use of interface classes for which the exception behavior and resource management
can be precisely specified.
When designing types intended for use in an exception-safe environment, we must pay special attention to the operations used by the standard library: constructors, destructors, assignments, comparisons, swap functions, functions used as predicates, and operations on iterators.
This is best done by defining a class invariant that can be simply established by all constructors.
Sometimes, we must design our class invariants so that we can put an object into a state where
it can be destroyed even when an operation suffers a failure at an ‘‘inconvenient’’ point. Ideally, that state isn’t an artifact defined simply to aid exception handling, but a state that follows
naturally from the semantics of the type (§E.3.5).
When considering exception safety, the emphasis should be on defining valid states for
objects (invariants) and on proper release of resources. It is therefore important to represent
resources directly as classes. The vector_base (§E.3.2) is a simple example of this. The constructors for such resource classes acquire lower-level resources (such as the raw memory for
vector_base) and establish invariants (such as the proper initialization of the pointers of a
vector_base). The destructors of such classes implicitly free lower-level resources. The rules
for partial construction (§14.4.1) and the ‘‘resource acquisition is initialization’’ technique
(§14.4) support this way of handling resources.
A well-written constructor establishes the class invariant for an object (§24.3.7.1). That is,
the constructor gives the object a value that allows subsequent operations to be written simply
and to complete successfully. This implies that a constructor often needs to acquire resources.
If that cannot be done, the constructor can throw an exception so that we can deal with that
problem before an object is created. This approach is directly supported by the language and
the standard library (§E.3.5).
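The interplay of resource-acquiring constructors and the rules for partial construction can be sketched as follows. Resource and Session are hypothetical classes introduced for the illustration, and the example assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource: counts acquisitions and can be told to fail.
struct Resource {
    static int held;
    explicit Resource(bool fail)
    {
        if (fail) throw std::runtime_error("cannot acquire resource");
        ++held;
    }
    ~Resource() { --held; }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
};
int Resource::held = 0;

// A class whose invariant is "both resources are held".
class Session {
public:
    explicit Session(bool second_fails)
        : first(false), second(second_fails)
    {
        assert(Resource::held == 2);   // invariant established
    }
private:
    Resource first;    // constructed first, destroyed last
    Resource second;
};

// Returns the number of resources still held after a failed construction.
int resources_held_after_failed_construction()
{
    try {
        Session s(true);     // the second member's constructor throws
    }
    catch (const std::runtime_error&) {
        // No Session object exists here; `first` has already been destroyed.
    }
    return Resource::held;
}
```

No init() function and no explicit cleanup code are needed: the member that was fully constructed is destroyed automatically, and the caller never sees a half-built Session.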
The requirement to release resources and to place operands in valid states before throwing
an exception means that the burden of exception handling is shared among the function throwing, the functions on the call chain to the handler, and the handler. Throwing an exception does
not make handling an error ‘‘somebody else’s problem.’’ It is the obligation of functions
throwing or passing along an exception to release resources that they own and to put operands
in consistent states. Unless they do that, an exception handler can do little more than try to terminate gracefully.
E.7 Advice
[1] Be clear about what degree of exception safety you want; §E.2.
[2] Exception safety should be part of an overall strategy for fault tolerance; §E.2.
[3] Provide the basic guarantee for all classes. That is, maintain an invariant, and don’t leak
resources; §E.2, §E.3.2, §E.4.
[4] Where possible and affordable, provide the strong guarantee that an operation either succeeds or leaves all operands unchanged; §E.2, §E.3.
[5] Don’t throw an exception from a destructor; §E.2, §E.3.2, §E.4.
[6] Don’t throw an exception from an iterator navigating a valid sequence; §E.4.1, §E.4.4.
[7] Exception safety involves careful examination of individual operations; §E.3.
[8] Design templates to be transparent to exceptions; §E.3.1.
[9] Prefer the constructor approach to resource requisition to using init() functions; §E.3.5.
[10] Define an invariant for a class to make it clear what is a valid state; §E.2, §E.6.
[11] Make sure that an object can always be put into a valid state without fear of an exception
being thrown; §E.3.2, §E.6.
[12] Keep invariants simple; §E.3.5.
[13] Leave all operands in valid states before throwing an exception; §E.2, §E.6.
[14] Avoid resource leaks; §E.2, §E.3.1, §E.6.
[15] Represent resources directly; §E.3.2, §E.6.
[16] Remember that swap() can sometimes be an alternative to copying elements; §E.3.3.
[17] Where possible, rely on ordering of operations rather than on explicit use of try-blocks;
§E.3.4.
[18] Don’t destroy ‘‘old’’ information until its replacement has been safely produced; §E.3.3,
§E.6.
[19] Rely on the ‘‘resource acquisition is initialization’’ technique; §E.3, §E.3.2, §E.6.
[20] Make sure that comparison operations for associative containers can be copied; §E.3.3.
[21] Identify critical data structures and provide them with operations that provide the strong
guarantee; §E.6.
E.8 Exercises
1. (∗1) List all exceptions that could possibly be thrown from f() in §E.1.
2. (∗1) Answer the questions after the example in §E.1.
3. (∗1) Define a class Tester that occasionally throws exceptions from basic operations, such
as copy constructors. Use Tester to test your standard-library containers.
4. (∗1) Find the error in the ‘‘messy’’ version of vector’s constructor (§E.3.1), and write a
program to get it to crash. Hint: First implement vector’s destructor.
5. (∗2) Implement a simple list providing the basic guarantee. Be very specific about what
the list requires of its users to provide the guarantee.
6. (∗3) Implement a simple list providing the strong guarantee. Carefully test this list. Give
an argument why people should believe it to be safe.
7. (∗2.5) Reimplement String from §11.12 to be as safe as a standard container.
8. (∗2) Compare the run time of the various versions of vector’s assignment and
safe_assign() (§E.3.3).
9. (∗1.5) Copy an allocator without using an assignment operator (as needed to improve
operator=() in §E.3.3).
10. (∗2) Add single-element and multiple-element erase() and insert() that provide the
basic guarantee to vector (§E.3.2).
11. (∗2) Add single-element and multiple-element erase() and insert() that provide the
strong guarantee to vector (§E.3.2). Compare the cost and complexity of these solutions to
the solutions to exercise 10.
12. (∗2) Write a safe_insert() (§E.4.2) that inserts elements into the existing vector (rather
than copying to a temporary). What constraints do you have to impose on operations?
13. (∗2) Write a safe_insert() (§E.4.2) that inserts elements into the existing map (rather than
copying to a temporary). What constraints do you have to impose on operations?
14. (∗2.5) Compare the size, complexity, and performance of the safe_insert() functions
from exercises 12 and 13 to the safe_insert() from §E.4.2.
15. (∗2.5) Write a better (simpler and faster) safe_insert() for associative containers only.
Use traits to write a safe_insert() that automatically selects the optimal safe_insert() for
a container. Hint: §19.2.3.
16. (∗2.5) Try to rewrite uninitialized_fill() (§19.4.4, §E.3.1) to handle destructors that
throw exceptions. Is that possible? If so, at what cost? If not, why not?
17. (∗2.5) Try to rewrite uninitialized_fill() (§19.4.4, §E.3.1) to handle iterators that throw
exceptions for ++ and --. Is that possible? If so, at what cost? If not, why not?
18. (∗3) Take a container from a library different from the standard library. Examine its documentation to see what exception-safety guarantees it provides. Do some tests to see how
resilient it is against exceptions thrown by memory allocation and user-supplied code.
Compare it to a corresponding standard-library container.
19. (∗3) Try to optimize the vector from §E.3 by disregarding the possibility of exceptions.
For example, remove all try-blocks. Compare the performance against the version from
§E.3 and against a standard-library vector implementation. Also, compare the size and the
complexity of the source code of these different vectors.
20. (∗1) Define invariants for vector (§E.3) with and without the possibility of v==0 (§E.3.5).
21. (∗2.5) Read the source of an implementation of vector. What guarantees are implemented
for assignment, multi-element insert(), and resize()?
22. (∗3) Write a version of hash_map (§17.6) that is as safe as a standard container.
Index
Is there another word for synonym?
– anon
## 162
** 263
-1 831
->* 853
.* 853
16-bit character 580
7-bit char 580
8-bit char 580
bitset 494
,
and [] 838
operator 123
predefined 264
prohibiting 264
!
for basic_ios 616
logical_not 516
valarray 664
!=
bitset 494
complex 680
generated 468
iterator 551
not_equal_to 516
string 591
valarray 667
#, preprocessing directive 813
$ character 81
%
modulus 517
valarray 667
%: digraph 829
%:%: digraph 829
%=, valarray 664
%> digraph 829
&
bitset 495
bitwise and operator 124
predefined 264
prohibiting 264
valarray 667
&&
logical and operator 123
logical_and 516
valarray 667
&=
of bitset 494
valarray 664
’, character literal 73
*
and [], -> and 290
complex 680
iterator 551
multiplies 517
valarray 667
*=
complex 679
valarray 664
+
complex 680
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
iterator 551
plus 517
string 593
user-defined 265
user-defined operator 281
valarray 667
++
increment operator 125
iterator 551
user-defined operator 264, 291
+=
advance() and 551
complex 679
iterator 551
operator 109
string 592
user-defined operator 264, 268, 281
valarray 664
-
complex 680
distance() and 551, 554
iterator 551
minus 517
negate 517
valarray 664, 667
--
decrement operator 125
iterator 551
user-defined operator 291
-=
complex 679
iterator 551
operator 109
valarray 664
->
and * and [] 290
iterator 551
member access operator 102
user-defined operator 289
->*, pointer to member 418
.
floating-point 74
member access operator 101
.*, pointer to member 418
..., ellipsis 154
/
complex 680
divides 517
valarray 667
/* comment 161
//
comment 10
difference from C 816
/=
complex 679
valarray 664
::
and virtual function, operator 312
explicit qualification 847
namespace and 169
operator 305
scope resolution operator 82, 228
::*, pointer to member 418
:> digraph 829
<
comparison 467
iterator 551
less 516
string 591
template syntax 811
valarray 667
vector 457
<% digraph 829
<: digraph 829
<<
bitset 494
bitset 495
complex 680
for output why 607
inserter 608
of char 611
of complex 612
of pointer to function 631
of streambuf 642
ostream 609
output cout 46
output operator 607
precedence 608
put to 607
string 598
valarray 667
virtual 612
<<=
of bitset 494
valarray 664
<=
generated 468
iterator 551
less_equal 516
string 591
valarray 667
=
map 484
predefined 264
prohibiting 264
string 587
user-defined operator 281
valarray 663
vector 447
==
bitset 494
complex 680
equal_to 516
equality without 468
iterator 551
string 591
user-defined 534
valarray 667
vector 457
>
and >> 812
generated 468
greater 516
iterator 551
string 591
valarray 667
>=
generated 468
greater_equal 516
iterator 551
string 591
valarray 667
>>
> and 812
bitset 494
bitset 495
char 615
complex 680
extractor 608
get from 607
input cin 50, 112
istream 614
of complex 621
of pointer to function 632
string 598
valarray 667
>>=
of bitset 494
valarray 664
?:, arithmetic-if 134
[]
, and 838
-> and * and 290
and insert() 488
bitset 494
design of 295
iterator 551
map 482
of vector 445
on string 584
valarray 663
\
backslash 830
escape character 73, 830
\’, single quote 830
^
bitset 495
bitwise exclusive or operator 124
valarray 667
^=
of bitset 494
valarray 664
_ character 81
|
bitset 495
bitwise or operator 124
valarray 667
|=
of bitset 494
valarray 664
||
logical or operator 123
logical_or 516
valarray 667
~, valarray 664
0
constant-expression 835
false and 71
null integer value 830
string and 587
zero null 88
-1 and size_t 448
1, true and 71
A
Aarhus 536
abort() 218, 380
abs() 660– 661, 680
valarray 667
abstract
and concrete type 771
class 708
class 313
class and design 318
class, class hierarchy and 324
iterator 435
node class 774
type 34, 767, 769
abstraction
classes and 733
data 30
abstraction, late 437
abstraction, levels of 733
access 278
checked 445
control 225, 402
control and base class 405
control and multiple-inheritance 406
control, cast and 414
control, run-time 785
control, using-declaration and 407
element 445
operator, design of 295
to base 850
to member 849
to member class 851
unchecked 445
accumulate() 682
acos(), valarray 667
acquisition, resource 364
action 776
Ada 10, 725
adapter
member function 520
pointer to function 521
adapters, container 469
add element to sequence 529
adding
to container 555
to sequence 555
to standard library 434
address of element 454
addressing, unit of 88
adjacent_difference() 684
adjacent_find() 525
adjustfield 626, 630
adoption of C++, gradual 718
advance() and += 551
aims
and means 694
design 700
Algol68 10
algorithm 56
C-style function and 522
and member function 520
and polymorphic object 63
and polymorphism 520
and sequence 508
and string 584
container and 507
conventions 508
design 510
exception container 566
generalized numeric 682
generic 41
modifying sequence 529
nonmodifying sequence 523
on array 544
return value 508
summary 509
<algorithm> 432, 509
algorithms, standard library 64
alias, namespace 178
alignment 102
all, catch 362
allocate array 128
allocate() 567
allocation
C-style 577
and de-allocation 127
unit of 88
allocator 567
Pool_alloc 572
general 573
nothrow 823
use of 568
user-defined 570
allocator 567
allocator_type 443, 480
alternative
design 710
error handling 192, 355
implementation 320
interface 173
return 357
to macro 161
ambiguity
dynamic_cast and 412
resolution, multiple-inheritance 391
ambiguous type conversion 276
ambition 693
analogy
bridge 724
car factory 698
plug 728
proof by 692
units 728
analysis
design and 696
error 711
experimentation and 710
method, choosing an 696
stage 697
and C-style string, string 579
and
keyword 829
operator &, bitwise 124
operator &&, logical 123
and_eq keyword 829
Annemarie 92
anomaly, constructor and destructor 245
anonymous union 841
ANSI
C 13
C++ 11
any() 494
app append to file 639
append to file, app 639
append(), string 592
application framework 731, 786
apply() to valarray 664
architecture 696
arg() 680
argc, main() argv 117
argument
array 147
command line 117
deducing template 335, 855
default 153
depend on template 861
explicit template 335
function template 335
passing 283
passing, function 145
reference 98
template 331
type check, function 145
type conversion, function 145
type, difference from C 817
types, virtual function 310
undeclared 154
value, example of default 227
argv argc, main() 117
arithmetic
conversions, usual 122
conversions, usual 836
function object 517
mixed-mode 268
pointer 88, 93, 125
type 70
vector 65, 662
arithmetic-if ?: 134
array 26, 88
algorithm on 544
allocate 128
argument 147
array of 837
as container 496
assignment 92
associative 286, 480
by string, initialization of 89
deallocate 128
delete 250
element, constructor for 250
element object 244
initializer 89
initializer, difference from C 818
layout 669
multidimensional 668, 677, 836
new and 423
of array 837
of objects 250
passing multidimensional 839
pointer and 91, 147
string and 589
valarray and 663
valarray and vector and 662
arrays, numeric 662
ASCII 580, 829
character set 73, 601
asin() 660
valarray 667
asm assembler 806
assembler 8, 10
asm 806
Assert() 750
assert() 750
<assert.h> 432
assertion checking 750
assign()
char_traits 581
string 588
vector 447
assignment
and initialization 283
array 92
copy 246, 283
function call and 99
map 484
of class object 245
operator 110, 268
string 587
to self 246
valarray 663
Assoc example 286
associative
array 286, 480
array – see map
container 480
container, sequence and 461
associativity of operator 121
asynchronous event 357
at() 53, 445
on string 585
out_of_range and 385
atan() 660
valarray 667
atan2() 660
valarray 667
ate 639
atexit()
and destructors 218
and exception 382
atof() 600
atoi() 589, 600
atol() 600
AT&T Bell Laboratories 11
auto 843
automatic
garbage collection 247, 844
memory 843
memory management 844
object 244
auto_ptr 367
B
\b, backspace 830
back() 445
of queue 476
back_inserter() 57, 555
back_insert_iterator 555
backslash \ 830
backspace \b 830
bad() 616
bad_alloc 129
and new 384
exception 576
missing 823
badbit 617
bad_cast 410
and dynamic_cast 384
bad_exception 378, 384
bad_typeid and typeid() 384
balance 695
base
access to 850
and derived class 39, 737
class 303
class, access control and 405
class, initialization of 306
class, overriding from virtual 401
class, private 743
class, private member of 305
class, protected 743
class, replicated 394
class, universal 438
class, virtual 396
member or 740
override private 738
private 405, 742
protected 319, 405
basefield 626– 627
Basic 725
basic_filebuf, class 648
basic_ios 608, 616, 622, 629
! for 616
format state 606
stream state 606
basic_iostream 637
formatting 606
basic_istream 613
basic_ofstream 638
basic_ostream 608– 609
basic_streambuf 645
buffering 606
basic_string
begin() 584
end() 584
rbegin() 584
rend() 584
basic_string 582
const_iterator 583
const_pointer 583
const_reference 583
const_reverse_iterator 583
difference_type 583
iterator 583
member type 582
pointer 583
reference 583
reverse_iterator 583
size_type 583
traits_type 583
value_type 583
basic_stringstream 640
BCPL 10
before() 415
beg, seekdir and
begin() 54, 481
basic_string 584
iterator 444
behavior, undefined 828
Bell Laboratories, AT&T 11
Bi 511
bibliography, design 719
bidirectional iterator 550
bidirectional_iterator_tag 553
big-O notation 464
binary
mode, binary 639
operator, user-defined 263
search 540, 546
binary binary mode 639
binary_function 515
binary_negate 518
not2() and 522
binary_search() 540
bind1st() 518
and binder1st 520
bind2nd() 518
binder1st 518
bind1st() and 520
binder2nd 518– 519
binding
name 860
strength, operator 121, 607
BinOp 511
BinPred 511
bit
field 125, 840
field, bitset and 492
pattern 73
position 492
reference to 492
vector 124
bitand keyword 829
bitor keyword 829
The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T.
Published by Addison Wesley Longman, Inc. ISBN 0-201-88954-4. All rights reserved.
bits
in char 658
in float 658
in int 658
<bitset> 431
bitset 492
494
!= 494
& 495
&= of 494
<< 495
<< 494
<<= of 494
== 494
>> 495
>> 494
>>= of 494
[] 494
^ 495
^= of 494
and bit field 492
and enum 492
and set 492
and vector<bool> 492
constructor 493
flip() 494
input 495
operation 494
output 495
reset() 494
set() 494
| 495
|= of 494
bitset(), invalid_argument and 385
bitwise
and operator & 124
exclusive or operator ^ 124
logical operators 124
or operator | 124
blackboard as design tool 703
BLAS 668
bool 71
conversion to 835
input of 615
output of 610
vector of 458
boolalpha 610, 625
boolalpha() 633
break 109, 116
case and 134
statement 116
bridge analogy 724
bsearch() 546
buffer
memory 575
ostream and 642
position in 642
Buffer 331, 335
example 738
buffering 642
I/O 645
basic_streambuf 606
built-in
feature vs technique 43
type 70
type, constructor for 131
type, input of 614
type, output of 609
type, user-defined operator and 265
by
reference, catch 360
value, catch 359
byte 76
C
C
//, difference from 816
ANSI 13
C++ and 13–14, 199
and C++ 7
and C++ compatibility 816
and C++, mixing 719
and exception 383
and, learning 7
argument type, difference from 817
array initializer, difference from 818
declaration and definition, difference from 818
difference from 816
enum, difference from 817
function call, difference from 816
function definition, difference from 817
initialization and goto, difference from 818
input and output 651
int implicit, difference from 817
jump past initialization, difference from 818
linkage to 205
macros, difference from 818
programmer 14
scope, difference from 816
sizeof, difference from 816
standard library 599
struct name, difference from 818
struct scope, difference from 818
void* assignment, difference from 818
with Classes 10
with classes 686
C++ 21
.c file 202
cache example 232
calculator example 107, 165, 190, 208
call
by reference 98, 146
by value 146
function 145
of destructor, explicit 256
callback 371
stream 650
call-by reference 282
callC() example 384
call_from_C() example 384
calloc() 577
capacity() 455
car factory analogy 698
Car example 772
card, CRC 703
c_array 496
carriage return \r 830
CASE 711, 725, 730
case and break 134
case-sensitive compare 591
<cassert> 432
cast
C-style 131
and access control 414
cross 408
deprecated C-style 819
down 408
up 408
casting away const 414
catch all 362
catch 186, 361
all 362
by reference 360
by value 359
every exception 54
catch(...) 54
category, iterator 553
<cctype> 432, 601
ceil() 660
cerr 609
and clog 624
initialization 637
<cerrno> 432
<cfloat> 433, 660
C-function and exception 382
C++ 10
ANSI 11
C and 7
ISO 11
and C 13–14, 199
compatibility, C and 816
design of 7, 10
feature summary 819
functional decomposition and 726
gradual adoption of 718
gradual approach to learning 7
introducing 718
large programs and 9
learning 6, 718, 820
library, first 686
misuse of 725
mixing C and 719
procedural programming and 725
programmer 14
properties of 724
standardization 11
style subscript 674
teaching and 12
use of 12
change 700
incremental 684
response to 698
size of sequence 529
changing interface 774
char 73, 76
7-bit 580
8-bit 580
<< of 611
>> 615
bits in 658
character type 71
get() 620
input 618
input of 615
output of 610
signed 831
unsigned 831
char*, specialization and 344
character 580
$ 81
16-bit 580
\, escape 73, 830
_ 81
buffer, streambuf and 642
classification, wide 601
literal ’ 73
name, universal 831
national 829
set 829
set, ASCII 73, 601
set, large 831
set, restricted 829
special 830
string 432
traits 580
type 580
type char 71
value of 580
characters in name 81
CHAR_BIT 660
char_traits 580
assign() 581
char_type 580
compare() 581
copy() 581
eof() 581
eq() 581
eq_int_type() 581
find() 581
get_state() 581
int_type() 581
length() 581
lt() 581
move() 581
not_eof() 581
off_type 581
pos_type 581
state_type 581
to_char_type() 581
to_int_type() 581
char_traits<char> 580
char_traits<wchar_t> 581
char_type 608
char_traits 580
check, range 445, 561
checked
access 445
iterator 561
pointer 291
Checked example 565
Checked_iter example 561
checking
assertion 750
for, wild pointer 722
invariant 749
checking, late, missing 823
checking
of exception-specification 376
range 275, 781
choosing
a design method 696
an analysis method 696
cin 614
>>, input 50, 112
cout and 624
initialization 637
value of 276
circle and ellipse 703
class
abstract 313
and design, abstract 318
and type 724
base 303
basic_filebuf 648
concept and 301
constructor for derived 306
conversion of pointer to 304
derived 15, 303
destructor for derived 306
forward reference to 278
friend 279
function 776
handle 782
hierarchy 15, 307, 734
hierarchy and abstract class 324
hierarchy and template 345
hierarchy design 314
hierarchy, reorganization of 707
initialization of base 306
interface 778
member, constructor for 247
member of derived 304
member, private 225
member, public 225
node 772
object, assignment of 245
operations, set of 237
overriding from virtual base 401
pointer to 304
private base 743
private member of base 305
protected base 743
storage 244
use of 725
class 16, 32
abstract 708
abstract node 774
access to member 851
and concept 223
base and derived 39, 737
concrete 236, 241, 766
concrete node 774
declaration 225
definition 225
free-standing 732
function-like 514
helper 293
hierarchy 38, 389
hierarchy navigation 411
kind of 765
lattice 389
leaf 774
member 293
namespace and 849
nested 293
not a 705
random number 685
string 292
struct and 234
template and 348
union and 843
universal base 438
user-defined type 224
classes
and abstraction 733
and concepts 732
and real-world 734
design and 732
find the 702
finding the 734
stream 637
use of 733
classic() locale 649
classification 703
cleanup, initialization and 364
clear goal 698
clear() , 487, 616
failure and 385
<climits> 433, 660
<clocale> 433, 650
Clock example 398
clog 609
cerr and 624
initialization 637
clone 424
clone() 426
close() 639
closing
of file 638
of stream 639
closure 676
cloud example 700
Clu 10
Club_eq 516
<cmath> 434, 660
Cmp 339, 511
Cobol 725
code
bloat, curbing 342
uniformity of 767
coders and designers 695
coercion 267
collaboration, design 708
collating sequence 338
collector
conservative 846
copying 846
comma and subscripting 838
command line argument 117
comment 138
/* 161
// 10
common
code and constructor 624
code and destructor 624
commonality 301
communication 694–695, 717
compare, case-sensitive 591
compare()
char_traits 581
string 590
comparison
< 467
default 467
equality and 457
in map 484
requirement 467
string 590
user-supplied 467
compatibility 13
C and C++ 816
compilation
separate 27, 198
template separate 351
unit of 197
compile time, header and 211
compile-time polymorphism 347
compl keyword 829
complete encapsulation 283
complex 64, 267
!= 680
* 680
*= 679
+ 680
+= 679
- 680
-= 679
/ 680
/= 679
<< 680
<< of 612
== 680
>> 680
>> of 621
and complex<> 680
conversion 681
cos() 680
cosh() 680
exp() 680
input 680
log() 680
log10() 680
mathematical functions 680
operations 679
output 680
pow() 680
sin() 680
sinh() 680
sqrt() 680
tanh() 680
complex<>, complex and 680
complexity divide and conquer 693
component 701, 755
standard 698, 714
composite operator 268
composition of namespace 181
compositor 677
computation, numerical 64
concatenation, string 592–593
concept 15
and class 301
class and 223
independent 327
concepts, classes and 732
concrete
class 236, 241, 766
class, derive from 780
node class 774
type 33, 236, 766–767
type, abstract and 771
type and derivation 768
type, problems with 37
type, reuse of 241
type, reuse of 768
condition 753
declaration in 135
conditional expression 134
conj() 680
connection between input and output 623
const 94
C-style string and 90
and linkage 199
and overloading 600
casting away 414
function, inspector 706
iterator 443
member 249
member function 229
physical and logical 231
pointer 96
pointer to 96
constant
expression 833
in-class definition of 249
member 249
time 464
constant-expression 0 835
const_cast 131, 232
const_iterator 54, 443, 480
basic_string 583
const_mem_fun1_ref_t 518, 521
const_mem_fun1_t 518, 521
const_mem_fun_ref_t 518, 521
const_mem_fun_t 518, 521
const_pointer 443
basic_string 583
const_reference 443, 480
basic_string 583
const_reverse_iterator 443, 480
basic_string 583
construct() 567
construction
and destruction 244
and destruction, order of 414
order of 248, 252
valarray 662
constructor 32–33, 226, 706
and C-style initialization 270
and conversion 272
and destructor 242, 246–247
and destructor anomaly 245
and initializer list 270
and template, copy 348
and type conversion 269, 275
and union 257
and virtual base 397
bitset 493
common code and 624
copy 246, 283
default 243
default copy 271
exception in 367
exceptions in 371
explicit 284
for array element 250
for built-in type 131
for class member 247
for derived class 306
for free store object 246
for global variable 252
for local variable 245
map 484
pointer to 424
string 585
vector 447
virtual 323, 424
constructors, exceptions and 366
cont iterator 508
container 40, 52
STL 441
Simula-style 438
Smalltalk-style 438
adapters 469
adding to 555
algorithm, exception 566
and algorithm 507
and iterator 435, 444
and polymorphism 520
array as 496
associative 480
container, based 438
container
design 434, 441
implementation of 465
input into 451
intrusive 438
iterator 464
kind of 461
memory management 455, 567
operation on 464
optimal 435
representation of 465
sequence and 512
standard library 56
standard-library 442
string as 491
summary 464
user-defined 497
valarray as 492
containers 431
containment 738
and inheritance 740
context
of template definition 860
of template instantiation 860
continue 116
statement 116
contravariance 420
control, format 625
controlled statement 136
convenience
and orthogonality 431
vs. safety 847
conventions
algorithm 508
lexical 794
national 649
conversion 706
ambiguous type 276
complex 681
constructor and 272
constructor and type 269, 275
explicit 284
floating-point 834
implicit 275, 281, 284
implicit type 76, 276, 833
integer 834
of pointer to class 304
of string, implicit 590
operator, type 275
pointer 834
signed unsigned integer 834
string 589
to bool 835
to floating-point 835
to integer type 835
to integral 835
undefined enum 77
user-defined 347
user-defined pointer 349
user-defined type 267, 281
conversions 747
usual arithmetic 122
conversions, usual, arithmetic 836
cookbook method 692
copy 229, 245, 271
assignment 246, 283
constructor 246, 283
constructor and template 348
constructor, default 271
delayed 295
memberwise 283
of exception 362
requirement 466
copy() 42, 529, 589
char_traits 581
_copy suffix 533
copy_backward() 529
copyfmt() 627
copyfmt_event 651
copyfmt_event, copyfmt() 651
copy_if() not standard 530
copying, elimination of 675
copy-on-write 295
cos() 660
complex 680
valarray 667
cosh() 660
complex 680
valarray 667
cost of exception 381
count() 57, 494, 526
in map 485
count_if() 62, 526
counting, reference 783
coupling, efficiency and 768
cout 609
<<, output 46
and cin 624
initialization 637
Cowboy example 778
__cplusplus 206
CRC card 703
create dependency 745
creation
localization of object 322
object 242
criteria
sorting 534
standard library 430
cross cast 408
<csetjmp> 433
cshift() 664
<csignal> 433
<cstdarg> 433
<cstdio> 202, 432
<cstdlib> 219, 432, 434, 546, 577, 600, 661
c_str() 589
<cstring> 432, 577, 599
C-style
allocation 577
cast 131
cast, deprecated 819
error handling 661
function and algorithm 522
initialization, constructor and 270
string and const 90
string, string and 579
string, string and 589
<ctime> 431, 433
<ctype.h> 432, 601
cur, seekdir and
curbing code bloat 342
Currying 520
<cwchar> 432
<cwctype> 432, 601
<wctype.h> 601
cycle, development 698
D
data
abstraction 30
abstraction vs inheritance 727
member, pointer to 853
per-object 573
per-type 573
data() 589
date, format of 649
Date example 236
DBL_MIN_EXP 660
deallocate array 128
deallocate() 567
de-allocation, allocation and 127
debugging 226
dec 626–627, 634
decimal 73
output 626
decision, delaying 706
declaration 23, 78–79
and definition, difference from C 818
and definition, namespace member 167
class 225
friend 279
function 143
in condition 135
in for statement 137
of member class, forward 293
point of 82
declaration 803
declarations, keeping consistent 201
declarator operator 80
declarator 807
decomposition, functional 725
decrement
increment and 291
operator -- 125
deducing template argument 335, 855
default
argument 153
argument value, example of 227
comparison 467
constructor 243
copy constructor 271
template argument 340, 824
value 239
value, supplying 500
default 109
#define 160
definition 78
class 225
context of template 860
difference from C declaration and 818
function 144
in-class 235
namespace member declaration and 167
of constant, in-class 249
of virtual function 310
point of 861
using-directive and 180
delayed copy 295
delaying decision 706
delegation 290
delete
element from sequence 529, 534
from hash_map 501
delete
and garbage collection 845
array 250
delete[] and 250
operator 127
size and 421
delete(), operator 129, 576
delete[] 128
and delete 250
delete[](), operator 423, 576
delete_ptr() example 531
denorm_min() 659
depend on template argument 861
dependencies 724
dependency 15
create 745
inheritance 737
minimize 173
use 745
deprecated
C-style cast 819
static 818
<deque> 431
deque, double-ended queue 474
derivation, concrete type and 768
derive
from concrete class 780
without virtual 780
derived
and friend 852
class 15, 303
class, base and 39, 737
class, constructor for 306
class, destructor for 306
class, member of 304
exceptions 359
design 696
I/O 605
abstract class and 318
aims 700
algorithm 510
alternative 710
and analysis 696
and classes 732
and language 724
and language, gap between 725
and programming 692
bibliography 719
class hierarchy 314
collaboration 708
container 434, 441
error 711
for testing 712
how to start a 708
hybrid 718
inheritance and 707
integrity of 716
language and programming language 730
method 694
method, choosing a 696
object-oriented 692, 726
of C++ 7, 10
of [] 295
of access operator 295
reuse 709
stability of 708
stage 697
standard library 429–430
steps 701
string 579
template in 757
tool, blackboard as 703
tool, presentation as 704
tool, tutorial as 708
tools 711
unit of 755
designers, coders and 695
destroy() 567
destruction
construction and 244
order of construction and 414
destructor 33, 283
and garbage collection 846
and union 257
anomaly, constructor and 245
common code and 624
constructor and 242, 246–247
exception in 373
explicit call of 256
for derived class 306
virtual 319
destructors
atexit() and 218
exceptions and 366
development
cycle 698
process 696
software 692
stage 697
diagnostics 432
diamond-shaped inheritance 399
dictionary 480
– see map
difference
from C 816
from C // 816
from C argument type 817
from C array initializer 818
from C declaration and definition 818
from C enum 817
from C function call 816
from C function definition 817
from C initialization and goto 818
from C int implicit 817
from C jump past initialization 818
from C macros 818
from C scope 816
from C sizeof 816
from C struct name 818
from C struct scope 818
from C void* assignment 818
difference_type 443, 480, 552
basic_string 583
digits 658
digits10 659
digraph
%: 829
%:%: 829
%> 829
:> 829
<% 829
<: 829
direct manipulation 730
directed acyclic graph 308
direction
of seek, seekdir
of seekg()
of seekp()
directive
#, preprocessing 813
template instantiation 866
discrimination of exception 188
disguised pointer 844
dispatch, double 326
distance() and - 551, 554
distribution
exponential 685
uniform 685
div() 661
divide and conquer, complexity 693
divides / 517
div_t 661
do statement 114, 137
documentation 714–715
do_it() example 777
domain error 661
dominance 401
Donald Knuth 713
dot product 684
double
dispatch 326
quote 830
double 74
output 626
double-ended queue deque 474
doubly-linked list 470
down cast 408
draw_all() example 520
Duff’s device 141
dynamic
memory 127, 576, 843
store 34
type checking 727
type checking, mis-use of 439
dynamic_cast 407–408
and ambiguity 412
and polymorphism 409
and static_cast 413
bad_cast and 384
implementation of 409
to reference 410
use of 774
E
eatwhite() 620
eback() 645
EDOM 661
efficiency 8, 713
and coupling 768
and generality 431
of operation 464
egptr() 645
element
access 445
access, list 472
access, map 482
address of 454
constructor for array 250
first 445
from sequence, delete 529, 534
last 445
object, array 244
requirements for 466
to sequence, add 529
eliminate_duplicates() example 534
eliminating programmers 730
elimination
of copying 675
of temporary 675
ellipse, circle and 703
ellipsis ... 154
else 134
emphasis, examples and 5
Employee example 302
empty string 585
empty() 455, 489
string 598
encapsulation 754
complete 283
end, seekdir and
end() 54, 481
basic_string 584
iterator 444
#endif 162
endl 634
ends 634
engineering, viewgraph 704
enum 76
and integer 77
bitset and 492
conversion, undefined 77
difference from C 817
member 249
sizeof 78
user-defined operator and 265
enumeration 76
switch on 77
enumerator 76
EOF 620, 653
eof() 616
char_traits 581
eofbit 617
epptr() 645
epsilon() 659
eq(), char_traits 581
eq_int_type(), char_traits 581
equal() 527
equality
and comparison 457
hash_map 497
without == 468
equal_range() 540
in map 485
equal_to == 516
equivalence, type 104
Erand 685
ERANGE 601, 661
erase()
from map 487
from vector 452
in string 595
errno 383, 601, 661
<errno.h> 432
error
analysis 711
design 711
domain 661
exception and 622
handling 115, 186, 383, 566
handling, C-style 661
handling alternative 192, 355
handling, multilevel 383
linkage 199
loop and 523
range 661
recovery 566
reporting 186
run-time 29, 355
sequence 512
string 586
errors, exceptions and 355, 374
escape character \ 73, 830
essential operators 283
evaluation
lazy 707
order of 122
short-circuit 123, 134
event
asynchronous 357
driven simulation 326
event 651
event_callback 651
example
(bad), Shape 417
Assoc 286
Buffer 738
Car 772
Checked 565
Checked_iter 561
Clock 398
Cowboy 778
Date 236
Employee 302
Expr 424
Extract_officers 524
Filter 786
Form 635
Hello, world! 46
Io 776
Io_circle 775
Io_obj 774
Ival_box 315, 407
Lock_ptr 366
Math_container 346
Matrix 282
Object 417
Plane 729
Pool 570
Range 781
Rational 747
Saab 728
Set 769
Set_controller 785
Shape 774
Slice_iter 670
Stack 27
Storable 396
String 328
Substring 596
Table 243
Vector 341, 780
Vehicle 734
Window 398
cache 232
calculator 107, 165, 190, 208
callC() 384
call_from_C() 384
cloud 700
delete_ptr() 531
do_it() 777
draw_all() 520
eliminate_duplicates() 534
identity() 531
iocopy() 617
iosbase::Init 639
iseq() 513
of default argument value 227
of input 114
of operator overloading 292
of reference 292
of user-defined memory management 292
of virtual function 646
oseq() 556
scrollbar 743
sort() 158, 334
example:, member template 349
examples and emphasis 5
exception 29, 186, 355
C and 383
C-function and 382
I/O 622
and error 622
and function 375
and interface 375
and main() 54
and member initialization 373
and multiple inheritance 360
and new 367, 369
and recursive function 374
atexit() and 382
bad_alloc 576
catch every 54
container algorithm 566
copy of 362
cost of 381
discrimination of 188
handler 812
in constructor 367
in destructor 373
mapping 378
new and 576
qsort() and 382
standard 384
type of 379
<exception> 379–380, 384–385, 433
exception hierarchy 385
exceptions 357
and constructors 366
and destructors 366
and errors 355, 374
derived 359
grouping of 358
in constructor 371
uncaught 380
unexpected 377
exceptions() 622
exception-specification 375
checking of 376
exclusive or operator ^, bitwise 124
exhaustion
free store 129
resource 369
exit() 116, 218
exp(), valarray 667
experimentation and analysis 710
explicit
call of destructor 256
conversion 284
qualification :: 847
template argument 335
template instantiation 866
type conversion 130
explicit constructor 284
exponent, size of 659
exponential distribution 685
exponentiation, vector 667
export 205
Expr example 424
exp() 660
complex 680
expression
conditional 134
constant 833
expression, full 254
expression 798
extended type information 416
extensibility 700
extensible I/O 605
extern 205
extern 198
external linkage 199
Extract_officers example 524
extractor, >> 608
F
\f, formfeed 830
fabs() 660
facilities, standard library 66, 429
factory 323
fail() 616
failbit 617
failure 709, 716
failure and clear() 385
false and 0 71
fat interface 439, 761
fault tolerance 383
feature
summary, C++ 819
vs technique, built-in 43
features, portability and 815
feedback 695, 698
field
bit 125, 840
output 629–630
type of 75
fields, order of 75
file
.c 202
.h 201
and stream 637
closing of 638
header 27, 201
input from 637
mode of 639
opening of 638
output to 637
position in 642
source 197
filebuf 649
fill() 537, 629
fill_n() 537
Filter example 786
finally 362
find the classes 702
find() 57, 525
char_traits 581
in map 485
in string 594
find_end() 528
find_first_not_of() in string 594
find_first_of() 525
in string 594
find_if() 62, 525
finding the classes 734
find_last() 444
find_last_of() in string 594
firewall 383
first
C++ library 686
element 445
first-time switch 253, 640
fixed 626, 628
fixed() 634
flag manipulation 626
flags() 626
flexibility 700
flip() bitset 494
float 74
bits in 658
output 626
float_denorm_style 659
floatfield 626, 628
<float.h> 433
floating
point output 626, 628
point type 74
floating-point
. 74
conversion 834
conversion to 835
literal 74
promotion 833
float_round_style 659
floor() 660
FLT_RADIX 660
flush 634
flush() 631, 642
flushing of output 626
fmod() 660
For 511
for
statement 26, 136
statement, declaration in 137
for(;;) 109
for_each() 62, 523
Form example 635
formal
methods 711
model 730
format
control 625
information, locale 606
object 635
of date 649
of integer 649
state 625
state, basic_ios 606
state, ios_base 606
string 652
formatted output 625
formatting
basic_iostream 606
in core 641
formfeed \f 830
for-statement initializer 825
Fortran
style subscript 674
vector 668
forward
and output iterator 554
declaration of member class 293
iterator 550
reference to class 278
forwarding function 778, 780
forward_iterator_tag 553
foundation operator 706
fragmentation, memory 846
framework, application 731, 786
free
store 34, 127, 421, 576, 843
store exhaustion 129
store object 244
store object, constructor for 246
free() 577
free-standing
class 732
function 732
frexp() 660
friend 16, 278, 852
and member 265, 280
class 279
declaration 279
derived and 852
function 279
of friend 852
template and 854
front operation 472
front() 445, 472
of queue 476
front_inserter() 555
front_insert_iterator 555
<fstream> 432, 638
fstream 638
function
adapter, pointer to 521
and algorithm, C-style 522
argument passing 145
argument type check 145
argument type conversion 145
argument types, virtual 310
call 145
call and assignment 99
call, difference from C 816
class 776
const member 229
declaration 143
definition 144
definition, difference from C 817
definition of virtual 310
definition, old-style 817
example of virtual 646
exception and 375
forwarding 778, 780
free-standing 732
friend 279
get() 759
helper 273
higher-order 518
implementation of virtual 36
inline 144
inline member 235
inspector const 706
member 224, 238
name, overloaded 149
nested 806
object 287, 514, 776
object, arithmetic 517
only, instantiate used 866
operator :: and virtual 312
pointer to 156
pointer to member 418
pure virtual 313
set() 759
specialization 344
static member 278
template 334
template argument 335
template overloading 336
type of overriding 424
value return 148
virtual 310, 390, 706
virtual 15
virtual output 612
functional
decomposition 725
decomposition and C++ 726
<functional> 431, 516–519, 521
function-like class 514
functions, list of operator 262
functor 514
fundamental
sequence 469
type 23, 70
G
game 685
gap between design and language 725
garbage
collection, automatic 247, 844
collection, delete and 845
collection, destructor and 846
collector 128, 130
gargantuanism 713
gbump() 645
gcount() 618
general allocator 573
generality
efficiency and 431
of sequence 512
of solution 701
generalized
numeric algorithm 682
slice 677
general-purpose programming-language 21
generate() 537
generated
!= 468
<= 468
> 468
>= 468
specialization 859
generate_n() 537
generator
random number 537
type 348
generic
algorithm 41
programming 40, 757–758
programming, template and 327
get
area 645
from, >> 607
position, tellp() 642
get() 618, 643
char 620
function 759
get_allocator() 457
from string 598
getchar() 653
getline() 51, 618
into string 598
getloc() 646, 650
get_state(), char_traits 581
get_temporary_buffer() 575
global 16
initialization of 217
namespace 847
object 244, 252
objects 640
scope 82, 847
variable 200, 228
variable, constructor for 252
variable, use of 111
global() locale 649
goal, clear 698
good() 616
goodbit 617
goto
difference from C initialization and 818
nonlocal 357
statement 137
gptr() 645
gradual
adoption of C++ 718
approach to learning C++ 7
grammar 793
graph, directed acyclic 308
greater > 516
greater_equal >= 516
grouping of exceptions 358
growing system 711
gslice 677
gslice_array 677
guarantees, standard 827
H
.h
file 201
header 821
hack, struct 809
half-open sequence 512
handle
class 782
intrusive 783
handler, exception 812
hardware 75
has-a 741
has_denorm 659
has_denorm_loss 659
hash
function 502
function, hash_map 497
table 497
hashing 502
hash_map 497
delete from 501
equality 497
hash function 497
lookup 500
representation 498
resize() 502
has_infinity 659
has_quiet_NaN 659
has_signaling_NaN 659
header 117, 201
.h 821
and compile time 211
file 27, 201
standard 431
standard library 202
heap 34, 543, 576
and priority_queue 543
memory 843
store 127
heap, priority_queue and 479
Hello, world! example 46
helper
class 293
function 273
function and namespace 240
hex 626–627, 634
hexadecimal 73
output 626
hiding
information 27
name 82
hierarchies, interface 708
hierarchy 732
class 38, 389
class 15, 307, 734
design, class 314
exception 385
navigation, class 411
object 739, 748
reorganization of class 707
stream 637
traditional 315
higher-order function 518
high-level language 7
Histogram 455
horizontal tab \t 830
how to start a design 708
human activity, programming as a 693
hybrid design 718
I
ideas, real-world as source of 734
identifier 81
identity() example 531
IEC-559, is_iec559 659
if
statement 133
switch and 134
_if suffix 525
#ifdef 162
#ifndef 216
ifstream 638
ignore() 618
imag() 679–680
imbue() 645, 647, 650
imbue_event 651
imbue_event, imbue() 651
implementation
alternative 320
and interface 317
dependency type of integer literal 832
inheritance 400, 743
interface and 224, 314, 399, 758, 771
iterator 59
of I/O 606
of RTTI 409
of container 465
of dynamic_cast 409
of virtual function 36
pre-standard 820
priority_queue 478
stack 475–476
stage 697
implementation-defined 827
implicit
conversion 275, 281, 284
conversion of string 590
type conversion 76, 276, 833
implicit_cast 335
in core formatting 641
In 511
in open for reading 639
in_avail() 644, 646
in-class
definition 235
definition of constant 249
#include guard 216
include directory, standard 201
#include 27, 117, 183, 201
includes() 542
inclusion, template 350
increment
and decrement 291
operator ++ 125
incremental change 684
indentation 138
independent concept 327
index 454
indirect_array 679
indirection 290
individual 716
inertia, organizational 713
infinity() 659
information hiding 27
inheritance 39, 303, 307, 703
and design 707
and template 349
containment and 740
data abstraction vs 727
dependency 737
diamond-shaped 399
implementation 400, 743
interface 400, 743
multiple 308, 390, 735
template and 347
using multiple 399
using-declaration and 392
using-directive and 392
initialization 79, 226, 244
and cleanup 364
and goto, difference from C 818
assignment and 283
cerr 637
cin 637
clog 637
constructor and C-style 270
cout 637
difference from C jump past 818
library 640
main() and 217
member 248
of array by string 89
of base class 306
of global 217
of reference 98
of structure 102
order of member 247
reference member 244, 250
run-time 217
initializer
array 89
for-statement 825
list, constructor and 270
member 247
initiative 695
inline
and linkage 199
function 144
member function 235
inner product 684
inner_product() 683
innovation 717
inplace_merge() 541
input
and output 432, 605
and output, C 651
and output, connection between 623
bitset 495
char 618
cin >> 50, 112
complex 680
example of 114
from file 637
into container 451
into vector 451
iterator 550
manipulator 632
of bool 615
of built-in type 614
of char 615
of pointer 615
of user-defined type 621
sequence 513
string 598
unbuffered 642
unformatted 618
valarray 668
width() of 616
input_iterator_tag 553
insert() 55
[] and 488
into map 487
into vector 452
string 592
inserter, << 608
inserter() 555
insertion, overwriting vs 555
insert_iterator 555
inspector const function 706
inspiration 734
instantiate used function only 866
instantiation
context of template 860
directive, template 866
explicit template 866
multiple 867
point of 863
template 859
int 73, 76
bits in 658
implicit, difference from C 817
largest 658
output bits of 495
smallest 658
integer
conversion 834
conversion, signed unsigned 834
enum and 77
format of 649
literal 73, 76
literal, implementation dependency type of 832
literal, type of 832
output 627
type 70, 73
type, conversion to 835
value 0, null 830
integral
conversion to 835
promotion 833
type 70
integration 728
integrity of design 716
interface
alternative 173
and implementation 224, 314, 399, 758, 771
changing 774
class 778
exception and 375
fat 439, 761
hierarchies 708
implementation and 317
inheritance 400, 743
module and 165
multiple 172
public and protected 645
specifying 707
internal
linkage 199
structure 694
internal 625, 630
internal() 634
INT_MAX 660
introducing C++ 718
intrusive
container 438
handle 783
int_type 608
int_type(), char_traits 581
invalid_argument and bitset() 385
invariant 748
checking 749
I/O 47, 50
buffering 645
design 605
exception 622
extensible 605
implementation of 606
iterator and 60
object 774
sentry 624
system, organization of 606
type safe 607
unbuffered 647
wide character 608
Io example 776
Io_circle example 775
iocopy() example 617
<iomanip> 432, 633
Io_obj example 774
<ios> 432, 608
ios 625, 822
ios_base 626, 628–629, 650
format state 606
ios_base::Init example 639
<iosfwd> 432, 607
iostate 617, 822
io_state 822
<iostream> 46, 432, 609, 614
<istream> and 613
<ostream> and 608
iostream 637
sentry 624
is-a 741
isalnum() 601
isalpha() 113, 601
is_bounded 659
iscntrl() 601
isdigit() 601
Iseq 513
iseq() example 513
is_exact 659
isgraph() 601
is_iec559 IEC-559 659
is_integer 658
islower() 601
is_modulo 659
ISO
646 829
C++ 11
isprint() 601
is_signed 658
isspace() 601, 615
whitespace 114
is_specialized 658
<istream> 432
and <iostream> 613
istream 614, 643
>> 614
and iterator 559
istreambuf iterator 559
istreambuf_iterator 559
istream_iterator 60, 559
istringstream 641
istrstream 656
isupper() 601
isxdigit() 601
iterator 57, 549
!= 551
* 551
+ 551
++ 551
+= 551
- 551
-- 551
-= 551
-> 551
< 551
<= 551
== 551
> 551
>= 551
STL 441
Sink 532
[] 551
abstract 435
and I/O 60
and sequence 550
begin() 444
bidirectional 550
category 553
checked 561
const 443
cont 508
container 464
container and 435, 444
end() 444
forward 550
forward and output 554
implementation 59
input 550
istream and 559
istreambuf 559
map 481
naming convention 511
operation 551
ostream and 558
ostreambuf 560
output 550
random-access 550
rbegin() 444
read through 551
rend() 444
reverse 443, 557
stream 558
string 584
user-defined 561
valarray 670
valid 550
write through 551
<iterator> 432
iterator 54, 443, 480
basic_string 583
iterator_category 552
iterator_traits 552
iter_swap() 538
itoa() 155
Itor 435
Ival_box example 315, 407
Ival_slider 399
iword() 650
J
jump past initialization, difference from C 818
K
keeping consistent declarations 201
Kernighan and Ritchie 654
key 55
and value 480
key,
duplicate 480, 490
unique 480
key_comp() 485
key_compare 480, 485
key_type 480
keyword 793–794
and 829
and_eq 829
bitand 829
bitor 829
compl 829
not 829
not_eq 829
or 829
or_eq 829
xor 829
xor_eq 829
kind
of class 765
of container 461
kinds of object 244
Knuth, Donald 713
L
L’, wide-character literal 73
labs() 661
lack of modularity 309
language
and library 45
design and 724
gap between design and 725
high-level 7
low-level 8
people and machines 9
programming 15
programming styles technique 6
support 433–434
large
character set 831
program 211–212
programs and C++ 9
largest int 658
last element 445
last-time switch 640
Latin-1 580
lattice, class 389
layout, array 669
lazy evaluation 707
ldexp() 660
ldiv() 661
ldiv_t 661
leaf class 774
learning
C and 7
C++ 6, 718, 820
C++, gradual approach to 7
left 625, 630
left() 634
legacy 708
length of valarray 664, 679
length()
char_traits 581
of string 598
string 586
less 515
< 516
less_equal <= 516
less_than 519
levels of abstraction 733
lexical conventions 794
lexicographical_compare() of sequence 544
libraries, standard 700
library 15, 701, 714, 755
C standard 599
algorithms, standard 64
container, standard 56
facilities, standard 66, 429
first C++ 686
header, standard 202
initialization 640
language and 45
non-standard 45
standard 45, 182
standard – see standard library
lifetime
of object 84
of temporary 254
limits, numeric 658
<limits> 433, 658
<limits.h> 433, 660
line, read 618
linear time 464
Link 394
linkage
and namespace 207
and pointer to function 207
const and 199
error 199
external 199
inline and 199
internal 199
to C 205
type-safe 198
linker 198
Liskov substitution 743
Lisp 725
list
of operator functions 262
operation 452
<list> 431
List 435
list 54
doubly-linked 470
element access 472
merge() algorithm and 541
merge() stable 470
remove() 472
remove_if() 472
reverse() 472
sort() stable 470
unique() 472
literal
’, character 73
L’, wide-character 73
floating-point 74
implementation dependency type of integer 832
integer 73, 76
of user-defined type 273
string 294
string 46, 90
type of integer 832
loader 198
local
fix 697
scope 82
static 145
static store 251
variable, constructor for 245
<locale> 433, 649
locale 649
POSIX 649
classic() 649
format information 606
global() 649
locale() 649
<locale.h> 433, 650
locality 212
localization of object creation 322
locking 366, 785
Lock_ptr example 366
log() 660
complex 680
valarray 667
log10() 660
complex 680
valarray 667
logarithmic time 464
logical
and operator && 123
const, physical and 231
operators, bitwise 124
or operator || 123
structure of program 198
logical_and && 516
logical_not 515
! 516
logical_or || 516
long namespace name 178
long 73
long double 74
longer term 699
lookup, hash_map 500
loop
and error 523
merging 675
statement 116
lower_bound() 540
in map 485
low-level language 8
lt(), char_traits 581
lvalue 84, 264, 281
lying 705
M
machines, language people and 9
macro 160
alternative to 161
macros, difference from C 818
main() 46, 116, 218
and initialization 217
argv argc 117
exception and 54
maintenance 212
software 712
make_heap() 543
make_pair() 482
malloc() 577
management 713
memory 843
manipulator
input 632
output 631
standard 633
user-defined 635
with argument 633
mantissa, size of 659
manual overload resolution 151
map 480
<map> 431
map 55, 480
= 484
[] 482
assignment 484
comparison in 484
constructor 484
count() in 485
element access 482
equal_range() in 485
erase() from 487
find() in 485
insert() into 487
iterator 481
lower_bound() in 485
member type 480
modify a 487
subscripting 482
upper_bound() in 485
use of 774
mapped type, value 55
mapped_type 480
mapping exception 378
Marian 79
mask_array 678
Math_container example 346
mathematical
functions, complex 680
functions, standard 660
functions, valarray 667
functions, vector 667
model 711
<math.h> 434, 660
Matrix 672
example 282
max() 544, 658
valarray 664
max_element() of sequence 544
max_exponent 659
max_exponent10 659
max_size() 455, 489
of string 598
meaning for operator, predefined 264
means, aims and 694
measurement, productivity 716
member
->*, pointer to 418
.*, pointer to 418
::*, pointer to 418
access operator -> 102
access operator . 101
access to 849
and nonmember operators 267
class 293
class, access to 851
class, forward declaration of 293
const 249
constant 249
constructor for class 247
enum 249
friend and 265, 280
function 224, 238
function adapter 520
function, algorithm and 520
function, const 229
function, inline 235
function, pointer to 418
function, static 278
initialization 248
initialization, exception and 373
initialization, order of 247
initialization, reference 244, 250
initializer 247
object 244
object, union 244
of base class, private 305
of derived class 304
of template, static 854
or base 740
or pointer 738
pointer to data 853
private class 225
protected 404–405
public class 225
reference 740
static 228, 421
template 330
template example: 349
template, missing 823
type, basic_string 582
type, map 480
type, vector 442
union 257, 843
member-declaration 808
memberwise copy 283
memchr() 577
memcmp() 577
memcpy() 577
mem_fun() 63, 518, 521
mem_fun1_ref_t 518, 521
mem_fun1_t 518, 521
mem_fun_ref() 518, 521
mem_fun_ref_t 518, 521
mem_fun_t 518, 520–521
memmove() 577
memory
automatic 843
buffer 575
dynamic 127, 576, 843
fragmentation 846
heap 843
management 843
management, automatic 844
management, container 455, 567
management, example of user-defined 292
stack 843
static 843
uninitialized 574
<memory> 431, 574
memset() 577
merge() 541
algorithm and list 541
stable, list 470
message queue 477
method 310
choosing a design 696
choosing an analysis 696
cookbook 692
design 694
methods, formal 711
min() 544, 658
valarray 664
min_element() of sequence 544
min_exponent 659
min_exponent10 659
minimalism 706
minimize dependency 173
minus - 517
mismatch() 516, 527
missing
bad_alloc 823
checking, late 823
member template 823
namespace 822
specialization partial 823
standard library 822
mis-use
of RTTI 439
of dynamic type checking 439
misuse
of C++ 725
of RTTI 417
mixed-mode arithmetic 268
mixin 402
mixing C and C++ 719
ML 10
mode of file 639
model
formal 730
mathematical 711
waterfall 697
models 708
modf() 660
modifier 706
modify a map 487
modifying sequence algorithm 529
modular programming 26
modularity 312
lack of 309
module
and interface 165
and type 30
modulus % 517
moron 713, 717
move(), char_traits 581
multidimensional
array 668, 677, 836
array, passing 839
multilevel error handling 383
multimap 490
multi-method 326
multiple
inheritance 308, 390, 735
inheritance, exception and 360
inheritance, use of 776
inheritance, using 399
instantiation 867
interface 172
multiple-inheritance
access control and 406
ambiguity resolution 391
multiplies * 517
multiset 491
mutable 232
mutual reference 278
N
\n, newline 830
name 81
binding 860
binding, template 859
characters in 81
clash 176
hiding 82
long namespace 178
namespace qualified 169
short namespace 178
names, reserved 81
namespace
nested 848
transition to 182
namespace 27, 167, 847
alias 178
and :: 169
and class 849
and overloading 183
composition 179
composition of 181
global 847
helper function and 240
is open 184
linkage and 207
member declaration and definition 167
missing 822
name, long 178
name, short 178
operators and 265
purpose of 180
qualified name 169
relops 468
selection from 180
std 46
unnamed 177, 200
using 183
naming convention, iterator 511
narrow() 645
n-ary operators 675
national
character 829
conventions 649
natural operation 767
NDEBUG 750
negate - 517
nested
class 293
function 806
namespace 848
nesting 756
<new> 384, 433, 576
new
and array 423
and exception 576
bad_alloc and 384
exception and 367, 369
operator 127
placement 255
size and 421
new()
operator 129, 576
placement 576
new[](), operator 423, 576
new_handler 129, 576
_new_handler 370
newline \n 830
next_permutation() 545
Nicholas 49
noboolalpha() 633
Nocase 467
node
class 772
class, abstract 774
class, concrete 774
non-C++ program 217
none() 494
nonlocal goto 357
nonmember operators, member and 267
nonmodifying sequence algorithm 523
non-standard library 45
non-type template parameter 331
norm() 680
noshowbase() 634
noshowpoint() 634
noshowpos() 634
noskipws() 634
not a class 705
not keyword 829
not1() 518
and unary_negate 522
not2() 518
and binary_negate 522
notation, value of 261
not_eof(), char_traits 581
not_eq keyword 829
not_equal_to != 516
nothrow 576
allocator 823
nouppercase() 634
npos 586
nth_element() 540
null
0 zero 88
integer value 0 830
NULL 88, 433
number, size of 75
numeric
algorithm, generalized 682
arrays 662
limits 658
<numeric> 434, 682
numerical computation 64
numeric_limits 658
O
O notation 464
object 32, 84
I/O 774
array element 244
automatic 244
constructor for free store 246
creation 242
creation, localization of 322
format 635
free store 244
function 287, 514, 776
global 244, 252
hierarchy 739, 748
kinds of 244
lifetime of 84
member 244
placement of 255
real-world 732
state of 748
static 244
temporary 244, 254
union member 244
variably-sized 243
Object 438
example 417
object-oriented
design 692, 726
programming 37–38, 301
pure 732
objects
array of 250
global 640
oct 626–627
oct() 634
octal 73
output 626
ODR the one-definition-rule 203
offset, pointer to member and 419
off_type 608, 643
char_traits 581
ofstream 638
old-style function definition 817
one right way 693
one-beyond-last 512
one-definition-rule, ODR the 203
Op 511
open
for reading, in 639
for writing, out 639
namespace is 184
opening of file 638
openmode 639
operation
bitset 494
efficiency of 464
front 472
iterator 551
list 452
natural 767
on container 464
operations
complex 679
on references 97
on structure 102
selecting 705
set of class 237
valarray 664, 667
vector 664, 667
operator
, 123
&, bitwise and 124
&&, logical and 123
+, user-defined 281
++, increment 125
++, user-defined 264, 291
+= 109
+=, user-defined 264, 268, 281
--, decrement 125
--, user-defined 291
-= 109
->, member access 102
->, user-defined 289
., member access 101
:: 305
:: and virtual function 312
::, scope resolution 82, 228
<<, output 607
=, user-defined 281
^, bitwise exclusive or 124
and built-in type, user-defined 265
and enum, user-defined 265
assignment 110, 268
associativity of 121
binding strength 121, 607
composite 268
declarator 80
delete 127
design of access 295
foundation 706
new 127
overloaded 241
overloading, example of 292
precedence 121
predefined meaning for 264
stack 450
summary 119
ternary 636
type conversion 275
user-defined 263
user-defined binary 263
user-defined unary 263
|, bitwise or 124
||, logical or 123
operator
delete() 129, 576
delete[]() 423, 576
functions, list of 262
new() 129, 576
new[]() 423, 576
void*() 616
operator() 287
operator[] 286
operator 810
operators
and namespace 265
bitwise logical 124
essential 283
member and nonmember 267
n-ary 675
optimal container 435
optimization 675
or
keyword 829
operator |, bitwise 124
operator ||, logical 123
order 467
of construction 248, 252
of evaluation 122
of fields 75
of member initialization 247
of specialization 343
of construction and destruction 414
or_eq keyword 829
organization
of I/O system 606
standard library 431
organizational inertia 713
orthogonality, convenience and 431
oseq() example 556
<ostream> 432
and <iostream> 608
ostream 608, 642
<< 609
and buffer 642
and iterator 558
and streambuf 642
put() 609
template and 608
write() 609
ostreambuf iterator 560
ostreambuf_iterator 560
ostream_iterator 60, 558
ostringstream 641
ostrstream 656
Out 511
out open for writing 639
out_of_range 53, 446
and at() 385
string 586
output 47
C input and 651
bits of int 495
bitset 495
complex 680
connection between input and 623
cout << 46
decimal 626
double 626
field 629–630
float 626
floating point 626, 628
flushing of 626
formatted 625
function, virtual 612
hexadecimal 626
input and 432, 605
integer 627
iterator 550
manipulator 631
octal 626
of bool 610
of built-in type 609
of char 610
of pointer 611
of user-defined type 612
operator << 607
padding 625
sequence 556
string 598
to file 637
unbuffered 642
valarray 668
why, << for 607
output_iterator_tag 553
overflow, stack 476
overflow() 647
overflow_error and to_ulong() 385
overhead 8
overlapping sequences 529
overload
resolution 149
resolution, manual 151
return type and 151
scope and 151
overloaded
function name 149
operator 241
overloading
const and 600
example of operator 292
function template 336
namespace and 183
override 313
private base 738
overriding 395
from virtual base class 401
function, type of 424
overwriting vs insertion 555
P
padding 630
output 625
pair 482
paradigm, programming 22
parameter
non-type template 331
template 331
parameterization
policy 757
template 707
parametric polymorphism 347
parentheses, uses of 123
parser, recursive descent 108
partial
sort 539
specialization 342
partial_sort() 539
partial_sort_copy() 539
partial_sum() 684
partition 542
partition() 542
partitioning of program 208, 211
passing multidimensional array 839
pattern 709
specialization 342
pbackfail() 647
pbase() 645
pbump() 645
peek() 643
people and machines, language 9
perfection 43
permutation 545
per-object data 573
per-type data 573
phone_book example 52
physical
and logical const 231
structure of program 198
placement
new 255
new() 576
of object 255
Plane example 729
plug analogy 728
plus + 517
point
of declaration 82
of definition 861
of instantiation 863
pointer 26, 87
and array 91, 147
arithmetic 88, 93, 125
checked 291
checking for, wild 722
const 96
conversion 834
conversion, user-defined 349
disguised 844
input of 615
member or 738
output of 611
semantics 294
size of 75
smart 289, 291
to class 304
to class, conversion of 304
to const 96
to constructor 424
to data member 853
to function 156
to function, << of 631
to function, >> of 632
to function adapter 521
to function, linkage and 207
to member ->* 418
to member .* 418
to member ::* 418
to member and offset 419
to member function 418
to void 100
type 569
pointer 443, 552, 567
basic_string 583
pointers and union 845
pointer_to_binary_function 521
pointer_to_unary_function 518, 521
polar() 680
policy parameterization 757
polymorphic 35
object, algorithm and 63
polymorphism 158, 312
algorithm and 520
compile-time 347
container and 520
dynamic_cast and 409
parametric 347
run-time 347
see virtual function
Pool example 570
Pool_alloc allocator 572
pop()
of priority_queue 478
of queue 476
of stack 475
pop_back() 450
pop_front() 472
pop_heap() 543
portability 9, 700, 828
and features 815
position
bit 492
in buffer 642
in file 642
POSIX locale 649
postcondition 753
pos_type 608, 643
char_traits 581
pow() 660
complex 680
valarray 667
pptr() 645
precedence
<< 608
operator 121
precision() 628
precondition 753
Pred 511
predefined
, 264
& 264
= 264
meaning for operator 264
predicate 61, 63, 515
standard library 516
user-defined 516
prefix code 624
preprocessing directive # 813
presentation as design tool 704
pre-standard implementation 820
prev_permutation() 545
printf() 651
priority queue 478
priority_queue
and heap 479
heap and 543
implementation 478
pop() of 478
push() of 478
top() of 478
private
class member 225
member of base class 305
private 402
base 405, 742
base class 743
base, override 738
public protected 849–850
private: 234
problems
of scale 715
with concrete type 37
procedural
programming 23
programming and C++ 725
process, development 696
product
dot 684
inner 684
productivity measurement 716
program 46, 798
large 211–212
logical structure of 198
non-C++ 217
partitioning of 208, 211
physical structure of 198
size of 8
start 217
structure of 8
termination 218
programmed-in relationship 746
programmer
C 14
C++ 14
programmers, eliminating 730
programming 16
and C++, procedural 725
as a human activity 693
design and 692
generic 40, 757–758
language 15
language, design language and 730
modular 26
object-oriented 37–38, 301
paradigm 22
procedural 23
purpose of 694
style 22
styles technique language 6
template and generic 327
programming-language, general-purpose 21
prohibiting
, 264
& 264
= 264
promotion
floating-point 833
integral 833
standard 833
proof by analogy 692
properties of C++ 724
protected 402
base 319, 405
base class 743
interface, public and 645
member 404–405
private, public 849–850
protection 226
unit of 754
prototypes 710
proxy 785
Ptr 349
ptrdiff_t 122, 433
ptr_fun() 518, 521
pubimbue() 646
public class member 225
public 402
and protected interface 645
protected private 849–850
public: 225, 234
pubseekoff()
pubseekpos()
pubsetbuf() 646
pubsync() 646
pure
object-oriented 732
virtual function 313
purpose
of namespace 180
of programming 694
push()
of priority_queue 478
of queue 476
of stack 475
push_back() 55, 450
and realloc() 451
push_front() 55, 472
push_heap() 543
put
area 645
to, << 607
put(), ostream 609
putback() 643
pword() 650
Q
qsort() 158, 546
and exception 382
quadratic time 464
qualification ::, explicit 847
qualified name, namespace 169
qualifier, template as 858
quality 717
queue
deque, double-ended 474
priority 478
<queue> 431
queue
back() of 476
front() of 476
message 477
pop() of 476
push() of 476
quiet_NaN() 659
quote
\’, single 830
double 830
quotient 661
R
\r, carriage return 830
Ran 511
rand(), random number 685
Randint 685
RAND_MAX 685
random
number 538
number class 685
number generator 537
number rand() 685
random-access iterator 550
random_access_iterator_tag 553
random_shuffle() 537
range
check 445, 561
check of string 584
check, valarray 664
checking 275, 781
checking Vec 53
error 661
sequence and 512
Range example 781
Rational example 747
raw storage 574
raw_storage_iterator 575
rbegin() 481
basic_string 584
iterator 444
rdbuf() 644
rdstate() 616
read
line 618
through iterator 551
read() 618
readsome() 643
real() 679–680
realloc() 577
push_back() and 451
real-world
as source of ideas 734
classes and 734
object 732
rebind 567
use of 569
recursion 148
recursive
descent parser 108
function, exception and 374
reduce 683
reduction 683
redundancy 712
reference 97
argument 98
call by 98, 146
call-by 282
count 292
counting 783
dynamic_cast to 410
example of 292
initialization of 98
member 740
member initialization 244, 250
mutual 278
return by 148
return by 283
to class, forward 278
reference 443, 480, 552, 567
basic_string 583
to bit 492
references, operations on 97
register 806
register_callback() 651
reinterpret_cast 130, 256
relationship, programmed-in 746
relationships between templates 348
relaxation of return type 424
release, resource 364
reliability 383
relops, namespace 468
remainder 661
remove() 536
list 472
remove_copy_if() 536
remove_if() 536
list 472
renaming virtual function 778
rend() 481
basic_string 584
iterator 444
reorganization of class hierarchy 707
replace() 535
in string 595
replace_copy() 535
replace_copy_if() 535
replace_if() 535
replicated base class 394
representation
hash_map 498
of container 465
requirement
comparison 467
copy 466
requirements for element 466
reserve() 455
reserved names 81
reset() bitset 494
resetiosflags() 634
resize() 52, 455
hash_map 502
of string 598
of valarray 664
valarray 666
resource
acquisition 364
exhaustion 369
release 364
response to change 698
responsibility 700, 706
restricted character set 829
restriction 9
result
of sizeof 122
type 122
resumption 370
re-throw 362, 379
return
\r, carriage 830
by reference 283
type and overload 151
type of virtual 424
type, relaxation of 424
value 283
value, algorithm 508
value type check 148
value type conversion 148
return
alternative 357
by reference 148
by value 148
function value 148
of void expression 148
return; 148
return_temporary_buffer() 575
reuse 714
design 709
of concrete type 241
of concrete type 768
reverse iterator 443, 557
reverse() 537
list 472
reverse_copy() 537
reverse_iterator 443, 480, 557
basic_string 583
reward 713
rfind() in string 594
right 625, 630
right() 634
Ritchie, Kernighan and 654
rotate() 537
rotate_copy() 537
round_error() 659
RTTI 407
implementation of 409
mis-use of 439
misuse of 417
use of 417
rule of two 741
run time support 8
run-time
access control 785
error 29, 355
initialization 217
polymorphism 347
type identification 407
type information 407, 774
S
Saab example 728
safety, convenience vs. 847
Satellite 390
saving space 840
sbumpc() 646
scale 212, 692
problems of 715
scaling 665
scientific 626, 628
scientific() 634
scope 278
and overload 151
difference from C 816
global 82, 847
local 82
resolution operator :: 82, 228
scrollbar example 743
search, binary 540, 546
search() 528
search_n() 528
seekdir
and beg 643
and cur 643
and end 643
direction of seek
seekg()
direction of 643
seekoff()
seekp()
direction of 643
set position 642
seekpos()
selecting operations 705
selection from namespace 180
self, assignment to 246
self-reference this 230
semantics
pointer 294
value 294
sentry
I/O 624
iostream 624
separate
compilation 27, 198
compilation, template 351
separation of concerns 694
sequence 41, 469
add element to 529
adding to 555
algorithm and 508
algorithm, modifying 529
algorithm, nonmodifying 523
and associative container 461
and container 512
and range 512
change size of 529
delete element from 529, 534
error 512
fundamental 469
generality of 512
half-open 512
input 513
iterator and 550
lexicographical_compare() of 544
max_element() of 544
min_element() of 544
output 556
set operation on 542
sorted 539
string 579
sequences, overlapping 529
set 124
of class operations 237
operation on sequence 542
position, seekp()
<set> 431
Set example 769
set 491
bitset and 492
of Shape* 348
set()
bitset 494
function 759
setbase() 634
setbuf() 647
Set_controller example 785
set_difference() 543
setf() 626, 630
setfill() 634
setg() 645
set_intersection() 542
setiosflags() 634
<setjmp.h> 433
set_new_handler() 129, 576
setp() 645
setprecision() 633–634
setstate() 616
set_symmetric_difference() 543
set_terminate() 380
set_unexpected() 379
set_union() 542
setw() 634
sgetc() 646
sgetn() 646
Shakespeare 709
Shape
example 774
example 37
example (bad) 417
Shape*, set of 348
shift() 664
short namespace name 178
short 73
short-circuit evaluation 123, 134
showbase 626, 628
showbase() 634
showmanyc() 647
showpoint 626
showpoint() 634
showpos 626
showpos() 634
shuffle 538
sign extension 831
signal 357
<signal.h> 157, 433
signaling_NaN() 659
signed
char 831
type 73
unsigned integer conversion 834
Simula 10, 38
Simula-style container 438
simulation 685, 711
event driven 326
sin() 660
complex 680
valarray 667
single quote \’ 830
sinh() 660
complex 680
valarray 667
Sink iterator 532
size
and delete 421
and new 421
of exponent 659
of mantissa 659
of number 75
of pointer 75
of program 8
of sequence, change 529
of string 147
of structure 102
size() 455, 489, 494
of string 598
of valarray 664
string 586
sizeof 75
difference from C 816
enum 78
result of 122
size_t 122, 433
-1 and 448
size_type 443, 480
basic_string 583
skipws 625
skipws() 634
slice, generalized 677
slice 664, 668
slice_array 671
Slice_iter example 670
slicing 307
smallest int 658
Smalltalk 725
style 417
Smalltalk-style container 438
smanip 633
smart pointer 289, 291
snextc() 646
software
development 692
maintenance 712
solution, generality of 701
sort 546
partial 539
stable 539
sort() 56, 539
example 158, 334
stable, list 470
sorted sequence 539
sort_heap() 543
sorting 338
criteria 534
source
code, template 350
file 197
of ideas, real-world as 734
space, saving 840
special character 830
specialization 859
and char* 344
and void* 341
function 344
generated 859
order of 343
partial 342
partial, missing 823
pattern 342
template 341
use of 865
user 859
specialized, more 343
specifying interface 707
splice() 470
sputbackc() 646
sputc() 646
sputn() 646
sqrt() 660
complex 680
valarray 667
srand() 685
<sstream> 119, 432, 640
stability of design 708
stable
list merge() 470
list sort() 470
sort 539
stable_partition() 542
stable_sort() 539
stack
memory 843
operator 450
<stack> 431
Stack example 27
stack
implementation 475–476
overflow 476
pop() of 475
push() of 475
top() of 475
underflow 476
stage
analysis 697
design 697
development 697
implementation 697
standard
component 698, 714
exception 384
guarantees 827
header 431
include directory 201
libraries 700
library 45, 182
library, C 599
library, adding to 434
library algorithms 64
library container 56
library criteria 430
library design 429–430
library facilities 66, 429
library header 202
library, missing 822
library organization 431
library predicate 516
manipulator 633
mathematical functions 660
promotion 833
standardization, C++ 11
standard-library container 442
start, program 217
starting from scratch 708
state
format 625
machine 730
of object 748
stream 616
statement
break 116
continue 116
controlled 136
do 114, 137
for 26, 136
goto 137
if 133
loop 116
summary 132
switch 25, 133
while 136
statement 802
state_type, char_traits 581
static
memory 843
type checking 727
static
anachronism 200
deprecated 818
local 145
member 228, 421
member function 278
member of template 854
object 244
store, local 251
static_cast 130, 159
dynamic_cast and 413
std, namespace 46
std:: 46
<stdarg.h> 155, 433
<stddef> 433
<stddef.h> 433
<stdexcept> 385, 432
<stdio.h> 182, 202, 432
<stdlib.h> 432, 434, 546, 577, 600, 661
steps, design 701
STL 66
container 441
iterator 441
Storable example 396
storage
class 244
raw 574
store
dynamic 34
free 34, 127, 421, 576, 843
heap 127
local static 251
strcat() 599
strchr() 599
strcmp() 599
strcpy() 599
strcspn() 599
stream 432
callback 650
classes 637
closing of 639
file and 637
hierarchy 637
iterator 558
state 616
state, basic_ios 606
string 640–641
<streambuf> 432
streambuf 646– 647, 649
<< of 642
and character buffer 642
ostream and 642
streamoff 609
streamsize 609
stride() 668
string
and const, C-style 90
character 432
format 652
initialization of array by 89
literal 46, 90
size of 147
<string> 48, 432, 580
String example 328
string 48, 582
!= 591
+ 593
+= 592
< 591
<< 598
<= 591
= 587
== 591
> 591
>= 591
>> 598
[] on 584
algorithm and 584
and 0 587
and C-style string 579
and C-style string 589
and array 589
append() 592
as container 491
assign() 588
assignment 587
at() on 585
class 292
compare() 590
comparison 590
concatenation 592–593
constructor 585
conversion 589
design 579
empty 585
empty() 598
erase() in 595
error 586
find() in 594
find_first_not_of() in 594
find_first_of() in 594
find_last_of() in 594
get_allocator() from 598
getline() into 598
implicit conversion of 590
input 598
insert() 592
iterator 584
length() 586
length() of 598
literal 294
max_size() of 598
of user-defined type 583
out_of_range 586
output 598
range check of 584
replace() in 595
resize() of 598
rfind() in 594
sequence 579
size() 586
size() of 598
stream 640–641
subscripting of 584
substr() of 596
swap() 599
unsigned 583
stringbuf 649
<string.h> 432, 577, 599
stringstream 641
strlen() 599
strncat() 599
strncmp() 599
strncpy() 599
strpbrk() 599
strrchr() 599
strstr() 599
<strstream.h> 656
struct 101
and class 234
hack 809
name, difference from C 818
scope, difference from C 818
structure 101
initialization of 102
internal 694
of program 8
operations on 102
size of 102
style, programming 22
subarray 663, 671, 677–679
subarrays 668
subclass 303
superclass and 39
subrange 781
subscript
C++ style 674
Fortran style 674
subscripting 445, 454
comma and 838
map 482
of string 584
user-defined 286
valarray 663
substitution, Liskov 743
substr() of string 596
substring 596
Substring example 596
subtype 743
subtyping 730, 742
successful large system 709
suffix
_copy 533
_if 525
code 624
Sum 514
sum() of valarray 664
summary
algorithm 509
container 464
syntax 793
sungetc() 646
superclass 303
and subclass 39
supplying default value 500
support 714
run time 8
swap() 344, 457–458, 489, 538
string 599
swap_ranges() 538
switch
first-time 253, 640
last-time 640
on type 417
switch 109
and if 134
on enumeration 77
statement 25, 133
sync() 643, 647
sync_with_stdio() 651
synonym, see typedef
syntax
<, template 811
summary 793
system
growing 711
successful large 709
working 709
T
\t, horizontal tab 830
tab
\t, horizontal 830
\v, vertical 830
Table example 243
tan(), valarray 667
tanh() 660
complex 680
valarray 667
Task 394
taxonomy 703
teaching and C++ 12
technique
built-in feature vs 43
language, programming styles 6
tellg() 643
tellp() get position 642
template, use of 776
template 16, 40, 328, 854
and class 348
and friend 854
and generic programming 327
and inheritance 347
and ostream 608
argument 331
argument, deducing 335, 855
argument, default 340, 824
argument, depend on 861
argument, explicit 335
argument, function 335
as qualifier 858
as template parameter 855
class hierarchy and 345
copy constructor and 348
definition, context of 860
example, member 349
function 334
in design 757
inclusion 350
inheritance and 349
instantiation 859
instantiation, context of 860
instantiation directive 866
instantiation, explicit 866
member 330
missing member 823
name binding 859
overloading, function 336
parameter 331
parameter, non-type 331
parameter, template as 855
parameterization 707
separate compilation 351
source code 350
specialization 341
static member of 854
syntax < 811
template-declaration 811
templates, relationships between 348
temporary 98
elimination of 675
lifetime of 254
object 244, 254
variable 244, 254
term, longer 699
terminate() 380
terminate_handler 380
termination 370
program 218
ternary operator 636
test() 494
testing 712
design for 712
this 278
self-reference 230
throw 186, 362, 379
tie() 623
time
constant 464
linear 464
logarithmic 464
quadratic 464
<time.h> 431, 433
Tiny 275
tinyness_before 659
to_char_type(), char_traits 581
to_int_type(), char_traits 581
tools, design 711
top()
of priority_queue 478
of stack 475
to_ulong() 494
overflow_error and 385
toupper() 591
traditional hierarchy 315
traits, character 580
traits_type 608
basic_string 583
transform() 530
transition 717–718
and using-directive 183
to namespace 182
translation unit 197
traps 659
traversal 61
tree 307
trigraphs 829
true and 1 71
trunc truncate file 639
truncate file, trunc 639
truncation 835
try 187
try-block 187, 812
tutorial as design tool 708
two, rule of 741
type 23, 69
abstract 34, 767, 769
abstract and concrete 771
arithmetic 70
built-in 70
char, character 71
character 580
check, function argument 145
check, return value 148
checking, dynamic 727
checking, mis-use of dynamic 439
checking, static 727
class and 724
class user-defined 224
concrete 33, 236, 766–767
constructor for built-in 131
conversion, ambiguous 276
conversion, constructor and 269, 275
conversion, explicit 130
conversion, function argument 145
conversion, implicit 76, 276, 833
conversion operator 275
conversion, return value 148
conversion, unions and 842
conversion, user-defined 267, 281
equivalence 104
floating point 74
fundamental 23, 70
generator 348
identification, run-time 407
information, extended 416
information, run-time 407, 774
input of built-in 614
input of user-defined 621
integer 70, 73
integral 70
literal of user-defined 273
module and 30
of exception 379
of field 75
of integer literal 832
of integer literal, implementation dependency 832
of overriding function 424
of virtual, return 424
output of built-in 609
output of user-defined 612
pointer 569
problems with concrete 37
relaxation of return 424
result 122
reuse of concrete 241
safe I/O 607
signed 73
string of user-defined 583
switch on 417
unsigned 73
user-defined 32, 70
user-defined operator and built-in 265
typedef 84
type-field 308
typeid() 414
bad_typeid and 384
<typeinfo> 384, 415, 433
type_info 414
typename 443, 856
type-safe linkage 198
U
uflow() 647
unary operator, user-defined 263
unary_function 515
unary_negate 518
not1() and 522
unbuffered
I/O 647
input 642
output 642
uncaught exceptions 380
uncaught_exception() 373
unchecked access 445
undeclared argument 154
#undef 162
undefined
behavior 828
enum conversion 77
underflow, stack 476
underflow() 647
unexpected exceptions 377
unexpected() 375
unexpected_handler 379
unformatted input 618
unget() 643
Unicode 580
uniform distribution 685
uniformity of code 767
uninitialized memory 574
uninitialized_copy() 574
uninitialized_fill() 574
uninitialized_fill_n() 574
union 841
and class 843
anonymous 841
constructor and 257
destructor and 257
member 257, 843
member object 244
pointers and 845
unnamed 841
unions and type conversion 842
unique() 532
list 472
unique_copy() 56, 532
unit
of allocation 88
of compilation 197
of design 755
of protection 754
translation 197
unitbuf 626
unit of addressing 88
units analogy 728
universal
base class 438
character name 831
UNIX 8, 13
unnamed
namespace 177, 200
union 841
unsetf() 626
unsigned
char 831
integer conversion, signed 834
string 583
type 73
up cast 408
upper_bound() 540
in map 485
uppercase 626
uppercase() 634
Urand 685
use
case 704
count 292
dependency 745
of C++ 12
of RTTI 417
of allocator 568
of class 725
of classes 733
of dynamic_cast 774
of global variable 111
of map 774
of multiple inheritance 776
of rebind 569
of specialization 865
of template 776
used function only, instantiate 866
user specialization 859
user-defined
+ 265
== 534
allocator 570
binary operator 263
container 497
conversion 347
iterator 561
manipulator 635
memory management, example of 292
operator 263
operator + 281
operator ++ 264, 291
operator += 264, 268, 281
operator -- 291
operator -> 289
operator = 281
operator and built-in type 265
operator and enum 265
pointer conversion 349
predicate 516
subscripting 286
type 32, 70
type, class 224
type conversion 267, 281
type, input of 621
type, literal of 273
type, output of 612
type, string of 583
unary operator 263
user-supplied comparison 467
uses of parentheses 123
using multiple inheritance 399
using
namespace 183
namespace, using vs. 847
vs. using namespace 847
using-declaration 169, 180
and access control 407
and inheritance 392
vs. using-directive 847
using-directive 171
and definition 180
and inheritance 392
transition and 183
using-declaration vs. 847
usual arithmetic conversions 122
utilities 431
<utility> 431, 468
V
\v, vertical tab 830
va_arg() 155
<valarray> 434, 662
valarray 65, 662
! 664
!= 667
% 667
%= 664
& 667
&& 667
&= 664
* 667
*= 664
+ 667
+= 664
- 664, 667
-= 664
/ 667
/= 664
< 667
<< 667
<<= 664
<= 667
= 663
== 667
> 667
>= 667
>> 667
>>= 664
[] 663
^ 667
^= 664
abs() 667
acos() 667
and array 663
and vector and array 662
apply() to 664
as container 492
asin() 667
assignment 663
atan() 667
atan2() 667
construction 662
cos() 667
cosh() 667
exp() 667
input 668
iterator 670
length of 664, 679
log() 667
log10() 667
mathematical functions 667
max() 664
min() 664
operations 664, 667
output 668
pow() 667
range check 664
resize() 666
resize() of 664
sin() 667
sinh() 667
size() of 664
sqrt() 667
subscripting 663
sum() of 664
tan() 667
tanh() 667
| 667
|= 664
|| 667
~ 664
valid iterator 550
value
call by 146
default 239
key and 480
mapped type 55
of character 580
of cin 276
of notation 261
return 283
return by 148
return, function 148
semantics 294
value_comp() 485
value_compare 485
value_type 443, 480, 552
basic_string 583
variable
constructor for global 252
constructor for local 245
global 200, 228
temporary 244, 254
variably-sized object 243
Vec, range checking 53
vector
Fortran 668
arithmetic 65, 662
bit 124
exponentiation 667
mathematical functions 667
operations 664, 667
<vector> 431
Vector 435
example 341, 780
vector 52, 442, 469
< 457
= 447
== 457
[] of 445
and array, valarray and 662
assign() 447
constructor 447
erase() from 452
input into 451
insert() into 452
member type 442
of bool 458
of vector 836
vector of 836
vector<bool> 458
bitset and 492
Vehicle example 734
vertical tab \v 830
viewgraph engineering 704
virtual
function 15
function, renaming 778
virtual 34
<< 612
base class 396
base class, overriding from 401
base, constructor and 397
constructor 323, 424
derive without 780
destructor 319
function 310, 390, 706
function argument types 310
function, definition of 310
function, example of 646
function, implementation of 36
function, operator :: and 312
function, pure 313
output function 612
return type of 424
vision 698
void 76
expression, return of 148
pointer to 100
void*
assignment, difference from C 818
specialization and 341
void*(), operator 616
volatile 808
W
waterfall model 697
wcerr 609
<wchar.h> 432
wchar_t 72–73
wcin 614
wcout and 624
wclog 609
wcout 609
and wcin 624
wfilebuf 649
wfstream 638
while statement 136
whitespace 614–615
isspace() 114
wide
character I/O 608
character classification 601
wide-character literal L’ 73
widen() 645
width() 629
of input 616
wifstream 638
wild pointer, checking for 722
Window example 398
wiostream 637
wistream 614
wistringstream 641
wofstream 638
word 76
working system 709
wostream 608
wostringstream 641
wrapper 781
write through iterator 551
write(), ostream 609
ws 634
wstreambuf 649
wstring 582
wstringbuf 649
wstringstream 641
<wctype.h> 432
X
X3J16 11
xalloc() 650
xgetn() 647
xor keyword 829
xor_eq keyword 829
xputn() 647
Y
Year 285
Z
zero null, 0 88