The limits on predictability and refinement of English structural annotation are examined by comparing independent annotations of a common sample of written texts, produced by experienced analysts using the same detailed published guidelines. Three conclusions emerge. First, while it is not easy to define watertight boundaries between the categories of a comprehensive structural annotation scheme, limits on inter-annotator agreement are in practice set more by the difficulty of conforming to a well-defined scheme than by the difficulty of making a scheme well defined. Second, although usage is often structurally ambiguous, the alternative analyses are commonly logical distinctions without a practical difference, which raises questions about the role of grammar in human linguistic behaviour. Finally, one specific area of annotation, classifying the functions of clause-constituents, is strikingly more problematic than any other area examined, even though this area seems a particularly significant one for human language use. These findings should be of interest both to computational linguists and to students of language as an aspect of human cognition.