ModeShape Guide-V5-20150918 - 1708

ModeShape 3
ModeShape Guide
Exported from JBoss Community Documentation Editor at 2015-09-18 17:08:06 EDT

Copyright 2015 JBoss Community contributors.
JBoss Community Documentation
Page 1 of 424
Table of Contents
1 Introduction to JCR
1.1 Why use a repository
1.1.1 Lots of choices for storing data
1.1.2 What are repositories good at?
1.1.3 What are repositories bad at?
1.1.4 What kinds of applications use repositories?
1.2 Concepts
1.2.1 Repository
1.2.2 Workspace and Sessions
1.2.3 Node, children, names, and paths
1.2.4 Node types and mixins
1.2.5 Events
1.2.6 Queries
1.3 Features
1.3.1 Discovering support
1.4 Repository and Session
1.4.1 Getting a Repository
1.4.2 Getting a Session
1.4.3 Making and persisting changes
1.4.4 Logging out
1.5 Reading content
1.6 Writing content
1.7 Workspace operations
1.8 Defining custom node types
1.8.1 Compact Node Definition (CND)
1.8.2 CND example
1.8.3 Built-in node types
1.8.4 Registering custom node types
1.9 Query languages
1.9.1 Query grammars
1.9.2 Query API
1.9.3 Creating a query
1.9.4 Executing the query
1.9.5 Use the query results
1.10 Using JTA Transactions
1.10.1 EJBs with container-managed transactions
1.10.2 EJBs with bean-managed transactions
1.10.3 Explicit JTA transactions
1.11 Patterns and Best Practices
1.11.1 Use unstructured primary types
1.11.2 Storing files and folders
1.11.3 Mixin characteristics with mixins
1.11.4 Prefer hierarchies
1.11.5 Sessions in web applications
1.11.6 Verify supported features
1.11.7 Import and export
1.11.8 Sessions and Listeners
2 Introduction to ModeShape
2.1 Architecture
2.1.1 ModeShape engine
2.1.2 Repository configuration
2.1.3 Clustering
2.1.4 Modules
2.2 Authentication and authorization
2.2.1 Authentication and authorization
2.2.2 Access controls
2.3 Backup and restore
2.3.1 Getting started
2.3.2 Introducing the RepositoryManager
2.3.3 Creating a backup
2.3.4 Restoring a repository
2.3.5 Migrating from ModeShape 2.8 to 3.0 or 3.1
2.3.6 What's in the backup?
2.4 Binary values
2.4.1 How it works
2.4.2 Extended Binary interface
2.4.3 Importing and Exporting
2.4.4 Implementation design
2.4.5 Configuring Binary Stores
2.4.6 Files and Folders
2.5 Clustering
2.5.1 Local
2.5.2 Replicated
2.5.3 Invalidation
2.5.4 Distributed
2.5.5 Remote
2.5.6 How to
2.6 Configuration
2.6.1 Infinispan Configuration
2.7 Content grid
2.8 Federation
2.8.1 Concepts and terminology
2.8.2 How it works
2.8.3 Current connectors
2.9 Initial Content
2.9.1 XML Format
2.9.2 Configuring Initial Content
2.10 Large numbers of child nodes
2.10.1 Accessing by path
2.10.2 Iterating
2.10.3 Accessing by identifier
2.10.4 Additional performance considerations
2.11 MIME types
2.12 Monitoring
2.12.1 Public API
2.12.2 Examples
2.13 Public API
2.14 Query and search
2.14.1 Choosing a query language
2.14.2 Creating queries
2.14.3 Executing queries
2.14.4 JCR-SQL and JCR-SQL2 extensions
2.14.5 Query Object Model extensions
2.14.6 Search and text extraction
2.15 Registering custom node types
2.15.1 Registering using the standard API
2.15.2 Registering using CND files
2.15.3 Jackrabbit XML format
2.16 Sequencing
2.16.1 Sequencers
2.16.2 Built-in sequencers
2.16.3 Custom sequencers
2.16.4 Configuring a automatic sequencer
2.16.5 Waiting for automatic sequencing
3 Using ModeShape
3.1 Deploying to web and app servers
3.1.1 RepositoryFactory and configuration files
3.1.2 RepositoryFactory and JNDI
3.1.3 Lookup Repository in JNDI
3.1.4 Lookup Repositories in JNDI
3.2 ModeShape in Java applications
3.2.1 The ModeShape Engine
3.2.2 Use RepositoryFactory and the JCR API
3.2.3 Configuring repositories
3.3 ModeShape and JBoss AS7 and EAP
3.3.1 JBoss AS7 or EAP?
3.3.2 Getting started
3.3.3 Installing ModeShape into EAP
3.3.4 Configuring ModeShape in EAP
3.3.5 Using Repositories with JCR API in EAP
3.3.6 Using Repositories with REST in EAP
3.3.7 Using Repositories with WebDAV in EAP
3.3.8 Using Repositories with JDBC in EAP
3.3.9 Administering Repositories in JBoss EAP
3.4 ModeShape in web applications
3.4.1 ModeShape's JCA Adapter
3.5 ModeShape's REST Service
3.5.1 REST Service 2.x
3.5.2 REST Service 3.x
4 Query language grammars
4.1 JCR-SQL2
4.1.1 Extensions to JCR-SQL2
4.1.2 Extended JCR-SQL2 Grammar
4.1.3 Full-text search grammar
4.1.4 Example JCR-SQL2 queries
4.2 JCR-SQL
4.2.1 Grammar
4.3 XPath
4.3.1 Column Specifiers
4.3.2 Type Constraints
4.3.3 Property Constraints
4.3.4 Path Constraints
4.3.5 Ordering Specifiers
4.3.6 Miscellaneous
4.4 JCR-JQOM
4.5 Full text search
4.5.1 Grammar
5 Built-in node types
5.1 Standard node types
5.2 ModeShape built-in node types
6 Built-in sequencers
6.1 Compact Node Type (CND) files
6.2 DDL files
6.3 Image files
6.4 Java source and class files
6.4.1 Node Structure
6.4.2 Java Source File Sequencer
6.4.3 Java Class File Sequencer
6.5 Microsoft Office files
6.6 MP3 files
6.7 Teiid Relational Models
6.8 Teiid Virtual Database (VDB) files
6.9 Text Files
6.10 Web Service Definition Language (WSDL) files
6.11 XML files
6.12 XML Schema Document (XSD) files
6.13 ZIP files
7 Built-in connectors
7.1 File system connector
7.2 Git connector
7.3 CMIS connector
7.4 JDBC Metadata Connector
8 Built-in text extractors
8.1 Tika text extractor
9 Extending ModeShape
9.1 Custom authentication providers
9.1.1 The AuthenticationProvider interface
9.1.2 The AuthorizationProvider interface
9.1.3 The AdvancedAuthorizationProvider interface
9.1.4 Putting it all together
9.1.5 Configure a repository to use your provider(s)
9.2 Custom sequencers
9.2.1 The Sequencer framework
9.2.2 Creating a new sequencer
9.3 Custom text extractors
9.3.1 The text extraction framework
9.3.2 Creating a new sequencer
9.4 Custom connectors
9.4.1 The Connector framework
9.4.2 Creating a custom connector
10 Tools for Eclipse
10.1 Installation
10.2 Compact Node Definition (CND) editor
10.2.1 Header Section
10.2.2 Namespaces Section
10.2.3 Node Types Section
10.2.4 CND Preference Page
10.3 ModeShape publishing tool
10.3.1 Configuration
10.3.2 Publishing
10.4 Want to help?
This guide is intended for users of ModeShape. Browse the child pages for more information on each topic.
1 Introduction to JCR
The "Content Repository for Java Technology API, Version 2.0" specification (also known as JSR-283)
defines a set of concepts (or abstract model) and a standard Java programming interface (or API) for
working with "content" stored in any compliant implementation. Applications use the API to navigate, change,
create, query, version, import, export, lock and observe content.
This document provides an introduction to the JCR API and its core concepts, and can be used by
application developers who are learning the standard API and who are developing applications that use the
JCR API.
For the most part, this document describes how to use the JCR API without using any
implementation-specific behaviors. This is a testament to the completeness of the JCR API. However, this
document does try to call out where the behaviors of implementations are allowed to vary.
1.1 Why use a repository

Before we can explain why you might want to use a repository, we first need to state the obvious:
applications have different data requirements, and no one data storage solution will fit all applications. So
while most of the applications you've worked on probably used relational databases, the truth is that a
relational database is often not the best fit. And once you understand the sweet spot of repositories, you can
make an informed decision about whether repositories make sense for your application.
Lots of choices for storing data
What are repositories good at?
What are repositories bad at?
What kinds of applications use repositories?
1.1.1 Lots of choices for storing data

Application developers have a lot of options for persisting their application's data. Choices over the years
have included a variety of databases, including relational, network, graph, object, document, XML, and
hybrids. While relational databases have certainly been the norm for many years, more recently there has
been a lot of interest in alternative databases that have different characteristics and features than traditional
relational databases. Much of this interest has its origins in several trends, including:
installing databases on larger numbers of smaller machines (scale out)
storing very large amounts of data
storing simple data structures, such as simple JSON documents
using the (sometimes multiple) natural hierarchies in data
looking up data by keys rather than using queries
searching for data based upon relevance rather than criteria
map-reduce patterns for distributing data operations in parallel across all the data
evolving schemas and/or data structures
services that access/store only one kind of data
caching data in-memory for performance
giving up consistency guarantees for increased availability
Relational databases are less suited for applications with these needs. For example, large relational
databases are usually installed on large, very capable hardware, and are difficult to configure and use with
large clusters. Also, in many situations described above, relational databases are simply overkill because
few of the features are used, or because they lack features that are required.
And while early adopters of "NoSQL" databases advocated avoiding relational databases, it is now much
more widely understood that each type of database system has advantages, features, and sweet spots that
have to be understood and matched to the application and operational requirements. Even relational
databases have their sweet spot, including:
finding data based on ad hoc queries with various (even user-defined) criteria
aggregate operations (e.g., sum, average, min/max, etc.)
transactional guarantees
simplified updates of (normalized) data
constrained data structure (schema) independent of application
well-understood operational behaviors
used by many installed applications
In short, don't pick a database system just because you used it on your last application or because it's new
and you want to use it on your new application. Figure out what your application needs, and pick the data
storage solution that best satisfies those needs.
1.1.2 What are repositories good at?

No single data storage technology is good at all the data usage patterns described above. So let's examine
what kind of data and access patterns repositories are very good at handling.
Hierarchical data - Some kinds of data are naturally hierarchical, and repositories enable you to use
this hierarchy as an asset. A few examples include: ZIP codes partition the US geographically; the
Dewey Decimal classification scheme breaks knowledge into 10 main classes, which are then
subdivided into 10 divisions, which are then divided into 10 sections; merchandise is often classified
using various categories or taxonomies; temporal and historical data is often naturally stored in a
hierarchy based on time; physical products and assemblies are composed of parts and other
assemblies, forming natural hierarchies; metadata is often structured assemblies of components that
describe other components. Other kinds of data are not inherently hierarchical, but have
characteristics that enable you to easily treat them as such. For example, cryptographic hash
functions like SHA-1 are well distributed even in the first few bits of the hash, so a hierarchy can
easily be created by segmenting the first few pairs of characters of the hexadecimal form of the
hashes. Repositories can very naturally manage this kind of data, whereas other data storage
technologies require you to jump through complex and non-trivial hoops. There are entire books that
describe the multiple ways (none of which are ideal in all cases) to store trees and hierarchies in
relational databases.
File storage - JCR repositories are very good at storing files right alongside your other data. In fact,
you can store quite a bit of metadata about the files you're storing, and some JCR repositories (like
ModeShape) can automatically determine the SHA-1 hash and the MIME type for your files.
Navigation-based access - Any data stored in a hierarchy can be identified by the path in that
hierarchy. Often applications that deal with hierarchical data need to work with subsets of the
hierarchy, and thus navigate to a particular location and deal with the subgraph of data below that
node. This form of navigation-based access is a natural advantage of repositories.
Programmatic API - Above all else, JCR is a programming interface for Java (and other JVM
languages) to interact with content repositories independently from how those repositories are
implemented. Thus, the API defines all the useful components necessary to create, read, update,
delete, observe, version, relate, and query the persisted content. And while you can write your
application to use this API, under the covers the repository may support various data storage options,
clustering and scaling options, federation capabilities, and other features designed to add value to
your application's data management needs.
Flexible data structure - Repositories do not require your data structure to be designed a priori.
Instead, you can start storing your data, yet allow additional kinds of information to be stored as the
need arises over time. Plus, not all similar data need be treated identically. Consider how an
application might use user-defined tags. Tags might apply to many different kinds of data, yet with
JCR's mixins you can enable a particular piece of data to become a holder of tags only when users
want to apply tags to it. JCR repositories give tremendous flexibility to your data needs, allowing your
data to vary while evolving over time as needs change.
Rigid data structure - Flexible data structures are often very useful, but sometimes your applications
need more control. JCR repositories let you choose how constrained your data should be. You can
ensure that only specific properties are used, with values that fit specific constraints. If you want, you
can start out with these restrictions or you can add them over time. With JCR, you're in control.
Searching for data - JCR supports multiple query languages, ranging from hierarchically oriented
languages like XPath, to highly structured set-oriented languages like JCR-SQL2, to very
unstructured languages that enable full-text search.
Observing data - Receive notifications when content is changed. Use filters to simplify identifying the
particular cases you're interested in.
Transactions - Ensure that changes made by a session are either all persisted, or that none of the
changes are persisted. Plus, integrate with the Java Transaction API (JTA) and J2EE to make
changes to your repositories within container-managed transactions in your applications.
Versioning - JCR defines a built-in mechanism for versioning the changes made to single nodes or
entire subgraphs. An application simply marks a node as being versionable, and from that point
forward the repository automatically tracks the changes by recording the snapshots of the versionable
subgraph. The version history of the versionable nodes is easily accessible through the standard API,
allowing the application to access this history and roll back the versionable content to a prior state.
Locking - The JCR API defines a mechanism for temporarily locking a single node or an entire
subgraph to prevent others from modifying or removing that content. This of course works across all
of the clients using the repository, making it easy for applications to take advantage of this built-in
capability.
References - JCR supports references that allow one node to refer to one (or more) other nodes, allowing
easy navigation from the referring node to the referenced node.
Referential integrity - JCR defines two styles of references: strong references require that
referenced nodes not be removed, whereas weak references do not prevent removal of the
referenced nodes.
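The hash-based hierarchy mentioned under "Hierarchical data" above can be sketched in plain Java. This is only an illustration and is not taken from ModeShape or the JCR API: the class name HashPaths, its two methods, and the choice of three intermediate levels are all assumptions made for this example. It computes the hexadecimal SHA-1 of some content and uses the first few pairs of characters as intermediate path segments.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of JCR or ModeShape) that derives a
// hierarchical path from a SHA-1 hash.
final class HashPaths {

    /** Returns the lowercase hexadecimal SHA-1 digest of the given bytes. */
    static String sha1Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Builds a path of the form /xx/yy/zz/<full hash>, using the first
     * 'levels' pairs of hex characters as intermediate segments.
     */
    static String toPath(String hexHash, int levels) {
        if (levels < 0 || levels * 2 > hexHash.length()) {
            throw new IllegalArgumentException("levels out of range: " + levels);
        }
        StringBuilder path = new StringBuilder();
        for (int i = 0; i < levels; i++) {
            path.append('/').append(hexHash, i * 2, i * 2 + 2);
        }
        return path.append('/').append(hexHash).toString();
    }

    public static void main(String[] args) {
        String hash = sha1Hex("abc".getBytes(StandardCharsets.UTF_8));
        // Three levels is an arbitrary choice for this sketch.
        System.out.println(toPath(hash, 3));
    }
}
```

Because the leading bits of the hash are well distributed, each intermediate level should end up with a roughly even share of the children. How many levels to use depends on how much content you expect to store.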
1.1.3 What are repositories bad at?

In the previous section we talked about what repositories are good at, and in this section we'll cover some of
the kinds of data and access patterns that JCR repositories are less capable of doing well. If your application
uses data in these ways, you might consider using something other than a repository.
Flat, unorganized data - Content repositories work very well with hierarchically structured data. And
while we think most data does have a hierarchical nature, not all data does. Content repositories
are far less capable of handling very large amounts of flat, non-hierarchically organized data.
Bulk declarative update - Relational databases support inserting, updating and deleting data via
declarative statements. JCR provides no such facility, and all inserts and updates must be done via
import or direct manipulation of nodes via stateful sessions. However, removing a node does remove
all descendants.
Massively large file storage - As mentioned above, JCR repositories are very good at storing files,
and can even store massively large files (e.g., many GB or more). However, you may have very
critical performance requirements or highly-optimized infrastructure for delivering such gigantic files to
clients, and perhaps the Java streaming mechanism used by the JCR API may not be ideal. In that
case, you may want to consider storing such massive files outside of the repository while storing
inside the repository the metadata and location for that file. So in short, JCR repositories may work for
large files, but be sure to test and evaluate a content repository before committing to it.
Access by non-JVM languages - The JCR API is a standard Java programming interface, so by
definition it can only be used by languages running on the JVM. However, many content repositories
(including ModeShape) do offer non-Java APIs, such as REST, WebDAV and/or CMIS.
Complex merging of versions - Although JCR can version content, the structure of the version
history is relatively limited, and the built-in functionality for merging is not terribly sophisticated. You
may find it more effective to create your own versioning mechanism on top of JCR. Or, if you have
very complex versioning requirements and are dealing with mostly files, then perhaps other file
versioning systems (e.g., Git) may be a better fit.
1.1.4 What kinds of applications use repositories?

Below are some categories of applications for which content repositories are a good fit.
Content management systems - Content management systems (CMS) and web content
management systems (WCM) allow users to easily create and change the information used on web
sites and other information systems. So storing information in a hierarchical manner that mirrors the
structured web site and the XHTML files allows such systems to easily manage and serve the
information.
Document repositories - Storing and versioning documents and associated metadata is something
repositories do very well, and so they're often used within document management systems.
Artifact repositories - Systems that store artifacts (files that are the output of some process and
used by other systems) often use repositories. For example, several Maven repository systems use
JCR for storage of the artifacts and metadata, and rely upon not only the direct navigational access
but also query and search.
Governance systems - Repositories are a great fit for applications or systems that govern the
lifecycle of artifacts, services, and files. Such applications typically need to store a wide variety of
metadata with each governed artifact, and often this metadata changes as the lifecycle process is
changed.
Configuration management - Configuration information usually consists of structured files (e.g.,
XML, YAML, JSON, etc.). A JCR repository can provide a more formal way to manage and version
multiple configurations. Plus, JCR's event system allows easy notification of configuration changes,
while versioning can help guarantee the ability to revert to a previous (valid) configuration.
Knowledge systems - Repositories provide an excellent way to store the varied and changing
information managed by knowledge management systems. Such systems also require search,
references, referential integrity, and versioning capabilities.
1.2 Concepts
It's important to understand the central concepts used within the JCR API. This section presents an overview
of these concepts, but for a more thorough description see the JCR 2.0 specification.
Repository
Workspace and Sessions
Node, children, names, and paths
Properties and values
Node types and mixins
Events
Queries
1.2.1 Repository
A repository is a single, self-contained persistent store of information plus the software that is used to access
and update that information. Each repository contains an area for system-wide information and version
storage, and separate areas for its workspaces.
A repository is represented in software with the javax.jcr.Repository interface, which defines:
methods to authenticate and obtain a session (or connection) to the repository
methods to obtain the set of features supported by the repository
constants used to identify the various repository features
There are multiple ways that your application can obtain a Repository instance. The easiest is to simply
look it up in JNDI, although this obviously only works if your application is running in an environment that has
JNDI support. But another technique is to use Java's ServiceLoader to look up the
javax.jcr.RepositoryFactory implementations, and use them to ask for a repository given a set of
parameters; the first factory to understand the parameters will return a Repository instance:
Map<String,String> parameters = ...

Repository repository = null;
for (RepositoryFactory factory : ServiceLoader.load(RepositoryFactory.class)) {
    repository = factory.getRepository(parameters);
    if (repository != null) break;
}
The parameters are unique to the JCR implementation, but you can keep them outside of your codebase by
simply reading them in from a properties file. For more details and other options, see the Repository and
Session page.
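As a minimal sketch of reading those parameters from a properties file, the helper below uses only java.util.Properties and copies the entries into a Map<String,String> of the kind passed to getRepository(...) above. The class name RepositoryParameters and the key used in main are hypothetical; the real keys depend on the JCR implementation.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class RepositoryParameters {

    /** Reads key=value pairs in java.util.Properties format into a String map. */
    public static Map<String, String> fromProperties(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            // Rethrow unchecked so callers need no checked-exception handling
            throw new UncheckedIOException(e);
        }
        Map<String, String> parameters = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            parameters.put(name, props.getProperty(name));
        }
        return parameters;
    }

    public static void main(String[] args) {
        // Hypothetical key and value; real keys are defined by the JCR implementation.
        String text = "example.repository.url=file:/path/to/config.json\n";
        Map<String, String> parameters = fromProperties(new StringReader(text));
        System.out.println(parameters);
    }
}
```

In an application the Reader would typically wrap a file or classpath resource, so switching JCR implementations only requires changing that resource.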
1.2.2 Workspace and Sessions

Each repository is divided into separate named workspaces, and it is within these workspaces that all
content is stored as a tree structure of nodes. The top of that tree structure is the root node (at path "/"), and
all nodes in the tree are accessible via navigation, lookup by unique identifier, or via query result.
Accessing and updating the content within a workspace requires establishing a session obtained by
authenticating with the repository. Generally speaking, sessions are intended to be short-lived, meaning that
clients will create a session, use the session to read or update content, save the session's transient
changes, and then close the session. In any case, the only way to access or update any repository content is
through an authenticated session.
When using a session to read content, all of the nodes reflect the persisted state of the workspace. So as
the persisted state changes, all sessions immediately see the updated content. This means that as a client
uses a session, the content accessible by that session may be changing if other sessions are making
changes to the content.
Sessions are also used to update content. Each session maintains the transient set of changes overlaid on
top of the persisted state, and these transient changes are persisted only when the session is saved. (When
using transactions, the session must still be saved, but the changes are persisted only when the transaction
is committed.)
The JCR API defines the javax.jcr.Session interface to represent a session, which is created to access
and change the content of a single persistent workspace. The javax.jcr.Workspace interface
represents a persistent workspace, and contains methods that modify or copy content (including from other
workspaces). Each Session object has its own distinct Workspace instance, to ensure that the
session's authorizations are respected. (This is why Workspace objects are not shared.)
Let's look at a very simple example that shows the basics for obtaining a Session and Workspace:
Repository repository = ...

// Create a session by logging in. There are multiple forms of 'login(...)',
// but we'll use the one that just specifies the workspace name ...
Session session = repository.login(workspaceName);
// Obtain the session's workspace ...
Workspace workspace = session.getWorkspace();
// Note how the workspace is owned by the session ...
assert session == workspace.getSession();
// And we can always get back to the repository, too ...
assert repository == session.getRepository();
// Work with your content ...
// Eventually log out ...
session.logout();
1.2.3 Node, children, names, and paths

The content of each workspace is organized as a tree of nodes: at the top of the tree is a single root node,
and every node can contain multiple child nodes. Every node has a name and a unique identifier, and can
also be identified by a path containing the names of all ancestors, from the root down to the node itself. Names
consist of a namespace and local part, and there is a namespace registry to centralize short prefixes
for each namespace.
Generally, all of the children of a single node are uniquely named, and this is considered a best-practice.
However, this is not required, and it is possible (and desirable in some use cases) for a single parent node to
have multiple child nodes with the same name. In these cases, the same-name siblings (SNS) are
distinguished by including a 1-based SNS index in the path. The index simply identifies the order of the
SNSs, so inserting, reordering, or removing children may alter the SNS index of a particular node, effectively
changing the node's path.
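To make the indexing rule concrete, here is a small sketch that splits a single path segment such as "report[2]" into its name and SNS index, treating a segment without an explicit index as index 1. The class and method names (and the node name "report") are illustrative only and are not part of the JCR API; the sketch ignores namespaces and name validation.

```java
public class SnsSegment {

    /** Returns the name part of a path segment, e.g. "report" for "report[2]". */
    public static String nameOf(String segment) {
        int open = segment.lastIndexOf('[');
        return (open > 0 && segment.endsWith("]")) ? segment.substring(0, open) : segment;
    }

    /** Returns the 1-based SNS index of a path segment; a missing index means 1. */
    public static int indexOf(String segment) {
        int open = segment.lastIndexOf('[');
        if (open > 0 && segment.endsWith("]")) {
            return Integer.parseInt(segment.substring(open + 1, segment.length() - 1));
        }
        return 1;
    }

    public static void main(String[] args) {
        for (String segment : new String[] {"report", "report[1]", "report[2]"}) {
            System.out.println(segment + " -> name=" + nameOf(segment) + ", index=" + indexOf(segment));
        }
    }
}
```

Because the index reflects only the current order of the siblings, code that stores a path containing an index should expect it to go stale; the node identifier is the more stable handle.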
Let's consider a sample repository that stores information about various assets of a fictional company. This
is of course just one possible repository design to showcase the capabilities and features of JCR
repositories:
We've kept it simple: none of these nodes uses SNS indexes, since there are no two children with the same
names. But even with this simple example you can see how a hierarchical structure naturally organizes the
data in such a way that makes navigating to relevant data very straightforward. For example, our asset
repository breaks down the assets into "vehicles", "equipment" and "facilities", and each of those is
further segmented into smaller divisions. The "vehicles" assets are divided into "cargo", "passenger",
and "transit" vehicles, and under each of those are all of the vehicle nodes named with their Vehicle
Identification Number (VIN).
(We could have alternatively designed the repository without the "cargo", "passenger", and "transit"
layer, and tracked that information as properties on the nodes. But since any given vehicle is only one of
these types, including the layer has advantages.)
The "equipment" area stores information about computer assets, office equipment, etc., while the
"facilities" area stores information about the buildings used or owned by our company.
So with this cursory overview of our sample repository, let's look at some code that shows just a few of the
ways to navigate our repository structure:
Repository repository = ...

Session session = repository.login("assets");
Node root = session.getRootNode();

// Find a node by absolute path ...
Node nyc = session.getNode("/facilities/NYC");

// Or find nodes by relative path ...
Node cargo = root.getNode("vehicles/cargo");
Node veh1 = cargo.getNode("JF2SH636X9G700001");
Node transit = veh1.getNode("../../transit");

// Iterate over all children ...
NodeIterator iter = cargo.getNodes(); // implements Iterator
while ( iter.hasNext() ) {
    Node child = iter.nextNode();
}

// Or iterate over some children, using name patterns ...
iter = cargo.getNodes("JF*");
while ( iter.hasNext() ) {
    Node child = iter.nextNode();
}

// If we know a node's identifier ...
String nodeId = veh1.getIdentifier();

// We can find it by simply looking it up ...
Node byId = session.getNodeByIdentifier(nodeId);

// And we can always get back to the session we used to get a node ...
Session sessionForNode = byId.getSession();
assert sessionForNode == session;
There are a couple of interesting things in this example:
1. You can only obtain the root node from the session (via session.getRootNode()). The Node object for the root should never
change during the lifetime of the session, even when the properties and children of the Node object
change.
2. You can get a node by absolute path directly from the Session (via session.getNode(...)).
3. You can find nodes relative to other nodes (including the root node) using relative paths (the getNode(...) calls on root, cargo, and veh1).
Relative paths may contain ".." and "." segments to signify the parent or self, respectively. Thus,
JCR paths are very similar to paths on a file system.
4. You can iterate over all the children of a node (the first getNodes() loop) that this session has authorization to see.
Any child for which you don't have authorization simply doesn't appear to exist. Note that JSR-283
was designed for JRE 1.4 (pre-generics), so it defines a NodeIterator interface that extends
java.util.Iterator and adds type-specific methods and size information. (JSR-333 is updating this
so that NodeIterator extends Iterator<Node>, though it will retain the ability to return the size
for backward compatibility.)
5. You can iterate over some of the children that have names matching one or several glob patterns (the
getNodes("JF*") call). In this example, the names of the nodes are VINs, and we can use this knowledge and
the structure of VINs to iterate over all of the vehicles under that node that were made in Japan by Fuji
Heavy Industries (for Subaru). We could just as easily have issued a query instead, and would likely want
to if the criteria were any more complicated.
6. Every node in the workspace has a unique but opaque identifier (obtained with getIdentifier()) assigned by the
implementation, and you can easily look up any node in the workspace using this ID (with session.getNodeByIdentifier(...)). (Note
that two different workspaces in the same repository can have nodes with the same ID. These are
referred to as corresponding nodes, and this characteristic is an important factor in deciding whether
to design your repository to have one or several workspaces.)
7. Every node is valid only in the context of its session, and you can always get a Node's Session
object (with getSession()). This means that your code can pass around a Node object (assuming the Session
remains open while doing so), but doesn't have to also pass around the Session.
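The name patterns mentioned in item 5 can be approximated outside the repository, which is handy for unit-testing a pattern before using it. The sketch below assumes the usual JCR rules (a "*" wildcard, "|" separating alternatives, and whitespace around each alternative ignored); the class NamePatterns is hypothetical and is not how any JCR implementation is known to do the matching.

```java
import java.util.regex.Pattern;

public class NamePatterns {

    /** Returns true if the name matches any of the '|'-separated globs in the pattern. */
    public static boolean matches(String namePattern, String name) {
        for (String glob : namePattern.split("\\|")) {
            if (toRegex(glob.trim()).matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    /** Converts one glob to a regex: '*' matches any run of characters, all else is literal. */
    private static Pattern toRegex(String glob) {
        String[] literals = glob.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.compile(regex.toString());
    }

    public static void main(String[] args) {
        System.out.println(matches("JF*", "JF2SH636X9G700001"));
        // The second VIN is a made-up example, not a node from the sample repository
        System.out.println(matches("JF* | 1HG*", "1HGCM82633A004352"));
    }
}
```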
One more thing about node identifiers. While all node identifiers are unique within a workspace, it is possible
for multiple workspaces in the same repository to each have a node with a given identifier. Such nodes are
called corresponding nodes, and this can be a primary factor in deciding whether to design your repository
with a single workspace or several. For example, it's possible to clone a subgraph of nodes in one
workspace into another workspace, and all these nodes will retain the same identifiers.
Properties and values

So far we've seen how the content in a repository workspace can be organized into a tree structure of
nodes, but we haven't yet seen how to store any data (other than node names and parent-child
relationships). In JCR, all data is stored on nodes in properties. Each property has a name and is either
single-valued (meaning it always has 1 value) or multi-valued (meaning it has 0 or more values). Each value
is immutable and can be any of the following types:
Property Type    Java type
STRING           java.lang.String
NAME             java.lang.String
PATH             java.lang.String
BOOLEAN          java.lang.Boolean
LONG             java.lang.Long
DOUBLE           java.lang.Double
DATE             java.util.Calendar
BINARY           javax.jcr.Binary
REFERENCE        javax.jcr.Node
WEAKREFERENCE    javax.jcr.Node
DECIMAL          java.math.BigDecimal
URI              java.lang.String
One really nice thing about values is that they have methods that will convert the value to a desired type.
This means your applications don't have to keep track of the actual type of a value, but instead can simply
ask for the value in the type your application wants. JCR defines conversions between most types (e.g.,
every value can be converted to a STRING or a BINARY representation), but some conversions don't make
sense (e.g., converting a path to a date) and result in an exception.
Before we look at an example, let's talk about node types and mixin types.
1.2.4 Node types and mixins

The repository enforces the structure of the content by defining node types, which specify the patterns of
acceptable properties and children. Some node types can allow any combination of properties and/or
children, while other node types can be extremely restrictive on the names and values of properties and
names and types of children. Each repository comes with a large set of predefined standard node types, but
applications can define and start using custom node types at any time. The Workspace interface exposes a
NodeTypeManager that can be used to discover the existing node types and (with proper privileges) make
changes to the set of registered node types. (Note that it may not be possible to change or remove node
types if they are in use by the content.)
Every node declares one primary node type and zero or more mixin node types. Primary node types are
typically used to declare the core characteristics of a node, while mixin node types are used to add (i.e., "mix
in") additional characteristics. A primary type must be assigned when the node is created, but may be
changed at a later time. Mixin types can be added to and removed from a node at any time.
Node types can use inheritance, but it is much more common to define a few concrete node types (that
might use inheritance) and many more mixins that do not use inheritance but that can be mixed and matched
on nodes as needed.
The next table describes a few of the more commonly-used standard node types that are defined by the
JSR-283 specification and available in all implementations:
Name               Kind          Description
nt:base            primary type  The implicit abstract base type for all node types.
nt:unstructured    primary type  A concrete node type that allows any single- or multi-valued properties
                                 and any children with or without same-name-siblings. This node type is
                                 frequently used as the primary node type for nodes, coupled with mixins
                                 to more accurately describe the sets of properties that are used on that
                                 node.
nt:file            primary type  A concrete node type that represents a file uploaded to the repository,
                                 with properties describing the file's metadata and a child node used to
                                 store the content of the file.
nt:folder          primary type  A concrete node type that is often used as a container for nt:file and
                                 nt:folder nodes.
nt:query           primary type  A concrete node type used to store a JCR query expression.
nt:address         primary type  A concrete node type that represents the location of a JCR node or
                                 property not just within the current repository but within the set of all
                                 addressable repositories. It defines properties for the URL to the
                                 repository, the workspace, the path within the workspace, and the
                                 identifier of the node within the workspace.
mix:referenceable  mixin type    Used on nodes that can be referenced directly by REFERENCE and
                                 WEAKREFERENCE properties.
mix:created        mixin type    Used on a node when the repository should add properties that automatically
                                 capture when the node was created and by whom.
mix:lastModified   mixin type    Used on a node when the repository should add properties that
                                 automatically capture when the node was last modified and by whom.
mix:etag           mixin type    Added to a node when the repository should create and automatically
                                 maintain a "jcr:etag" property containing a value that is semantically
                                 comparable to the HTTP/1.1 strong entity tag, and that changes only
                                 when a BINARY property is added, removed or changed on the node.
                                 The "jcr:etag" value can then be used by applications to quickly
                                 determine if the node has changed relative to a previously-known state.
mix:versionable    mixin type    Added to a node to make it versionable using the JCR versioning API.
mix:lockable       mixin type    Added to a node to make it lockable using the JCR locking API.
mix:shareable      mixin type    Added to a node to make it able to be shared (i.e., linked) into multiple
                                 locations within the same workspace or into different workspaces.
mix:title          mixin type    Added to a node when it should have a "jcr:title" property.
mix:mimeType       mixin type    Added to a node to add properties useful for tracking the MIME type
                                 and/or encoding.
Now that we have a cursory understanding of nodes, properties, node types, and mixins, let's continue
looking at our asset repository example. The higher-level nodes aren't terribly interesting, as they are largely
just containers with few properties on their own. So let's look at how vehicle information might be stored.
Again, this is just an example of one possible repository design to showcase the capabilities and features of
repositories.
This rendering shows several of the nodes in the "/vehicles/passenger" branch and the properties on
them. As we just learned, every node has a "jcr:primaryType" property that contains the name of that
node's primary type. For example, the primary type of the "passenger" node is "nt:unstructured", which
means that it can contain any property and any child nodes (in other words, the node is not constrained by a
schema). However, the "passenger" node also has three mixins: "dot:defined", a (notional) custom
node type (for our example) that represents a particular defined DOT class; "acme:category", a (notional)
custom node type that is a marker (with no defined properties) signifying a category within our asset
repository; and the standard "mix:lastModified" node type that enables automatic tracking of when the
node was last changed and by whom.
The "JF2SH636X9G700001" node represents a vehicle with a particular VIN, and has a
"jcr:primaryType" of "nt:unstructured" and two mixins: the "veh:vehicle" mixin is a (notional)
custom node type that signifies a vehicle with several vehicle-related properties, and the "asset:acquired"
mixin is a (notional) custom node type that signifies several properties related to when and how the asset
was acquired. The remaining properties of this node contain the information related to these
characteristics.
The "History" node is intended to be a container for various nodes that describe important events in the
ownership history of a vehicle, including the acquisition information, maintenance activities, accidents/issues,
etc.
The "Documents" node is a container for documents, images, photos, and other files that are relevant to the
vehicle. This particular vehicle has two such documents: a JPEG photo with EXIF metadata, and a PDF
document of the vehicle's title.
1.2.5 Events
An application can be notified of each change to the repository by registering listeners with a session. Note
that each listener will only receive events until the session with which it's registered is closed. There are
some common patterns for properly using listeners in long-running applications.
When registering a listener, the caller has the ability to specify the kinds of events that it's interested in.
Examples of filters include the types of events (e.g., created, modified, deleted), the location of nodes (e.g.,
by path), the identities of nodes, and even the types of nodes.
1.2.6 Queries
JCR defines a powerful query feature that applications can use to search the repository using a variety of
expression languages and even a query object model that can be used to programmatically build a query.
The most powerful expression language is JCR-SQL2, which is a SQL-like language for querying
relational-like views defined by the properties of each node type. JCR-SQL2 has support for a variety of rich
criteria (including full-text matching) and multiple kinds of joins. See the Query languages page for more
information about these languages, and the Query language grammars page for the detailed grammars of the
languages supported by ModeShape.
Let's continue with our example and see how we can use the query system and the JCR-SQL2 language to
find content satisfying some fairly complex criteria. First, let's imagine we want to issue a simple query to find
all Chevrolet, Toyota and Ford vehicles. We can do this with a query that selects nodes in the
"veh:vehicle" table (or node type), and applies the criteria against the columns (or properties) defined by the
"veh:vehicle" node type:
SELECT * FROM [veh:vehicle] AS vehicle
WHERE vehicle.[veh:make] = 'Chevrolet'
   OR vehicle.[veh:make] = 'Toyota'
   OR vehicle.[veh:make] = 'Ford'
This looks very much like the SQL-99 you're probably familiar with. ModeShape even extends the
JCR-SQL2 grammar to add support for "IN" clauses, so this is not standard JCR-SQL2 but works in
ModeShape:

SELECT * FROM [veh:vehicle] AS vehicle
WHERE vehicle.[veh:make] IN ('Chevrolet', 'Toyota', 'Ford')
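Because "IN" is a ModeShape extension, a query that must also run on other JCR implementations needs the longer "OR" form. As a rough sketch, the helper below builds that disjunction from a list of values with plain string handling. The class and method names are made up for illustration, and doubling single quotes to escape them inside literals is an assumption based on SQL conventions rather than something stated in this guide.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PortableCriteria {

    /** Builds "column = 'a' OR column = 'b' ..." as a stand-in for "column IN ('a','b')". */
    public static String anyOf(String column, List<String> values) {
        return values.stream()
                // Assumed escaping rule: a single quote inside a literal is doubled
                .map(value -> column + " = '" + value.replace("'", "''") + "'")
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        String where = anyOf("vehicle.[veh:make]", Arrays.asList("Chevrolet", "Toyota", "Ford"));
        System.out.println("SELECT * FROM [veh:vehicle] AS vehicle WHERE " + where);
    }
}
```

When the criteria are combined with other constraints using AND, the generated expression should be wrapped in parentheses.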
Using the JCR API, we can walk through the results and get the tuple results (like regular SQL) or get the
Node object that each row represents.
Let's look at a more complicated example. Imagine that we want to find "large" images (greater than 100
pixels wide) of all Ford vehicles acquired in 2011 or later. The images will exist as files under the
"Documents" child of a particular vehicle, and the vehicle has the make and acquisition information. This
seems pretty straightforward, except that the criteria involve multiple nodes within a structural pattern.
One approach is to first issue a simple query to find all the Ford vehicles that were acquired in or after
2011:

SELECT * FROM [veh:vehicle] AS vehicle
JOIN [asset:acquired] AS asset ON ISSAMENODE(vehicle,asset)
WHERE vehicle.[veh:make] = 'Ford'
AND asset.[asset:acquiredOn] > CAST('2011-01-01T00:00:00' AS DATE)
but then we have to get each node in the results and for each navigate to its "Documents" folder and look
for images that meet our size criteria.
Alternatively, we could issue a single (more complex) query that finds the images that satisfy all of the
criteria, and with this approach we have very little programming to do. This query will need to join three
"tables":
the "veh:vehicle" table represents all nodes that are of this particular node type, and it contains
columns for the "veh:year", "veh:make", "veh:model", and "veh:color" properties defined by
the "veh:vehicle" mixin. We'll apply a constraint on the "veh:make" column with the value "Ford".
the "asset:acquired" table represents all nodes that are of this particular node type, and it
contains columns for the "asset:acquiredOn" and "asset:acquisitionId" properties defined
by the "asset:acquired" mixin. We'll apply a constraint on the "asset:acquiredOn" to find all
nodes acquired later than January 1, 2011 at midnight.
the "nt:file" table represents all of the nodes of this particular node type. We're largely interested
in the path of this node.
We'll also need some join criteria:
one that ensures that the nodes in the "veh:vehicle" table are also in the "asset:acquired" table;
and
one that ensures that the "nt:file" nodes (in the third table) are below (descendants of) the nodes in the
"veh:vehicle" table
Our query then becomes:
SELECT images.[jcr:path]
FROM   [veh:vehicle] AS vehicle
JOIN   [asset:acquired] AS asset ON ISSAMENODE(vehicle,asset)
JOIN   [nt:file] AS images ON ISDESCENDANTNODE(images,vehicle)
WHERE  vehicle.[veh:make] = 'Ford'
AND    asset.[asset:acquiredOn] > CAST('2011-01-01T00:00:00' AS DATE)
AND    images.[jcr:mimeType] IN ('image/jpeg','image/png')
Note that this really isn't that much more complicated than our simple query, and the results include a column
with the path of each of the images that satisfy our constraints (or we can just look up the Node objects).
These are just some of the possibilities that the query API offers.
1.3 Features
The JCR 2.0 specification organizes the various features of the specification into several categories. The first
are the mandatory features:
Repository acquisition, authentication and authorization - Obtaining a Repository instance and
connecting to the repository to obtain multiple Session objects.
Reading and browsing content - Navigating and finding nodes by path and by identifier, and
reading properties of nodes.
Querying content - Using the two query languages defined by JCR 2.0, including the SQL-like
JCR-SQL2 and the programmatic JCR-JQOM languages.
Exporting content - Exporting all or part of a workspace to one of two XML formats: the system
view format, which results in "sv:node", "sv:property", and "sv:value" XML elements, and the
document view format, which results in XML elements that correspond to nodes and XML attributes that
correspond to properties.
Discovering node types - Using the built-in NodeTypeManager and related JCR interfaces to find
and list all the node types, property type definitions, and child node definitions that are registered with
the repository.
Checking permissions and capabilities - Checking whether any session has the ability to read, set
properties, add nodes, or remove nodes at particular locations within the workspace. This also
includes allowing any session to determine a priori whether certain operations cannot be performed,
although this basically is a best-effort mechanism.
The specification then outlines some of the optional features:
Writing to the repository - Creating, modifying, or removing content using the Session-related
methods.
Importing content - Importing the information in an XML file (in either the system view format or
the document view format) into the repository, prescribing the behavior when there are clashes in
identities.
Observing changes - Allowing users to register listeners with various criteria that will be notified
when changes are made in the repository content that satisfy the listener's criteria.
Managing workspaces - Programmatically creating new workspaces and deleting existing
workspaces.
Shareable nodes - A feature that allows certain nodes to be "shared" at multiple locations within the
same or different workspaces.
Versioning - A mechanism that allows the user to mark content as versionable (by adding a
"mix:versionable" mixin), and then use methods to signal to the repository that the current state of that
content be captured in the repository's version history. Users can also navigate the version history,
and merge or restore the content from older versions.
Locking content - A simple mechanism by which a user can lock (for a short period of time) content
to prevent other users from modifying or removing that same content.
Managing node types - Allowing users to register new node types, re-register node types with
alternative definitions, and unregister node types that are no longer used.
Transactions - Defines how repositories and sessions behave when used within a JTA environment.
Same name siblings - Defines the semantics and behaviors for allowing children of a single node to
have the same name yet be individually addressable.
Ordering children - Defines how nodes (and in particular node types) can specify that the ordering of
their children is to be maintained or changed.
Managing access controls - An advanced mechanism for allowing users to discover and manage
fine-grained access control privileges. Much of this feature relies upon implementation-specific policy
definitions.
Managing content lifecycles - Allow content to be associated with various lifecycles, and allow
users to transition the state of content through the appropriate lifecycle. The names and semantics of
the lifecycle states and transitions are implementation-specific.
Retention and hold - Allows a repository to integrate with an external retention and hold facility to
place a hold on certain content, effectively making that content appear as if it were read-only, while
tracking information about the hold and the hold policy that applies to various content.
ModeShape 3.x supports all of these JCR features except the last three: access controls, content
lifecycles, and retention and hold.
1.3.1 Discovering support

The JCR API also provides a way for a user to dynamically discover which of these features a repository
supports. Each Repository instance exposes descriptors for each of these behavioral features. The
descriptor keys for a repository instance are accessible as are the individual values for each descriptor.
Standard descriptor keys are defined by constants on the javax.jcr.Repository interface.
Here's a small sample of code that gets the various descriptors for a repository:
javax.jcr.Repository repository = ...

String[] keys = repository.getDescriptorKeys();
for ( String key : keys ) {
    boolean std = repository.isStandardDescriptor(key);
    boolean singleValued = repository.isSingleValueDescriptor(key);
    if ( singleValued ) {
        Value value = repository.getDescriptorValue(key);
    } else {
        Value[] values = repository.getDescriptorValues(key);
    }
}
Note that the Value object is the same as what's used for property values, and the Value interface has
methods to obtain the value in the form of a String, boolean, long, double, java.util.Calendar,
java.math.BigDecimal, java.io.InputStream, and javax.jcr.Binary objects.
Here's another example that shows how to determine the languages that are supported by a repository:

Value[] languages = repository.getDescriptorValues(Repository.QUERY_LANGUAGES);
1.4 Repository and Session

Before your application can do anything with a JCR repository, it first has to find the
javax.jcr.Repository instance and use it to establish a javax.jcr.Session. This page shows the
ways of using the JCR API to do exactly this.
Getting a Repository
Using JNDI
Using RepositoryFactory
Getting a Session
Credentials
Thread-safety
Long-running Sessions
Making and persisting changes
Visibility of changes
Transactions
Logging out
1.4.1 Getting a Repository

The JCR 2.0 specification defines two primary ways to obtain a Repository instance, and neither requires
your application to use any implementation-specific code.
Using JNDI
One of the more popular ways to find a Repository instance is to use JNDI, though this only works in
environments like web servers or application servers that contain a JNDI implementation. It also assumes
that a Repository instance has already been registered in JNDI; how this is done is specific to the
environment.
For example, consider that our environment has registered our Repository in JNDI with the name "jcr".
We can then simply use JNDI to obtain the instance using the JNDI API (in the "javax.naming" package):
InitialContext initCtx = new InitialContext();

Context envCtx = (Context) initCtx.lookup("java:comp/env");
javax.jcr.Repository repository = (javax.jcr.Repository) envCtx.lookup("jcr");
Different environments require different techniques to obtain the javax.naming.Context object.

The above example is what's used in the Tomcat web server. Some app servers allow you to
directly look up components in JNDI using the InitialContext.
Using RepositoryFactory
The JCR 2.0 specification defined a new mechanism to find Repository instances without relying upon
JNDI, and this works in any Java application.
Implementations that support this mechanism implement the javax.jcr.RepositoryFactory interface,
and define a resource in their JARs that registers their implementation with the Java SE Service Loader
facility.
The only thing a JCR client application needs to do is use the ServiceLoader facility to loop over all the
RepositoryFactory implementations and ask each one to create (or obtain) a repository given a set of
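supplied parameters:
Map<String, String> parameters = ... // implementation-specific parameters
Repository repository = null;
for (RepositoryFactory factory : ServiceLoader.load(RepositoryFactory.class)) {
    repository = factory.getRepository(parameters);
    if (repository != null) break;
}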
The parameters are implementation-specific, but you can keep your application independent of the
JCR implementation by simply loading the properties from a resource file.
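The note above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the class name RepositoryParameters, the resource name "/repository.properties", and the property keys used in the usage note are hypothetical, and the keys a real repository expects depend on the JCR implementation. The helper reads standard java.util.Properties syntax into a Map of the kind that is passed to RepositoryFactory.getRepository(Map).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper: loads repository parameters from a properties resource. */
class RepositoryParameters {

    /** Reads java.util.Properties syntax from the reader into a plain map. */
    static Map<String, String> parse(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        Map<String, String> parameters = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            parameters.put(name, props.getProperty(name));
        }
        return parameters;
    }

    /** Convenience variant for text that is already in memory. */
    static Map<String, String> parseText(String text) {
        try {
            return parse(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads a classpath resource such as "/repository.properties" (a hypothetical name). */
    static Map<String, String> fromResource(String resourceName) throws IOException {
        try (InputStream in = RepositoryParameters.class.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IOException("Resource not found: " + resourceName);
            }
            return parse(new InputStreamReader(in, StandardCharsets.UTF_8));
        }
    }
}
```

An application would typically call RepositoryParameters.fromResource("/repository.properties") once at startup and pass the resulting map as the parameters argument when asking each RepositoryFactory for a repository. Switching JCR implementations then means changing the resource file rather than the code.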
1.4.2 Getting a Session

Once you've gotten hold of your Repository instance, your application can connect to it by passing a set
of credentials identifying the user and the name of the workspace that the user wishes to use. The repository
will return a javax.jcr.Session instance that has the privileges awarded to that user per the credentials.
The Session can be used to read, query, observe, or change repository content.
The Repository interface defines a login method that takes the user's javax.jcr.Credentials
object and the name of the workspace. The signature of this method is as follows:
public interface Repository {
    ...
    public Session login(Credentials credentials, String workspaceName)
        throws LoginException, NoSuchWorkspaceException, RepositoryException;
    ...
}
The application can supply null values for either or both of the "credentials" and "workspaceName".
When no credentials are provided, the repository can use an external mechanism to identify and authenticate
the user. When no workspace name is provided, the repository chooses a default workspace.
For convenience, the Repository interface defines three other forms of login that take different
combinations of the credentials and/or workspace name, but all are simple wrappers around the primary
login(Credentials,String) method.
Credentials
The JCR API defines a javax.jcr.Credentials marker interface that is intended to encapsulate the
information necessary to identify, authenticate, and authorize a particular user. JCR implementations can
define their own implementations, or they can reuse the two concrete Credentials implementation
classes:
javax.jcr.SimpleCredentials - A Credentials that identifies a user with a username and password.
javax.jcr.GuestCredentials - A Credentials that can be used to obtain an anonymous session.
Credentials don't need to be supplied, either. In such cases, the repository can use an external mechanism
to authenticate and authorize the current user. Many implementations support JAAS, allowing the login
context currently associated with the thread to perform the authentication.
Thread-safety
JCR sessions are intended to be lightweight, so creating them should be very fast. But the specification
does not require them to be thread-safe, so in general they shouldn't be used concurrently by multiple
threads. Therefore, most applications should create a session to read, query, or change repository
content, and then quickly close that session.
Web applications, for example, often will obtain a Session for each incoming request, use that session to
process the request, and then will close the session.
Although not required by the specification, ModeShape 3 sessions are thread-safe and can be
used by multiple threads. Care still needs to be used, however, as any transient state of a session
used by multiple threads will be visible to all components using that session.
Long-running Sessions
One exception to the short-lived session pattern is that an application can only register listeners to the
repository through a Session. When a session is closed, all listeners registered with that session are
unbound. So if an application requires listeners to exist for long periods of time, the application will have to
use a long-lived session.
Applications should use these long-lived sessions only to register listeners, and should never use the
session to process any of the events. This is because the session is not thread-safe, and the listeners will
often be notified on separate threads. Instead, listener implementations should enqueue any work to be
done and return without using the listener's session. Then, have separate threads that take items from the
queue, obtain a new Session, perform the work, and close the session.
Getting off the listener thread is generally a good practice for asynchronous listeners, even if the Session
implementations are thread-safe. This is because implementations may serialize delivery of the events to all
listeners, so one listener that takes a long time to process each event might cause a delay for the other
listeners.
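The enqueue-and-return pattern described above can be sketched with nothing more than the JDK's concurrency utilities. This is a minimal sketch, not ModeShape code: the class name EventWorkQueue and the choice to pass event paths as plain strings are assumptions made for illustration. In a real application, the listener's onEvent method would call enqueue(...), and the handler would be the place to obtain a new Session, do the work, and log out.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical hand-off queue between a listener thread and one worker thread. */
class EventWorkQueue {
    // Sentinel telling the worker to stop; compared by identity, never by value.
    private static final String STOP = new String("stop");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    /**
     * @param handler the work to do for each event path; in a JCR application this is
     *                where a new Session would be obtained, used, and logged out
     */
    EventWorkQueue(Consumer<String> handler) {
        worker = new Thread(() -> {
            try {
                for (String path = queue.take(); path != STOP; path = queue.take()) {
                    handler.accept(path);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop quietly when interrupted
            }
        }, "event-worker");
        worker.start();
    }

    /** Called from the listener: records the work and returns immediately. */
    void enqueue(String eventPath) {
        queue.add(eventPath);
    }

    /** Lets the worker finish everything already queued, then stops it. */
    void shutdown() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because enqueue only adds to an unbounded queue, the listener returns almost immediately, so a slow handler delays only this queue and not the delivery of events to other listeners. A single worker also preserves event order; a pool of workers would trade that ordering for throughput.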
Some JCR implementations can operate in a mode where the repository remains open only as long
as there is at least one session. As such, your application might need to use a long-running
session just to keep the repository running.
1.4.3 Making and persisting changes

The Session can be used to make changes to the persistent workspace content. However, all changes are
transient and visible only to that Session until the save() method is called on the Session. At that point,
the transient changes are persisted to the workspace, and generally become immediately visible to all other
sessions.
Visibility of changes
The specification gives implementations a fair amount of freedom in how and when a Session sees the
changes persisted by other Sessions. Some implementations use a copy-on-read behavior, where the
Session obtains a snapshot of each node it accesses, and any changes to those nodes will only be seen
by the session if and when it refreshes its cached information or saves its state. Other implementations
use a copy-on-write behavior, where the only content a session is sure to see unchanged is the set of
transient changes it has made itself; it immediately sees all other changes persisted by other sessions.
Be sure to know how the implementation you're using works.
ModeShape 3.x uses the copy-on-write behavior. Note that this is different than ModeShape 2.x,
which used copy-on-read.
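The difference between the two behaviors is easier to see in a toy model. The sketch below is not how ModeShape or any other JCR implementation is written: the classes CopyOnReadSession and CopyOnWriteSession are hypothetical, the "workspace" is just a map from path to value, and the model is single-threaded. It only illustrates which reads see changes persisted by another session.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy session that snapshots each value the first time it is read (copy-on-read). */
class CopyOnReadSession {
    private final Map<String, String> workspace;
    private final Map<String, String> snapshot = new HashMap<>();

    CopyOnReadSession(Map<String, String> workspace) {
        this.workspace = workspace;
    }

    String get(String path) {
        // The first read copies the persisted value; later reads reuse the copy.
        return snapshot.computeIfAbsent(path, workspace::get);
    }

    /** Discards the snapshot so the next read sees the persisted state again. */
    void refresh() {
        snapshot.clear();
    }
}

/** Toy session that keeps only its own unsaved changes (copy-on-write). */
class CopyOnWriteSession {
    private final Map<String, String> workspace;
    private final Map<String, String> transientChanges = new HashMap<>();

    CopyOnWriteSession(Map<String, String> workspace) {
        this.workspace = workspace;
    }

    String get(String path) {
        // Own transient changes win; everything else is read from the persisted state.
        return transientChanges.containsKey(path) ? transientChanges.get(path) : workspace.get(path);
    }

    void set(String path, String value) {
        transientChanges.put(path, value);
    }

    /** Persists the transient changes, making them visible to other sessions. */
    void save() {
        workspace.putAll(transientChanges);
        transientChanges.clear();
    }
}
```

In this model, after one copy-on-write session saves a new value, another copy-on-write session sees it on its next read, while a copy-on-read session that had already read the old value keeps seeing it until refresh() is called.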
Transactions
When using transactions, the changes made by a Session are not persisted on "save()", but as expected
are persisted when the transaction commits. Note that only those saved changes will be committed as part
of the transaction, so be sure to call "save()" prior to committing the transaction.
1.4.4 Logging out

An application is expected to close each session after it is no longer needed, and this is done using the
logout() method. This will immediately discard any transient (that is, unsaved) changes made with that
session, and will immediately unregister all listeners that were registered using the session.
1.5 Reading content

1.6 Writing content
1.7 Workspace operations
1.8 Defining custom node types

One of the important features of JCR is that it allows your applications to define and use custom node types,
which can be used for either primary types or mixin types. This section talks about how to define and use the
node types, property definitions, and child node definitions.
Compact Node Definition (CND)
    Declaring namespaces
    Declaring node types
    Property definitions
    Child node definitions
CND example
Built-in node types
Registering custom node types
1.8.1 Compact Node Definition (CND)

The JCR 2.0 specification defines a file format called "Compact Node Definition", or CND. True to its
name, the format does indeed make it possible to define node types in a very compact form. It supports
Java-style comments, uses JCR-style namespace prefixes, and does not require whitespace or newlines
around key characters (e.g., '[', ']', '>', ',', '(', ')', '=', and '<').
This documentation is a summary of the CND format. For the true specification of the format, see
Section 25.2 in the JCR 2.0 specification.
Here is a small CND file that defines just a few of the 33 built-in node types, and hopefully gives you an
example of what CND files look like:
<jcr='http://www.jcp.org/jcr/1.0'>
<nt='http://www.jcp.org/jcr/nt/1.0'>
<mix='http://www.jcp.org/jcr/mix/1.0'>
/* Every node type directly or indirectly extends from 'nt:base' */
[nt:base] abstract
- jcr:primaryType (name) mandatory autocreated protected compute
- jcr:mixinTypes (name) protected multiple compute
[nt:unstructured] orderable
- * (undefined) multiple
- * (undefined)
+ * (nt:base) = nt:unstructured sns version
[mix:created] mixin
- jcr:created (date) protected
- jcr:createdBy (string) protected
[nt:hierarchyNode] > mix:created abstract
// The 'nt:file' and 'nt:folder' node types allows applications to store
// files and directories as content inside a repository
[nt:file] > nt:hierarchyNode
+ jcr:content (nt:base) primary mandatory
[nt:folder] > nt:hierarchyNode
+ * (nt:hierarchyNode) version
[mix:referenceable] mixin
- jcr:uuid (string) mandatory autocreated protected initialize
[mix:mimeType] mixin
- jcr:mimeType (string)
- jcr:encoding (string)
[mix:lastModified] mixin
- jcr:lastModified (date)
- jcr:lastModifiedBy (string)
[nt:resource] > mix:mimeType, mix:lastModified
- jcr:data (binary) primary mandatory
Let's break apart the format into the different parts.
Declaring namespaces
A namespace is declared using the form:
< prefix = uri >
where prefix is the quoted or unquoted string literal used in the rest of the file as the prefix for the
namespace, and uri is the quoted URI for the namespace. For readability, most people place each
namespace declaration on a separate line near the top of the file, but that's not required.
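To make the "< prefix = uri >" form concrete, here is a small Java sketch that pulls namespace declarations out of CND text with a regular expression. It is a deliberately simplified, hypothetical helper (the class name CndNamespaces is made up), not a CND parser: it ignores comments and escaped quotes, and it assumes the prefix contains no whitespace and the URI is quoted and sits on one line.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified, hypothetical reader for CND namespace declarations (not a full CND parser). */
class CndNamespaces {
    // Matches < prefix = 'uri' > with an optionally quoted prefix and a quoted URI.
    // Group 1 is the prefix, group 2 the quote character, group 3 the URI.
    private static final Pattern DECLARATION = Pattern.compile(
            "<\\s*['\"]?([^'\"=\\s>]+)['\"]?\\s*=\\s*(['\"])(.*?)\\2\\s*>");

    /** Returns prefix-to-URI mappings in the order they appear in the text. */
    static Map<String, String> parse(String cnd) {
        Map<String, String> namespaces = new LinkedHashMap<>();
        Matcher matcher = DECLARATION.matcher(cnd);
        while (matcher.find()) {
            namespaces.put(matcher.group(1), matcher.group(3));
        }
        return namespaces;
    }
}
```

Applied to the three declarations at the top of the sample CND file above, parse would return the prefixes jcr, nt, and mix mapped to their URIs, in that order.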
Declaring node types

Each node type declaration is made up of several parts. The first is the specification of the node type's name
and attributes, and is of the form:
[ prefixedName ] > supertypes attributes
where prefixedName is the name of the node type in prefixed notation (e.g., "nt:base",
"nt:unstructured", "mix:lastModified", etc.), supertypes is the optional comma-separated list of the
prefixed names of the node type's supertypes, and attributes is the optional list of node type attributes:
| Attribute keywords | Description |
| --- | --- |
| orderable, ord, o | The node type supports orderable children. If absent, orderable children are not supported. |
| mixin, mix, m | The node type is a mixin. If absent, the node type can be used as a primary type. |
| abstract, abs, a | The node type is abstract and cannot be used directly as a node's primary type or mixin type. If absent, the node type is concrete. |
| noquery, nq, query, q | Specifies whether the node type can be queried. If 'noquery' or 'nq' is used, the node type cannot be queried; if 'query' or 'q' is used, the node type can be queried. If neither is specified, ModeShape assumes that the node type can be queried. If more than one is specified, the last one is used. |
| primaryitem, ! | The string following this keyword specifies the prefixed name of the property or child node (including same-name-sibling index if required) that will be considered the primary item. The primary item can be reached using a special method, which allows a client to more easily navigate an unknown structure. If absent, the node type does not have a primary item. |

The property definitions and child node definitions are listed after the attributes.
Property definitions
Each property definition begins with a '-' character and follows the form:
- prefixedName ( type ) = defaultValues attributes
where
prefixedName is the name of the property definition in prefix form

type is the case-insensitive name of the JCR property type, and is one of: STRING, BINARY, LONG,
DOUBLE, DATE, BOOLEAN, NAME, PATH, REFERENCE, WEAKREFERENCE, URI, and DECIMAL
defaultValues is an optional comma-separated list of quoted string literals containing the string
representation of the default values; multiple are allowed for multi-valued properties
attributes is the optional list of property definition attributes:
| Attribute keywords | Description |
| --- | --- |
| mandatory, man, m | The parent node must have at least one property to which this property definition applies. |
| autocreated, aut, a | The property is automatically created when the node is created with the node type as primary type, or when the node type is added as a mixin to a node. If absent, the property is not auto-created. |
| protected, pro, p | The property to which this definition applies is protected, meaning it can be read but not modified by client applications. When absent, the property can be set by client applications. |
| multiple, mul, * | The property is multi-valued. If absent, the property is single-valued. |
| COPY, VERSION, INITIALIZE, COMPUTE, IGNORE, ABORT | The specification for how the property is to be handled when the node is versioned. When absent, the default versioning is COPY. |
| < constraints | The quoted string literals containing regular expressions or exact matches for the values. When constraints are provided, every value must satisfy at least one constraint on the property definition. If absent, there are no constraints on the values. |
| queryops, qop | This keyword is followed by a quoted string literal containing the comma-separated operators that can be used in property comparison constraints against this property. If absent, all possible operators are used, and this is equivalent to specifying '=, <>, <, <=, >, >=, LIKE'. |
| nofulltext, nof | The property value(s) are not considered when performing a full-text search. If absent, the values will be used in full-text searches. |
| noqueryorder, nqord | The property cannot be used to order query results. If absent, the property can be used to order query results. |
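The "at least one constraint" rule for STRING properties can be illustrated with a few lines of Java. This is a sketch of the rule only, not the validation code of any JCR implementation: the class name StringConstraints is hypothetical, and it assumes each constraint is a java.util.regex pattern that must match the whole value (an exact-match constraint is then simply a pattern without special characters).

```java
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical check mirroring the "at least one constraint" rule for STRING values. */
class StringConstraints {

    /**
     * Returns true when there are no constraints, or when the whole value
     * matches at least one of the constraint patterns.
     */
    static boolean isSatisfied(String value, List<String> constraints) {
        if (constraints.isEmpty()) {
            return true; // no constraints means every value is allowed
        }
        for (String constraint : constraints) {
            if (Pattern.matches(constraint, value)) {
                return true;
            }
        }
        return false;
    }
}
```

With the two constraints '[.]*\d' and 'constraint2', for example, the values "...7" and "constraint2" are accepted, while "abc" is rejected because it matches neither pattern.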
Child node definitions

Each child node definition begins with a '+' character and follows the form:
+ prefixedName ( requiredTypes ) = defaultType attributes
where
prefixedName is the name of the child node definition in prefixed form
requiredTypes is an optional comma-separated list of the names of the required node types for the child node.
Any child adhering to this child node definition must be an instance of at least the required types listed
here. If absent, the required type is assumed to be 'nt:base' (of which all nodes are instances).
defaultType is the optional name of the node type that should be used by default when creating the
child node. If absent, the default type is assumed to be 'nt:unstructured'. The default type can
always be overridden by clients using the Node.addNode(String,String) method.
attributes is the optional list of child node definition attributes:
| Attribute keywords | Description |
| --- | --- |
| mandatory, man, m | The parent node must have at least one child to which this child node definition applies. |
| autocreated, aut, a | The child node is automatically created when the node is created with the node type as primary type, or when the node type is added as a mixin to a node. If absent, the child node is not auto-created. |
| protected, pro, p | The child node to which this definition applies is protected, meaning it can be read but not modified or removed by client applications. If absent, the child nodes can be read, modified, and removed from the parent node. |
| sns, * | There may be multiple child nodes with the same name to which this definition applies. Such child nodes will be distinguished with a same-name-sibling index. If absent, same-name-siblings are not allowed. |
| COPY, VERSION, INITIALIZE, COMPUTE, IGNORE, ABORT | The specification for how the child nodes to which this definition applies are to be handled when the parent node is versioned. When absent, the default versioning is COPY. |
1.8.2 CND example

Suppose we want to define a node type named 'ns:NodeType' that:
has two supertypes
is abstract
has orderable children
is a mixin
can be queried
has the ex:property property as the primary item
defines a property named ex:property that
    has values of type STRING
    is multi-valued (meaning it can have zero or more values)
    is mandatory
    will by default have two values, 'default1' and 'default2'
    will be auto-created as soon as the mixin is applied to a node, if the node does not already have
    a property to which this definition can be assigned
    is a protected property, meaning clients cannot change the value
    is included in the version history when the node is checked in
    has two constraints, which for string values are regular expressions or exact matches for the
    values
    can be used in a query property comparison constraint using any of the specified operators
    has values that are not included in full-text search queries
    cannot be used to order query results
allows a child node that
    has a name of 'ns:node'
    has a primary type that is (or is a subtype of) both 'ns:reqType1' and 'ns:reqType2'
    unless specified will be given a primary type of 'ns:defaultType'
    is mandatory
    will be auto-created as soon as the mixin is applied to a node, if the node does not already have
    a child node to which this definition can be assigned
    is a protected child, meaning clients cannot remove the child node
    allows multiple children with the same 'ns:node' name, using same-name-sibling indexes to
    distinguish between the different child nodes
    is included in the version history when the parent node is checked in
Assuming the 'ns' and 'ex' namespace prefixes have already been declared, we can define this node type in CND as:
[ns:NodeType] > ns:ParentType1, ns:ParentType2 abstract orderable mixin query primaryitem ex:property
- ex:property (STRING) = 'default1', 'default2' mandatory autocreated protected multiple VERSION
  < '[.]*\d', 'constraint2' queryops '=, <>, <, <=, >, >=, LIKE' nofulltext noqueryorder
+ ns:node (ns:reqType1, ns:reqType2) = ns:defaultType mandatory autocreated protected sns VERSION
Note that we used the full-length keywords. We could use the mid-length keywords to make it a bit more
readable:
[ns:NodeType] > ns:ParentType1, ns:ParentType2 abs ord mix q ! ex:property
- ex:property (STRING) = 'default1', 'default2' man aut pro mul VERSION
  < '[.]*\d', 'constraint2' qop '=, <>, <, <=, >, >=, LIKE' nof nqord
+ ns:node (ns:reqType1, ns:reqType2) = ns:defaultType man aut pro sns VERSION
We can make it even more compact by using the shortest keywords:
[ns:NodeType] > ns:ParentType1, ns:ParentType2 a o m q ! ex:property
- ex:property (STRING) = 'default1', 'default2' m a p * VERSION
  < '[.]*\d', 'constraint2' qop '=, <>, <, <=, >, >=, LIKE' nof nqord
+ ns:node (ns:reqType1, ns:reqType2) = ns:defaultType m a p * VERSION
Again, this example uses every possible attribute for the node type, the property definition, and the child
node definition. Often the default attributes will suffice, making the node definitions even more compact and
readable.
1.8.3 Built-in node types

The JCR specification defines 33 node types that are built-in and available for use without applications
having to register them. In fact, most JCR implementations will not allow applications to re-register or modify
any of the built-in node types. These standard built-in node types and those node types defined in all
ModeShape repositories are defined in Built-in node types.
1.8.4 Registering custom node types

While the JCR 2.0 specification uses the CND format within the specification, the only standard API for
registering custom node type definitions is to use the programmatic API.
ModeShape provides a non-standard way for clients to register node types by reading a stream
containing a CND file. For details, see Registering custom node types.
The standard programmatic API uses mutable template objects that can be created and registered using the
javax.jcr.nodetype.NodeTypeManager. Here's some Java code that registers the 'ns:NodeType'
node type definition we used earlier:
// Get the node type manager ...
javax.jcr.nodetype.NodeTypeManager mgr = session.getWorkspace().getNodeTypeManager();
// Create a template for the node type ...
NodeTypeTemplate type = mgr.createNodeTypeTemplate();
type.setName("ns:NodeType");
type.setDeclaredSuperTypeNames(new String[]{"ns:ParentType1","ns:ParentType2"});
type.setAbstract(true);
type.setOrderableChildNodes(true);
type.setMixin(true);
type.setQueryable(true);
type.setPrimaryItemName("ex:property");
// Create a template for the property definition ...
PropertyDefinitionTemplate propDefn = mgr.createPropertyDefinitionTemplate();
propDefn.setName("ex:property");
propDefn.setRequiredType(PropertyType.STRING);
ValueFactory valueFactory = session.getValueFactory();
Value[] defaultValues =
{valueFactory.createValue("default1"),valueFactory.createValue("default2")};
propDefn.setDefaultValues(defaultValues);
propDefn.setMandatory(true);
propDefn.setAutoCreated(true);
propDefn.setProtected(true);
propDefn.setMultiple(true);
propDefn.setOnParentVersion(OnParentVersionAction.VERSION);
propDefn.setValueConstraints(new String[]{"[.]*\\d","constraint2"});
String[] queryOps = {QueryObjectModelConstants.JCR_OPERATOR_EQUAL_TO,
QueryObjectModelConstants.JCR_OPERATOR_NOT_EQUAL_TO,
QueryObjectModelConstants.JCR_OPERATOR_LESS_THAN,
QueryObjectModelConstants.JCR_OPERATOR_LESS_THAN_OR_EQUAL_TO,
QueryObjectModelConstants.JCR_OPERATOR_GREATER_THAN,
QueryObjectModelConstants.JCR_OPERATOR_GREATER_THAN_OR_EQUAL_TO,
QueryObjectModelConstants.JCR_OPERATOR_LIKE,
};
propDefn.setAvailableQueryOperators(queryOps);
propDefn.setFullTextSearchable(false);
propDefn.setQueryOrderable(false);
type.getPropertyDefinitionTemplates().add(propDefn);
// Create a template for the child node definition ...
NodeDefinitionTemplate childDefn = mgr.createNodeDefinitionTemplate();
childDefn.setName("ns:node");
childDefn.setRequiredPrimaryTypeNames(new String[]{"ns:reqType1","ns:reqType2"});
childDefn.setDefaultPrimaryTypeName("ns:defaultType");
childDefn.setMandatory(true);
childDefn.setAutoCreated(true);
childDefn.setProtected(true);
childDefn.setSameNameSiblings(true);
childDefn.setOnParentVersion(OnParentVersionAction.VERSION);
type.getNodeDefinitionTemplates().add(childDefn);
// Now register our node type template ...
NodeTypeDefinition[] nodeTypes = new NodeTypeDefinition[]{type};
mgr.registerNodeTypes(nodeTypes, true);
As you can see, even registering a simple node type requires quite a bit of code.
1.9 Query languages

The JCR API defines a mechanism for applications to query a repository for content that meets
application-defined criteria. This is done by creating a query in one of several languages, and then
processing the results to find the nodes and/or property values requested by the query. The earlier JCR 1.0
specification defined two languages:
XPath is a subset of the standard XPath 2.0 query language for XML documents, and relies upon the
repository content being semantically similar to an XML document. Support for this language was
required by the JCR 1.0 specification.
JCR-SQL is a query language that is based upon SQL, but does not support many of the SQL
expressions. This language was optional.
However, the JCR 2.0 specification deprecated these older languages, and instead defined two other query
languages:
JCR-SQL2 is an SQL-like query language that improves on the original and now deprecated
JCR-SQL language. JCR-SQL2 is much more similar to SQL and has support for joins, richer
expressions, and full-text search.
JCR-JQOM is a language for programmatically defining a query with Java objects, and is referred to
as the JCR Query Object Model (or JQOM).
Most JCR 2.0 implementations support all four of these languages, even though the JCR-SQL and
XPath languages were deprecated in the JCR 2.0 specification. Some implementations, including
ModeShape, translate all the expression-based languages into the JCR-JQOM form, enabling all queries to
be processed in exactly the same way.
1.9.1 Query grammars

Grammars for all of the standard JCR query languages (and one non-standard query language supported by
ModeShape) are described in significant detail in the "Query language grammars" section.
1.9.2 Query API

When querying the content of a workspace, an implementation always evaluates the query against the
persisted content of the workspace, and never considers any of the transient changes made by a session.
To reinforce this idea, JCR defines a javax.jcr.query.QueryManager that can be obtained from the
session's javax.jcr.Workspace instance.
The QueryManager interface defines methods for creating Query objects, executing queries, storing
queries (not results) as Nodes in the repository, and reconstituting queries that were stored on Nodes.
Querying a repository generally follows this pattern:
1. Obtain the session's query manager

2. Create a query using a particular language
3. Execute the query and get the results
4. Iterate over the nodes or rows in the results
This is demonstrated with the following sample:
// Obtain the query manager for the session via the workspace ...
javax.jcr.query.QueryManager queryManager = session.getWorkspace().getQueryManager();
// Create a query object ...
String language = ...
String expression = ...
javax.jcr.query.Query query = queryManager.createQuery(expression,language);
// Execute the query and get the results ...
javax.jcr.query.QueryResult result = query.execute();
// Iterate over the nodes in the results ...
javax.jcr.NodeIterator nodeIter = result.getNodes();
while ( nodeIter.hasNext() ) {
    javax.jcr.Node node = nodeIter.nextNode();
    // ...
}
// Or iterate over the rows in the results ...
String[] columnNames = result.getColumnNames();
javax.jcr.query.RowIterator rowIter = result.getRows();
while ( rowIter.hasNext() ) {
    javax.jcr.query.Row row = rowIter.nextRow();
    // Iterate over the column values in each row ...
    javax.jcr.Value[] values = row.getValues();
    for ( javax.jcr.Value value : values ) {
        // ...
    }
    // Or access the column values by name ...
    for ( String columnName : columnNames ) {
        javax.jcr.Value value = row.getValue(columnName);
        // ...
    }
}
// When finished, close the session ...
session.logout();
Unlike JDBC, there's no need to close the result.

Let's look at each part in more detail.
1.9.3 Creating a query

Use the QueryManager to create a query:

javax.jcr.query.Query query = queryManager.createQuery(expression,language);
The javax.jcr.query.Query interface defines constants for each of the standard query languages, and
these can be used in the second parameter to specify the language. If an implementation defines additional,
non-standard languages, then the implementation-specific language name would be used instead. Here's an
example of creating a query and using the constant for the JCR-SQL2 language:

javax.jcr.query.Query query =
queryManager.createQuery(expression,javax.jcr.query.Query.JCR_SQL2);
Note that the "createQuery" method will throw an exception if the query expression is not well-formed
according to the specified language.
Be sure you're specifying the correct language. The JCR-SQL and JCR-SQL2 languages seem
pretty similar, but they actually use different syntax for identifiers. Thus, defining a JCR-SQL2 query
but accidentally specifying the JCR-SQL language will often result in a bizarre exception message.
1.9.4 Executing the query

Once the query expression has been accepted and your application has a Query object, the query can be
executed. This is, of course, very straightforward:
javax.jcr.query.QueryResult result = query.execute();
Again, this might throw an exception if there was a problem executing the query.
1.9.5 Use the query results

There are two ways of accessing the results: as the nodes that satisfied the criteria, or as a table
with rows of property values as specified in the query's SELECT expression.
Table-like results
Accessing the "result set" as a table is very similar to accessing the result set of a query using SQL (or JDBC).
The query identifies which properties (i.e., columns) on which node types (i.e., tables) should be selected,
and the QueryResult contains a row for each node that satisfies the query's criteria. Applications can
obtain the names of the columns and can iterate over the result's rows to obtain the actual values in each
column. Note that if the query involves a join, the columns may correspond to properties on different node
types (i.e., different "selectors").
The first step is to obtain the list of column names:
QueryResult result = //...
String[] columnNames = result.getColumnNames();
Note that if any properties (or columns) were aliased, the alias will appear in the column names.
Then, the next step is to iterate over the rows in the results and obtain the values for each column, either by
position or by name. The following example shows how they can be accessed by position (or array index):

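RowIterator rowIter = result.getRows();
while ( rowIter.hasNext() ) {
    Row row = rowIter.nextRow();
    // Get the column values in each row ...
    Value[] values = row.getValues();
    for ( Value value : values ) {
        // ...
    }
}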
Here is an example of getting the values for each column by the column's name:

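RowIterator rowIter = result.getRows();
while ( rowIter.hasNext() ) {
    Row row = rowIter.nextRow();
    // Get the column values in each row by name ...
    for ( String columnName : result.getColumnNames() ) {
        Value value = row.getValue(columnName);
        // ...
    }
}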
Again, there's no need to close the result.
Scores
Each row is given a relative score that ranks how well that particular row satisfied the criteria. The magnitude
of the score is implementation-dependent, but a higher relative value does signal that the row was a better
match than other rows with lower scores. The score can be obtained for each row using the "getScore()"
method:
javax.jcr.query.Row row = ...
double score = row.getScore();
Some implementations, including ModeShape, will include the score in the result columns.
Nodes
Even when accessing the results as a table of rows, it's still possible to get the underlying node that
corresponds to the property values appearing in the results. When a query has one selector (that is, does
not use joins), every row will contain the properties from a single node, which can be accessed from the Row.
Note that the Node instance will be from the Session, so the node may have transient
changes to some of its properties and may no longer match the persisted values that the criteria were
evaluated against during processing.
Node node = row.getNode();
// use the node
If the query defined more than one selector (that is, used at least one join), then every row will correspond to
a node from each selector. In this case, there is no single node, so the "getNode()" method described above
will throw an exception; instead the caller should use the "getNode(String)" method that takes a
selector name. Here's a code fragment that shows how to get the node for each selector for a particular row:

for ( String selector : result.getSelectorNames() ) {
Node node = row.getNode(selector);
// use the node
}
Queries may also use a self-join, which is a join whose criteria specify that the node on
both sides of the join must be the same node. An example is selecting the properties defined on
two node types (perhaps a primary type and a mixin), where those properties must be on the same node.
Here's a JCR-SQL2 query that uses a self-join:
SELECT file.[jcr:createdBy], file.[jcr:created], tagged.[acme:tagName] AS tag

FROM [nt:file] AS file JOIN [acme:taggable] AS tagged ON ISSAMENODE(file,tagged)
WHERE tag IS NOT NULL
This query returns all [nt:file] nodes that have an [acme:taggable] mixin with a non-null
[acme:tagName].
In this case, the Node object returned for the two selectors, "file" and "tagged", will be the same node:

Node fileNode = row.getNode("file");
Node taggedNode = row.getNode("tagged");
// both nodes should be the same node
assert fileNode.equals(taggedNode);
// and probably the following is also true, though this depends on the implementation ...
assert fileNode == taggedNode;
Path
Just as every row has one or more nodes, the Row interface provides an easy way to access the paths of
the nodes without having to first get the Node objects. And just like the "getNode()" and
"getNode(String)" methods, the Row interface has both the "getPath()" and "getPath(String)"
methods.

String path = row.getPath();
// use the path
or

for ( String selector : result.getSelectorNames() ) {
String path = row.getPath(selector);
// use the path
}
Some implementations might track the path with the query results, and may not need to load the
node. Thus, if you need only the path, this might be slightly faster in some implementations.
Node results
Although accessing the results as a table often makes the most sense, if the query involves only a single
selector and your application only needs access to the nodes that satisfied the criteria, your application can
get a NodeIterator that will access the result nodes in the order specified by the query:

// Iterate over the nodes in the results ...
// ...
}
Using this style to access the result nodes works well for the older, deprecated JCR-SQL and
XPath query languages, since they only support a single selector (no joins). Thus, many
applications that used JCR 1.0 will access query results in this fashion. However, where possible,
it's better to use the table-style access, as it's much more flexible and able to be used with any
queries, including those with many selectors.
1.10 Using JTA Transactions

Working with JCR repositories within the context of JTA transactions is quite straightforward. Your
application, service or component still gets a Session as usual, and all read and write behaviors are
unchanged. The only difference is when transient state within your session is persisted to the repository.
When not using transactions, calling Session.save() will immediately write the transient changes to the
repository. When using transactions, however, JCR clients are still expected to call Session.save(), but
only when the transaction is committed are the changes persisted in the repository, made visible to other
sessions, and described by events.
Only the transient changes that were saved will be committed in the transaction. Don't forget to call
Session.save(), even in code that relies on transactions.
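The interplay between Session.save() and the transaction commit can be pictured with a toy model. The sketch below is not the JCR or JTA API: the class and its methods (change, save, commit, visibleToOthers) are hypothetical names that only mimic the rule stated above, namely that saved changes join the transaction and only the commit makes them visible to others.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (hypothetical, not the JCR or JTA API) of save-then-commit visibility. */
final class SaveVersusCommitSketch {
    private final List<String> transientChanges = new ArrayList<>();   // changed but not saved
    private final List<String> savedInTransaction = new ArrayList<>(); // saved, awaiting commit
    private final List<String> persisted = new ArrayList<>();          // visible to other sessions

    /** Models a transient change made through a session. */
    void change(String description) {
        transientChanges.add(description);
    }

    /** Models Session.save() inside a transaction: nothing is persisted yet. */
    void save() {
        savedInTransaction.addAll(transientChanges);
        transientChanges.clear();
    }

    /** Models the transaction commit: only the saved changes are persisted. */
    void commit() {
        persisted.addAll(savedInTransaction);
        savedInTransaction.clear();
    }

    /** Returns a copy of what other sessions would see. */
    List<String> visibleToOthers() {
        return new ArrayList<>(persisted);
    }

    public static void main(String[] args) {
        SaveVersusCommitSketch model = new SaveVersusCommitSketch();
        model.change("add /files/a.txt");
        model.save();
        System.out.println("after save:   " + model.visibleToOthers());
        model.change("add /files/b.txt"); // never saved, so never committed
        model.commit();
        System.out.println("after commit: " + model.visibleToOthers());
    }
}
```

In this model the list stays empty after save() and contains only the first change after commit(), because the second change was never saved.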
Let's look at several examples of using the JCR API with transactions.
1.10.1 EJBs with container-managed transactions

The latest version of Java EE greatly simplifies EJBs while still providing all the same (if not more) benefits,
like container-managed transactions. Let's consider this stateless EJB that stores a file inside a repository.
To keep things simple (and focused on the transactions), we'll use a utility called JcrTools that
ModeShape includes in its public API with a method to upload the content in the supplied InputStream into
a nt:file node (with its "jcr:content" child node), creating any missing ancestor nt:folder nodes
along the way:
ContentProvider.java
@Stateless
public class ContentProvider {
    protected Session getSession() throws RepositoryException, NamingException {
        InitialContext context = new InitialContext();
        Repository repository = (Repository)context.lookup("jcr/my_repository");
        return repository.login(); // uses default workspace and JAAS
    }
    public void uploadFile( String path, InputStream stream ) throws RepositoryException,
            NamingException, IOException {
        Session session = getSession();
        try {
            // Use a ModeShape utility ...
            new JcrTools().uploadFile(session,path,stream);
            session.save();
        } finally {
            session.logout();
        }
    }
}
This EJB can then be called by any other component in the web application (including JSPs). And notice that
we don't include any transaction-related code, yet this is what happens when the uploadFile method is
called:
1. If a transaction doesn't exist, one will be started
2. The uploadFile(...) method is executed
   1. A new Session is created using the current JAAS credentials (line 10)
   2. nt:folder and nt:file nodes dictated by the path are created as needed, a Binary value
      is created with the content from the InputStream and set as the "jcr:data" property on the
      "jcr:content" child of the nt:file node (line 13)
   3. The changes made to the session are saved (line 14)
   4. The session is closed (line 16)
3. The transaction (started before the uploadFile method is executed or at an earlier point) is
   committed (or rolled back if an exception is thrown)
Now the somewhat tricky part. The transient changes made on line 13 are not persisted when the Session
is saved (on line 14); rather, the changes are written to the repository and made visible to other sessions, and
events are produced, only when (and if) the transaction is committed. This would be true even if we'd made several sets
of transient changes interspersed with multiple Session.save() calls.
You may have noticed that EJBs no longer have to implement a remote or special interface. The
only line in the above code that is at all EJB-related is the @Stateless annotation on line 1.
1.10.2 EJBs with bean-managed transactions

Using JCR within EJBs that manage the transactions themselves (i.e., bean-managed transactions) works
the same way, except that the bean must explicitly control the transaction boundaries via the
UserTransaction interface. Prior to EE6, obtaining the UserTransaction instance was not (as)
standardized, though looking it up in JNDI was the most common approach. However, with EE6, the
UserTransaction can simply be injected into the bean:
ContentProvider.java
@Stateless
@TransactionManagement(TransactionManagementType.BEAN)
public class ContentProvider {
    @Resource
    UserTransaction ut;
    protected Session getSession() throws RepositoryException, NamingException {
        InitialContext context = new InitialContext();
        Repository repository = (Repository)context.lookup("jcr/my_repository");
        return repository.login(); // uses default workspace and JAAS
    }
    public void uploadFile( String path, InputStream stream ) throws RepositoryException,
            NamingException, IOException {
        Session session = getSession();
        try {
            // Start the transaction ...
            ut.begin();
            // Use a ModeShape utility ...
            new JcrTools().uploadFile(session,path,stream);
            session.save();
            // Commit the transaction ...
            ut.commit();
        } catch ( RepositoryException e ) {
            ut.rollback();
            throw e;
        } catch ( ... ) {
            // Handle the various (and many) JTA-related exceptions, rolling back as
            // appropriate
        } finally {
            session.logout();
        }
    }
}
1.10.3 Explicit JTA transactions

If your application is running in an environment that supports JTA, then your application can control the
boundaries of the transactions. Here's an example that uploads a file into the repository, creating any
missing nodes or updating the existing nodes (the same thing as the EJB examples above):
TransactionExample.java
javax.transaction.TransactionManager txnMgr = ...
// Start the transaction ...
txnMgr.begin();
try {
    // Obtain a session, using default workspace and credentials (which may be JAAS)
    Session session = repository.login();
    try {
        // Use a ModeShape utility ...
        new JcrTools().uploadFile(session,path,stream);
        session.save();
        // Commit the transaction ...
        txnMgr.commit();
    } finally {
        session.logout();
    }
} catch ( ... ) {
    // Handle RepositoryException and the various (and many) JTA-related exceptions, rolling
    // back as appropriate
}
In this example, we had to code all the transaction-related logic, which is quite extensive and can be tricky to
get right while properly handling all the JTA-related and repository exceptions. But once again, the transient
changes saved by the session.save() call will not actually be persisted in the repository until the
transaction is committed by the txnMgr.commit() call.
This example shows all the complexity of explicit user transactions. EJBs with container-managed
transactions look really nice, don't they?
We could just as easily have created our session outside the transaction, too:
TransactionExample.java
javax.transaction.TransactionManager txnMgr = ...
Session session = repository.login();
// Start the transaction ...
txnMgr.begin();
try {
    // Use a ModeShape utility ...
    new JcrTools().uploadFile(session,path,stream);
    session.save();
    // Commit the transaction ...
    txnMgr.commit();
} catch ( ... ) {
    // Handle RepositoryException and the various (and many) JTA-related exceptions, rolling
    // back as appropriate
} finally {
    session.logout();
}
We could have also used UserTransaction, but the details of how to obtain the UserTransaction
instance vary by environment (particularly outside of JEE6 environments).
1.11 Patterns and Best Practices

The previous pages in this document show how to use the different parts of the JCR API. But such tutorials
don't cover what are perhaps the most interesting and useful topics: what are common patterns and best
practices when using the JCR API?
This section of the document attempts to shed some light on some of them:
1.11.1 Use unstructured primary types

1.11.2 Storing files and folders
One really nice feature of JCR repositories is that you can use them to store files and folders. And because it
is such a common pattern, the JCR specification defines several built-in node types that can be used to do
just this:
nt:folder - The node type used to represent folders.
nt:file - The node type used to represent files.
nt:hierarchyNode - An abstract node type that serves as the base type of nt:file and
nt:folder.
nt:resource - The node type used to represent the content of a file.
nt:linkedFile - Similar to nt:file, except that the content is not stored under the file node but
instead is a reference to the content stored elsewhere.
Learning to use these node types can take a little work, because they're not quite as straightforward as you
might think. Here's a UML diagram showing the node types and the inheritance hierarchy:
Using the built-in node types

Consider a "MyDocuments" folder that contains a "Personal" folder and a "Status Report.pdf" file. Here's
what those nodes might look like:
The folders look like what you might expect: they have a name, a primary type of nt:folder, and the
jcr:createdBy and jcr:created properties defined by the nt:folder node type. (These properties
are defined as autocreated, meaning the repository should set these automatically.)
The file representation, on the other hand, is different. The "Status Report.pdf" node has a primary type
of nt:file and the jcr:createdBy and jcr:created properties defined by the nt:file node type,
but everything about the content (including the binary file content in the jcr:data property) is actually
stored in the child node named "jcr:content". This may seem odd at first, but actually this design very
nicely separates the file-related information from the content-related information.
Think about how an application might navigate the files and folders in a repository. Using the JCR API, the
application asks for the "MyDocuments" node, so the repository materializes it (and probably its list of
children) from storage. The application then asks for the children, so the repository loads the "Personal"
folder node and the "Status Report.pdf" node, and there's enough information on those nodes for the
application to display relevant information. Note that the "Status Report.pdf" file's content has not yet
been materialized. Only when the application asks for the content of the file (that is, it asks for the
"jcr:content" node) will the content-related information be materialized by the repository. (And some
repository implementations might delay loading the jcr:data binary property until the application asks for
it.) Nice, huh?
Creating folders using the JCR API

Creating folders is pretty straightforward:
// Find the parent node ...

Node myDocuments = session.getNode(pathToMyDocuments);
// Create a new folder node ...
Node personal = myDocuments.addNode("Personal","nt:folder");
// The auto-created properties are added when the session is saved ...
session.save();
// Get the property values that were auto-created ...
String createdBy = personal.getProperty("jcr:createdBy").getString();
Calendar createdAt = personal.getProperty("jcr:created").getDate();
Note how we used the second parameter of the addNode method to specify which node type should be
used as the primary type for the new node. Also, the "jcr:created" and "jcr:createdBy" auto-created
properties defined on the nt:folder node type (actually, inherited from the nt:hierarchyNode supertype,
which inherits them from the mix:created mixin type) are set by the repository when the session is saved.
Uploading files using the JCR API

The only tricky part of adding files to the repository is properly creating the two-node pattern for
nt:file nodes. Here's the code that uploads a file represented by a java.io.File object:
// Find the parent node ...

Node folder = ...
// Assume that we have a file that exists and can be read ...
File file = ...
// Determine the last-modified value of the file (if important) ...
Calendar lastModified = Calendar.getInstance();
lastModified.setTimeInMillis(file.lastModified());
// Create a buffered input stream for the file's contents ...
InputStream stream = new BufferedInputStream(new FileInputStream(file));
// Create an 'nt:file' node at the supplied path ...
Node fileNode = folder.addNode(file.getName(),"nt:file");
// Upload the file to that node ...
Node contentNode = fileNode.addNode("jcr:content", "nt:resource");
Binary binary = session.getValueFactory().createBinary(stream);
contentNode.setProperty("jcr:data", binary);
contentNode.setProperty("jcr:lastModified",lastModified);
// Save the session (and auto-create the properties) ...
session.save();
Again, this is not very complicated if you understand the pattern. We first create the nt:file node that
represents the file and its metadata, and under that we create a child node named "jcr:content" (of type
nt:resource) that represents the content of the file. We also explicitly set the "jcr:lastModified"
timestamp to mirror the time the file was last modified (if that's important).
We could try to set the "jcr:createdBy" or "jcr:created" properties till we're blue in the face,
but JCR always sets the time of these auto-created properties when newly-created nodes are
saved, and JCR will always overwrite the values we set.
Adding other properties

Another interesting aspect of the nt:file and nt:folder node types (and even the nt:resource node
type) is that they don't allow adding just any property on the node. The beauty is that they don't have to,
because you can still add extra properties to these nodes using mixins!
Let's imagine that we want to add tags to our file and folder nodes, and that we want to start capturing the
SHA-1 checksum (as a hexadecimal string) of our files. To start, we need to create two mixins, which we'll
define using the standard CND format:
<acme = "http://www.acme.com/nodetypes/1.0">
[acme:taggable] mixin
- acme:tags (STRING) multiple
[acme:checksum] mixin
- acme:sha1 (STRING) mandatory
We could have defined a mixin that allows any single or multi-valued property, similar to how the
standard nt:unstructured node type is defined. Then, we can add any properties we want.
However, it's sometimes better to use more targeted mixins like these, if for no other reason than it
makes it very easy to use JCR-SQL2 to query the nodes that use these mixins.
We then need to register these node types in our repository (perhaps by loading the CND file or
programmatically using the NodeTypeManager). Then, we can add the acme:taggable mixin to whatever
file and folder nodes we want. This is as simple as:
// Find the node ...

Node personalFolder = myDocuments.getNode("Personal");
// Add the mixin and set the "acme:tags" property on the "Personal" folder ...
personalFolder.addMixin("acme:taggable");
String[] tags = {"non-work"};
personalFolder.setProperty("acme:tags",tags);
// Add the tags to the "Status Report.pdf" node ...
Node statusReport = myDocuments.getNode("Status Report.pdf");
statusReport.addMixin("acme:taggable");
statusReport.setProperty("acme:tags",new String[] {"status", "projectX"});
// Add the SHA-1 hash to the "jcr:content" node ...
Node content = statusReport.getNode("jcr:content");
content.addMixin("acme:checksum");
content.setProperty("acme:sha1","e676b12c3ebfb1");
// Save the changes ...
session.save();
The result is something like this, where the new properties are shown in a boldface font:
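The example above sets "acme:sha1" to a placeholder string. As a minimal sketch of how the hexadecimal SHA-1 value could be computed from a file's content before it is stored, the following uses only the JDK's MessageDigest; the class and method names (Sha1Hex, sha1Hex) are hypothetical and are not part of JCR or ModeShape.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that computes a SHA-1 checksum as a lowercase hexadecimal string. */
final class Sha1Hex {

    /** Reads the stream to its end and returns the SHA-1 digest in hexadecimal form. */
    static String sha1Hex(InputStream stream) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide
            throw new IllegalStateException(e);
        }
        byte[] buffer = new byte[8192];
        try {
            int read;
            while ((read = stream.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            // Mask to an unsigned value so that each byte becomes exactly two hex digits
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream("abc".getBytes(StandardCharsets.US_ASCII));
        // prints a9993e364706816aba3e25717850c26c9cd0d89d
        System.out.println(sha1Hex(in));
    }
}
```

The returned string could then be passed to content.setProperty("acme:sha1", ...). Note that the helper consumes the stream, so the file has to be opened a second time (or buffered) if the same content is also being uploaded.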
Reading the content

Reading the file and folder information is simpler than writing it. The key is to remember that the "file"
information is broken into separate nodes.
// Find the node ...

Node statusReport = myDocuments.getNode("Status Report.pdf");
// Get the created information (we'll assume it's there) ...
Calendar created = statusReport.getProperty("jcr:created").getDate();
String creator = statusReport.getProperty("jcr:createdBy").getString();
// Now get the content of the file ...
Node statusReportContent = statusReport.getNode("jcr:content");
// Get the MIME type if it's there ...
String mimeType = null;
if ( statusReportContent.hasProperty("jcr:mimeType") ) {
mimeType = statusReportContent.getProperty("jcr:mimeType").getString();
}
Binary content = statusReportContent.getProperty("jcr:data").getBinary();
InputStream stream = content.getInputStream();
try {
// do something with the stream
} finally {
stream.close();
}
If you added mixins and other properties to these nodes, they're accessible just like any other property.
Extending the built-in node types

We showed above how to use mixins to store extra properties on nodes that use primary types that don't
allow just any property. While that may be a preferred way to do it, it's not the only way. Another approach is
to create custom node types that we'd use in place of nt:folder, nt:file and nt:resource. Here's a
CND fragment showing the new types:
<acme = "http://www.acme.com/nodetypes/1.0">
[acme:file] > nt:file
- acme:tags (STRING) multiple
[acme:folder] > nt:folder
- acme:tags (STRING) multiple
[acme:resource] > nt:resource
- acme:sha1 (STRING) mandatory
Then in our code we could simply use "acme:file" in place of "nt:file", and "acme:folder" in place of
"nt:folder", and "acme:resource" in place of "nt:resource".
But there are several pretty big disadvantages to this approach:
1. Any applications that are expecting nt:file and nt:folder nodes might break. Granted, such
apps would have been hard coded to expect a particular set of content. But someone may have taken
a shortcut.
2. Notice that we've had to define three node types, even though both the acme:file and
acme:folder types have exactly the same property. When we used mixins, we only created two
mixins, and we could use them anywhere.
3. The new node types don't really mirror a characteristic or facet (e.g., something is "taggable" or "has
a SHA-1 hash"), whereas the mixins did exactly that and we could use them anywhere it made sense.
4. Every acme:file has an optional "acme:tags" property, whether or not that node needs it. And so
we have to make the decision up front whether a file should be represented by an acme:file node
or a standard nt:file node. With mixins, we can add the acme:taggable mixin only when we need to
add tags to a node.
There is one benefit to defining node types in this way: when using JCR-SQL2 queries, we can use a simple
query to find all the properties of acme:file nodes:
SELECT file.[jcr:createdBy], file.[acme:tags], file.[jcr:name]

FROM [acme:file] AS file
If we had used mixins with the standard nt:file type, we could still get the same information, but our query
would have to use a join:
SELECT file.[jcr:createdBy], taggable.[acme:tags], file.[jcr:name]

FROM [nt:file] AS file JOIN [acme:taggable] AS taggable
ON ISSAMENODE(file,taggable)
That's certainly not much more complicated, and in general the benefits of using mixins far outweigh the
slightly-increased complexity of JCR-SQL2 queries.
1.11.3 Mixin characteristics with mixins

1.11.4 Prefer hierarchies
1.11.5 Sessions in web applications

Using ModeShape in web applications is a powerful way to persist information, especially since the
hierarchical nature of ModeShape maps very well to URLs. How your web application uses Sessions
depends a little on how the web application works. But the usual rules about not sharing a Session among multiple
threads and avoiding long-running Sessions still apply here.
Stateless web applications and services
Stateful web applications
Stateless web applications and services

It's very easy to use sessions in web applications or services that are stateless: simply obtain a new
Session for each request, use it to read/change content, save any changes, and then close the session.
Creating a new Session is very lightweight, and it's easier and likely faster than trying to maintain a pool of
active Sessions. It also means your application or service can authenticate the
Session with the user making the request.
If you can make your web application or service stateless, then just do it. Stateless web apps are
always simpler and easier to scale.
Stateful web applications

In stateful web applications, clients can interact with the server across multiple HTTP requests. These
kinds of applications are more complicated, but very useful if you need this behavior. If possible, try
to design your web application so that (just like stateless web applications) the application handles each
HTTP request by using a new Session, reading/changing content, saving any changes, and then closing
the session.
Sometimes that won't be possible, and you'll need to associate a single Session with the HTTP session. In
this case, each HTTP request that is part of the same HTTP session will read/change content, and one (or
more) of the requests will ultimately save any changes. Be sure to always close the Session when the
HTTP session ends (this can be tricky).
Be aware, however, that Session instances are not serializable, so using a single Session for each
HTTP session is really only useful if you're using session affinity. That way, all client requests that are part of
the same HTTP session are handled by the same server (and thus the same Session instance).
1.11.6 Verify supported features

Not every JCR repository supports all features. This may be because the particular implementation you've
chosen has not implemented every feature, or a particular repository may have certain features disabled. If
your application assumes certain behaviors, then it will break if the repository is configured differently than
expected or if another JCR implementation is used.
Another JCR best practice is for your application to use the repository descriptors to determine which of the
features/behaviors used by the application are indeed available and supported. Your application can even
gracefully degrade its functionality based upon the results.
Typically it is sufficient for the application to check the descriptors only the first time it obtains the
Repository instance. However, be aware that some repositories (including ModeShape) may
change the value of some descriptors as features are dynamically enabled or disabled through
implementation-specific means.
Examples
The following code shows how to check whether a repository supports locking nodes:

boolean supportLocks =
repository.getDescriptorValue(Repository.OPTION_LOCKING_SUPPORTED).getBoolean();
If not, the parts of your application that rely upon locking might be disabled.
Another example is determining the set of query languages supported by a repository:

Value[] languages = repository.getDescriptorValues(Repository.QUERY_LANGUAGES);
Set<String> supportedLangs = new HashSet<String>();
for ( Value language : languages ) {
supportedLangs.add(language.getString());
}
There are over 50 standard descriptors that expose a wide variety of functionality. Check the specification or
JavaDocs for more details.
1.11.7 Import and export

The JCR API includes mechanisms for importing and exporting all or some of the content of a JCR
workspace from/to XML files. The API defines two XML formats to choose from: the Document View
and the System View.
Document View
The Document View maps each JCR node to an XML element, and each JCR property to an XML attribute.
The advantage of this format is that the XML structure looks very similar to the JCR content.
However, there are several disadvantages. First, because JCR properties are mapped to XML attributes
(which can only have a single value), this format is not capable of accurately representing
multi-valued properties. Most implementations basically just export the first value of a multi-valued property.
The second disadvantage is that Document View does not handle large binary values well. Again, this is
because the Document View maps properties to XML attributes, and very large XML attribute values
generally pose problems for many XML libraries.
In general, avoid using the Document View for reliably exporting content unless your application
requires this view. Never use it for backups of a repository.
System View
The System View treats all JCR nodes and properties as XML elements. So although this is generally more
verbose than the Document View, it is the only view that can accurately and efficiently represent any
repository content, including large binary values and multi-valued properties.
Always use the System View when transferring content from one repository to another. See the
ModeShape backup utility for backing up repository content.
1.11.8 Sessions and Listeners

JCR observation allows a client to register listeners with a Session, and any time changes are persisted to
the repository the listeners will be notified asynchronously. However, listeners are only valid during the
lifetime of the session with which they're registered. So how can an application create and register a single
long-lived listener? What are the common patterns for registering listeners and then responding to the
events?
Use a single "listener" Session

The simple approach is to create a single long-lived Session that is used only to register the application's
listener(s). This session should be created as soon as the application needs to start listening for events, and
is kept alive until the application no longer needs to respond to events. This can be done by any thread, but
generally the application will hold onto this session in a single application-wide location.
It is important, however, that this "listener session" not be used for any other purposes. As we'll see shortly,
the listeners will be notified in separate threads, but the JCR specification states that Sessions do not
need to be thread-safe. If each listener were to hold onto the session with which it's registered and then use
that session to access the workspace content, then multiple listeners might be notified concurrently in
separate threads and would attempt to concurrently use the session.
ModeShape sessions, nodes, and properties are all thread-safe, so it's perfectly safe to use the
same session to concurrently read content in separate threads. Just keep in mind that other
implementations will likely be more restrictive.
Responding to events
Each time a set of changes is persisted to the workspace used by the session, the repository will
asynchronously call each listener with the set of changes. The JCR specification does not specify how or in
what order the listeners will be notified, other than that a single listener will always see events in the same
order that they were generated.
Each listener should process the event and return as quickly as possible.
The JCR events usually contain enough information to be directly usable without having to look up other
information. So the safest and most efficient listeners can just use the information in the event without
needing a session.
This isn't always enough. Some listeners may need to respond to an event by reading workspace content. In
these situations, it's best for such listeners to use the information in the events to create and enqueue a job
that will be performed by a worker pool rather than doing the work within the listener's methods. If the worker
needs a session, then it should create a new one to do the work.
If the response to a change involves much work (like processing additional content), enqueue the
work and use a separate worker pool to perform the work.
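The enqueue-and-return pattern described above can be sketched with nothing but java.util.concurrent. All names below (EnqueueingListenerSketch, ChangeJob, onChange, demo) are hypothetical; in a real application the onChange logic would sit inside the listener's event callback and would copy the path and event type out of each JCR event, and the handler would log in with its own Session rather than append to a list.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Sketch of a listener that only enqueues work; all names here are hypothetical. */
final class EnqueueingListenerSketch {

    /** Plain data copied out of an event, so that the worker needs no reference to it. */
    static final class ChangeJob {
        final String path;
        final String kind;

        ChangeJob(String path, String kind) {
            this.path = path;
            this.kind = kind;
        }
    }

    private final ExecutorService workers;
    private final Consumer<ChangeJob> handler;

    EnqueueingListenerSketch(ExecutorService workers, Consumer<ChangeJob> handler) {
        this.workers = workers;
        this.handler = handler;
    }

    /** What a listener's callback would do per event: copy the data, enqueue, return. */
    void onChange(String path, String kind) {
        ChangeJob job = new ChangeJob(path, kind);
        workers.submit(() -> handler.accept(job));
    }

    /** Runs two fake events through a two-thread pool and returns the processed paths. */
    static List<String> demo() {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        List<String> processed = new CopyOnWriteArrayList<>();
        // A real handler would log in with its own session, do the work, and log out
        EnqueueingListenerSketch listener =
            new EnqueueingListenerSketch(workers, job -> processed.add(job.path));
        listener.onChange("/MyDocuments/Personal", "NODE_ADDED");
        listener.onChange("/MyDocuments/Status Report.pdf", "NODE_ADDED");
        workers.shutdown(); // already-submitted jobs still run
        try {
            workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(demo().size() + " jobs processed");
    }
}
```

Because the jobs run on pool threads, their completion order is not guaranteed; a worker that depends on event order would need a single-threaded executor or its own sequencing.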
Filtering events
Many applications simply register a single listener and then process the events. If the listener is evaluating
each change and doing different work based upon which nodes changed, then consider registering multiple
listeners with different filter criteria. Or better yet, try it both ways to see which is the fastest.
Keep your listeners simple by defining multiple listener implementations and registering them with
different filter criteria.
2 Introduction to ModeShape
2.1 Architecture
ModeShape is intended to be an embeddable hierarchical data store, and as such we try to ensure that it
makes judicious use of third party dependencies and is packaged in such a way as to have a relatively small
number of separate artifacts but also enough artifacts to bring in only the necessary dependencies.
See also the descriptions of the concepts in ModeShape.
ModeShape engine
Repository configuration
Clustering
Storage
Indexing
Modules
Public APIs
Core modules
Sequencers
Connectors
Web APIs
JDBC driver
Deployment
Examples
Quickstarts
2.1.1 ModeShape engine

Perhaps the most important component in ModeShape is the engine, which is responsible for managing and
making available all of the configured repositories. When ModeShape is embedded into an application, the
application is better off manually instantiating the org.modeshape.jcr.ModeShapeEngine class and
explicitly invoking the start(), deployRepository(...) and shutdown() methods in appropriate
places within the application's own lifecycle. Note that repository configurations can be updated even when
the repository is running and in use. ModeShape can also be deployed to a server (e.g., JBoss EAP,
Tomcat, etc.) so that the server manages the lifecycle of the engine.
Every repository in a ModeShapeEngine instance has a unique name, and applications can easily use the
engine to get a particular repository by name. If used within an environment that has JNDI, ModeShape will
also register each repository into JNDI so that applications can easily look it up. See the documentation for
all the ways to find a repository.
2.1.2 Repository configuration

Each repository is configured separately with a file that conforms to the JSON format. (Note that when
installed into JBoss EAP, configuring ModeShape is done through EAP's configuration system.) The
configuration files can be read with the org.modeshape.jcr.RepositoryConfiguration class, and
the resulting RepositoryConfiguration instances can be passed to the
ModeShapeEngine.deployRepository(...) and ModeShapeEngine.updateRepository(...)
methods.
2.1.3 Clustering
ModeShape can be clustered at the repository level. This means that a repository with the same name is
deployed to multiple engines (typically in separate processes), and those repository instances are aware of
each other so that events that originate in one repository instance will be forwarded to all other repository
instances in the cluster. Additionally, the Infinispan cache(s) used in each repository should also be
clustered, so that Infinispan can coordinate changes to the data stored in the cache(s).
There are two other important aspects of clustering: storage and indexing.
Storage
If the Infinispan caches use cache stores to persist content to the filesystem, a database, cloud storage, etc.,
then this storage must be compatible with clustering. For example, if the cache stores content on the file
system, then the cache used by each repository instance needs to have its own non-shared directory in
which the cache can persist information. (Infinispan clustering will use network messaging to ensure that
multiple instances that "own" a particular piece of data are all kept in sync.) Some of Infinispan's cache
stores are sharable, which means that multiple instances can all share a single store.
Indexing
Each repository instance uses indexes to help answer queries. When clustering a repository, the repository
needs to know whether it owns the indexes (in which case the repository will update the indexes to reflect all
changes that originate from the local or remote repository instances) or whether indexes are shared (in
which case the repository will update the indexes only for changes that are made with that repository
instance). Note that even in the shared case, the index files might be local copies that are periodically cloned
from a master set.
Local indexes are much easier to configure, but the disadvantage is that every repository is essentially
updating its own indexes for every change (so there is duplicate work). This might cause a write-heavy
system to become inundated with changes.
Shareable indexes are more difficult to configure (they require the use and proper configuration of JMS
and/or JGroups), but are generally more capable of handling large amounts of updates.
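The difference between the two indexing modes comes down to which changes a repository instance writes to its indexes. The following is a minimal sketch of that rule only; the class and method names are hypothetical and are not part of ModeShape's configuration or API.

```java
/** Illustrative only: which changes an instance indexes in each mode (not ModeShape code). */
final class IndexUpdatePolicy {

    /**
     * @param indexesShared true if the indexes are shared, false if this instance owns local indexes
     * @param changeIsLocal true if the change originated in this repository instance
     * @return true if this instance should write the change to its indexes
     */
    static boolean shouldUpdateIndexes(boolean indexesShared, boolean changeIsLocal) {
        // Local indexes: apply every change, local or remote.
        // Shared indexes: apply only the changes made with this instance.
        return !indexesShared || changeIsLocal;
    }

    public static void main(String[] args) {
        System.out.println("local indexes, remote change:  " + shouldUpdateIndexes(false, false));
        System.out.println("shared indexes, remote change: " + shouldUpdateIndexes(true, false));
        System.out.println("shared indexes, local change:  " + shouldUpdateIndexes(true, true));
    }
}
```

With local indexes every instance repeats the indexing work for every change, which is the duplicate work mentioned above.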
2.1.4 Modules
The ModeShape software is in the form of JAR files, and these JAR files are published in the Maven Central
Repository for simpler application development. They are also available as a ZIP distribution from our
downloads area.
Please see our Getting Started page for more detail on how to use ModeShape in your
Maven-based application.
There are separate ModeShape JARs (or Maven artifacts) for each major component:
Public APIs
javax.jcr - This is the standard JCR 2.0 API, and it actually is not in our codebase but is available
in Maven. It has no dependencies.
modeshape-jcr-api - ModeShape's small extension to the standard JCR 2.0 API. This public API
was meant to be used by client applications that already use the JCR API, but it is entirely optional.
Many of the interfaces here extend standard interfaces and add just one or a few useful methods, so
most of the time clients can cast standard JCR instances to these ModeShape interfaces only when
they need a ModeShape-specific method. A few interfaces are new concepts that clients might need
to access. It only depends on the JCR API JAR. Note that the ModeShape team will only modify the
public API in a backward-compatible fashion: while some methods might be deprecated at any time
(though we don't anticipate doing so), changes that are not backward compatible (e.g., removal of
deprecated methods) will only occur on major releases. This module also defines the Sequencer SPI,
since sequencer implementations only need the JCR API and this public API.
Core modules
modeshape-common - A simple set of domain-independent utilities and classes that are available for
use in any other module. Some of these might be similar to those available in other third-party
libraries, but were created and are maintained here to help minimize third-party dependencies
(especially when only small fractions of the third-party libraries would be used). This includes
ModeShape's framework for internationalization (I18n) and the logging framework that is a slight
facade on top of several other logging systems, including SLF4J, Log4J, Logback, and JDK logging. Sure,
SLF4J is already a logging abstraction framework, but using our own abstraction makes it easier for
developers to hook up ModeShape logging to their preferred framework (just include the appropriate
logging JAR on the classpath, or fall back to JDK logging) and it also allows ModeShape to enforce
using only internationalized logging messages (except for debug and trace, which just take string
messages). Therefore, this module has no required dependencies, but will use one of the logging
frameworks if it is available on the classpath.
modeshape-schematic - A library for working with JSON and BSON documents, for storing them
inside Infinispan, and for editing them in a way that allows the changes to be recorded as a set of
changes to the documents and then applied atomically. (The latter is what distinguishes this library
from other JSON or BSON libraries.) Supports reading a document from JSON and/or BSON, and
writing a document to JSON and/or BSON. ModeShape stores each node as a document inside
Infinispan, and this library encapsulates all of the domain-independent logic for doing this. The module
depends on several Infinispan artifacts.
modeshape-jcr - The primary module that contains the ModeShape engine and implementations of
the standard JCR API and ModeShape's public API. It also defines several SPIs, including the
Connector SPI (for federation) and the BinaryStore SPI (for storing binary values). It contains the file
system connector and CND sequencer (since neither is dependent upon any other libraries and thus
both are too simple to be distinct artifacts).
Sequencers
All of the sequencer artifacts are named in a similar way: modeshape-sequencer-name. For example, the
DDL sequencer is in the modeshape-sequencer-ddl module, while the WSDL sequencer is in the
modeshape-sequencer-wsdl module.
The use of sequencers in a repository is entirely optional. And because nearly all of the sequencers depend
upon third-party libraries, we've put each sequencer into a separate artifact so that only the required
dependencies are included.
Connectors
All of the connector artifacts are named in a similar way: modeshape-connector-name. For example, the
Git connector is in the modeshape-connector-git module, while the CMIS connector is in the
modeshape-connector-cmis module.
The use of federation (and thus connectors) in a repository is entirely optional. And because nearly all of the
connectors depend upon third-party libraries, we've put each connector into a separate artifact so that only
the required dependencies are included.
Web APIs
ModeShape has a number of web-based APIs that may optionally be used by remote clients to interact with
one or more repositories.
REST Service - a RESTful service that enables navigating, searching, modifying and deleting nearly
any content in the repositories (see the detailed API documentation). All representations are in JSON,
XML or text form. Each operation creates a new session, fulfills the request, and then closes the
session; sessions longer than a single request are not possible. Versioned content can be
manipulated: if it is changed, it is checked out, modified, saved, and checked back in. However, the
rest of the JCR functionality is not available. The WAR file is named
modeshape-web-jcr-rest-war-<version>.war.
WebDAV Service - exposes content via WebDAV, enabling WebDAV clients and operating systems
to mount the repositories as network disk drives. This service exposes a small amount of ModeShape
functionality, and allows clients to basically navigate, download, and upload files and folders. The
WAR file is named modeshape-web-jcr-webdav-war-<version>.war.
CMIS Service - exposes an API that conforms to CMIS. The CMIS functionality exposes the ability to
navigate, download, and upload folders and CMIS documents. The WAR file is named
modeshape-web-jcr-cmis-war-<version>.war.
Each of these services can be independently deployed to a web or application server in which
ModeShape is running. Each service talks to a single (local) ModeShapeEngine instance (typically
found via JNDI) and will work with all of the repositories deployed to that engine.
JDBC driver
ModeShape supports several query languages to allow client applications to find content independent of its
hierarchical location. The JCR-SQL2 language is by far the most powerful, and ModeShape provides a
JDBC driver that applications can use to query a repository (running in the same process or in a remote
process where the REST service is available). The driver JAR is self-contained, making it pretty easy to
incorporate into existing JDBC-aware applications. See this page for more detail.
Deployment
In addition to embedding within a simple Java application, ModeShape can also be deployed to web or
application servers, and there are a number of other modules that support this:
JCA Adapter - A JCA-compliant adapter that enables deploying a ModeShape engine into a
JCA-compatible server. See this page for details.
JBoss EAP - A kit that can be unzipped on top of a JBoss EAP 6.1 installation to install a custom
subsystem that will manage a ModeShape engine and that makes it possible to configure
ModeShape, Infinispan, JGroups, security, and JDBC data sources all from within the standard EAP
configuration mechanism. Additionally, ModeShape can be used in any EJB or Java EE component,
and can even participate in user-managed or container-managed transactions. See this page for
details.
Examples
ModeShape provides a number of very simple examples that showcase how ModeShape can be used within
Maven-based applications. Each of these applications is targeted to show a very specific piece of
functionality (e.g., how to use Log4J logging). See https://github.com/ModeShape/modeshape-examples for
more details.
Quickstarts
ModeShape provides a Git repository of "quickstarts" that work with ModeShape installed on top of JBoss
EAP. See https://github.com/ModeShape/quickstart for details.
2.2 Authentication and authorization

ModeShape delegates all authentication and authorization to the providers with which a repository is
configured. ModeShape includes a few providers out-of-the-box, but it is also possible to create custom
authentication and/or authorization providers.
One exception is the Access Control feature added in ModeShape 3.4, which provides a way to use the
standard JCR API to define node-level access control lists (ACLs) that ModeShape will use to augment the
normal authorization mechanism. These fine-grained access controls are handled entirely within ModeShape,
stored within the normal repository content, and built on top of the existing authentication and authorization
providers.
This section describes how ModeShape's authentication and authorization features work.
Authentication and authorization

Anonymous sessions
Built-in providers
JAAS
Configuration
Servlet authentication
Custom providers
Access controls
Privileges
Principals
Access Control Policies (ACLs)
2.2.1 Authentication and authorization

In order to create a Session, a client application must authenticate its identity by logging in and providing
a javax.jcr.Credentials instance. ModeShape passes these credentials to a series of
AuthenticationProvider components. The first provider to accept the credentials will result in
ModeShape authenticating the caller and returning a valid Session.
The authenticating provider, as part of the authentication step, returns an internal SecurityContext that is
associated with that session. This SecurityContext is then used to determine whether the session is
authorized to read, write, or administer the repository. These are coarse-grained roles that apply to all
content; for example, if a session only has the read role, then it can read all repository content but cannot
write or administer any content.
The names of the three roles are "readonly", "readwrite", and "admin".
Anonymous sessions
ModeShape does make it possible for clients to create anonymous sessions. These are never authenticated,
and they generally are given only the "readonly" role. Of course, you can choose to configure anonymous
sessions to use any of the three roles, though be careful granting more than "readonly".
When a client attempts to authenticate normally by supplying credentials, should that authentication fail the
repository can do one of two things:
fail by throwing an exception
return an anonymous session
This is often useful in applications that want to always provide at least some read-only functionality for all
users.
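The provider chain and the anonymous fallback described above can be pictured with a small, self-contained sketch. It deliberately does not use ModeShape's classes: the Provider interface, the AuthChain class and its anonymousOnFailure flag are hypothetical stand-ins, and a plain set of role names stands in for the SecurityContext.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical stand-in for an authentication provider; not ModeShape's actual SPI. */
interface Provider {
    /** Returns the caller's roles, or empty if this provider does not accept the credentials. */
    Optional<Set<String>> authenticate(String username, String password);
}

final class AuthChain {
    private final List<Provider> providers;
    private final boolean anonymousOnFailure; // assumed switch: fall back instead of failing
    private final Set<String> anonymousRoles;

    AuthChain(List<Provider> providers, boolean anonymousOnFailure, Set<String> anonymousRoles) {
        this.providers = providers;
        this.anonymousOnFailure = anonymousOnFailure;
        this.anonymousRoles = anonymousRoles;
    }

    /** Asks each provider in order; the first one to accept the credentials wins. */
    Set<String> login(String username, String password) {
        for (Provider provider : providers) {
            Optional<Set<String>> roles = provider.authenticate(username, password);
            if (roles.isPresent()) {
                return roles.get();
            }
        }
        if (anonymousOnFailure) {
            return anonymousRoles; // option 2: return an anonymous session
        }
        throw new SecurityException("Unable to authenticate '" + username + "'"); // option 1: fail
    }

    public static void main(String[] args) {
        Provider rejectAll = (user, pwd) -> Optional.empty();
        Provider fileBased = (user, pwd) -> "jsmith".equals(user) && "secret".equals(pwd)
                ? Optional.of(Set.of("readwrite"))
                : Optional.empty();
        AuthChain chain = new AuthChain(List.of(rejectAll, fileBased), true, Set.of("readonly"));
        System.out.println("jsmith roles: " + chain.login("jsmith", "secret"));
        System.out.println("guest roles:  " + chain.login("guest", "wrong"));
    }
}
```

Constructing the chain with the flag set to false makes the failed login throw instead, which corresponds to the first of the two behaviors listed above.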
Built-in providers
ModeShape includes a few authentication providers:
JAAS
The "org.modeshape.jcr.security.JaasProvider" class is configured to use a specific JAAS policy
to perform all authentication and role-based authorization. This is the easiest to use, since most application
servers will come with JAAS support and even Java SE applications can pretty easily set up one of the
available JAAS implementations.
If no providers are explicitly configured, the JAAS provider is automatically enabled with the
"modeshape-jcr" policy.
Configuration
Each JAAS implementation will be configured differently. In the case of the PicketBox implementation,
configuration is done via a "jaas.conf.xml" file on the classpath. There are quite a few modules to
choose from, including LDAP, database, XACML, and even a simple file-based option. Here is an example
of a "jaas.conf.xml" file that uses the users and roles defined in local files:
<?xml version='1.0'?>
<policy xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:jboss:security-config:5.0" xmlns="urn:jboss:security-config:5.0">
    <application-policy name="modeshape-jcr">
        <authentication>
            <login-module code="org.jboss.security.auth.spi.UsersRolesLoginModule" flag="required">
                <module-option name="usersProperties">security/users.properties</module-option>
                <module-option name="rolesProperties">security/roles.properties</module-option>
            </login-module>
        </authentication>
    </application-policy>
</policy>
This file sets up a JAAS policy named "modeshape-jcr" that uses the User-Roles Login Module, and
defines the users and passwords in the "security/users.properties" file and the roles in the
"security/roles.properties" file.
The users file contains a line for each user, of the form "username=password". The roles file also contains
a line for each user, but this format is a little more complicated:
<username>=<role>[,<role>,...]
where:
<username> is the name of the user,
<role> is an expression describing a role for the user, which adheres to the format
"<roleName>[.<workspaceName>]", where:
    <roleName> is one of "admin", "readonly", "readwrite", or (for WebDAV and RESTful
    access) "connect"
    <workspaceName> is the name of the repository workspace to which the role is granted; if
    absent, the role will be granted for all workspaces in the repository
For example, the following line provides all roles to user 'jsmith' for all workspaces in the configured
repository:
jsmith=admin,connect,readonly,readwrite
while
jsmith=connect,readonly,readwrite.ws1
provides connect and read access to all workspaces, but write access only to the "ws1" workspace.
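To make the role expression format concrete, here is a minimal sketch that parses one line of a roles file and answers "does this user hold this role in this workspace?". It is illustrative only: the RoleLine and Grant classes are hypothetical and are not how ModeShape or the JAAS login module read the file, and the sketch assumes that the first '.' in an expression separates the role name from the workspace name.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative parser for one line of a roles file; not part of ModeShape. */
final class RoleLine {

    /** One role expression: a role name, optionally limited to a single workspace. */
    static final class Grant {
        final String roleName;
        final String workspaceName; // null means "all workspaces"

        Grant(String roleName, String workspaceName) {
            this.roleName = roleName;
            this.workspaceName = workspaceName;
        }
    }

    final String username;
    final List<Grant> grants = new ArrayList<>();

    RoleLine(String line) {
        int eq = line.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("Expected <username>=<role>[,<role>,...]: " + line);
        }
        username = line.substring(0, eq).trim();
        for (String expr : line.substring(eq + 1).split(",")) {
            expr = expr.trim();
            if (expr.isEmpty()) continue;
            // Assumption: the first '.' separates the role name from the workspace name
            int dot = expr.indexOf('.');
            grants.add(dot < 0 ? new Grant(expr, null)
                               : new Grant(expr.substring(0, dot), expr.substring(dot + 1)));
        }
    }

    /** True if the user holds the role in the given workspace. */
    boolean hasRole(String roleName, String workspace) {
        for (Grant grant : grants) {
            boolean workspaceMatches = grant.workspaceName == null
                    || grant.workspaceName.equals(workspace);
            if (grant.roleName.equals(roleName) && workspaceMatches) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        RoleLine line = new RoleLine("jsmith=connect,readonly,readwrite.ws1");
        System.out.println(line.username + " may write to ws1: " + line.hasRole("readwrite", "ws1"));
        System.out.println(line.username + " may write to ws2: " + line.hasRole("readwrite", "ws2"));
        System.out.println(line.username + " may read ws2:     " + line.hasRole("readonly", "ws2"));
    }
}
```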
When using the JBoss EAP kit, the security mechanism is configured as part of EAP, though some
of the example configuration files in the kit set up the file-based authentication system via
"modeshape-users.properties" and "modeshape-roles.properties" files.
Servlet authentication
Simply configure a repository to use this provider, and then have your applications create an
"org.modeshape.jcr.api.ServletCredentials" instance with the servlet's HttpServletRequest.
ModeShape will then delegate all authentication and role-based authorization to the servlet container. Again,
the roles are expected to be "readonly", "readwrite" and "admin".
If no providers are explicitly configured, the Servlet provider is automatically enabled if the servlet
API is on the classpath.
Custom providers
If you would like to have ModeShape integrate with a different security system, then you will need to create a
custom authorization provider. For more information about this, see the Custom authentication providers
page.
2.2.2 Access controls

Recall that the aforementioned role-based authorizations apply to a whole repository or workspace, and thus
are referred to as coarse-grained authorization. This simple approach is perfectly acceptable for many
applications. However, starting with ModeShape 3.4 it is possible to use fine-grained authorization to
determine what operations are allowed on specific nodes or subtrees. The API to set up and manage these
fine-grained permissions and access control lists is actually part of the standard JCR 2.0 API.
Note that an authenticated user must already have been granted the coarse-grained roles for a repository
before any fine-grained access controls are even evaluated. This means that, for example, even if an
authenticated user is granted a privilege to modify the properties of a node, that means nothing unless the
user has one of the roles that allows writing or changing content. In other words, when using fine-grained
access controls, ModeShape will require that both the coarse-grained and fine-grained authorizations allow
the requested action.
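The "both must allow" rule can be written down as a two-step check. The sketch below is a simplified model, not ModeShape's implementation: the class is hypothetical, it assumes that the "readwrite" and "admin" roles are the ones that permit writing, and it looks only for the literal jcr:modifyProperties privilege rather than expanding aggregates such as jcr:write.

```java
import java.util.Set;

/** Simplified model of coarse-grained plus fine-grained authorization (not ModeShape code). */
final class TwoLevelAuthorization {

    /** Coarse-grained check. Assumption: "readwrite" and "admin" are the roles that allow writes. */
    static boolean rolesAllowWrite(Set<String> roles) {
        return roles.contains("readwrite") || roles.contains("admin");
    }

    /** A property change is allowed only if the role check and the node's ACL both allow it. */
    static boolean canModifyProperties(Set<String> roles, Set<String> privilegesOnNode) {
        return rolesAllowWrite(roles) && privilegesOnNode.contains("jcr:modifyProperties");
    }

    public static void main(String[] args) {
        Set<String> acl = Set.of("jcr:read", "jcr:modifyProperties");
        System.out.println("readonly role:  " + canModifyProperties(Set.of("readonly"), acl));
        System.out.println("readwrite role: " + canModifyProperties(Set.of("readwrite"), acl));
    }
}
```

The first case is the situation described above: the node-level privilege is present, but it has no effect because the session lacks a role that allows writing.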
Privileges
The JCR 2.0 API defines the following privileges:
Privilege                  Description
jcr:read                   The privilege to retrieve a node and get its properties and their values.
jcr:modifyProperties       The privilege to create, remove and modify the values of the properties of
                           a node.
jcr:addChildNodes          The privilege to create child nodes of a node.
jcr:removeNode             The privilege to remove a node.
jcr:removeChildNodes       The privilege to remove child nodes of a node.
                           Actually removing a node requires jcr:removeNode on that
                           node and jcr:removeChildNodes on the parent node.
jcr:write                  An aggregate privilege that contains: jcr:modifyProperties,
                           jcr:addChildNodes, jcr:removeNode, and
                           jcr:removeChildNodes.
jcr:readAccessControl      The privilege to read the access control settings of a node.
jcr:modifyAccessControl    The privilege to modify the access control settings of a node.
jcr:lockManagement         The privilege to lock and unlock a node.
jcr:versionManagement      The privilege to perform versioning operations on a node.
jcr:nodeTypeManagement     The privilege to add and remove mixin node types and change the
                           primary node type of a node.
jcr:retentionManagement    The privilege to perform retention management operations on a node.
jcr:lifecycleManagement    The privilege to perform lifecycle operations on a node.
jcr:all                    An aggregate privilege that contains: jcr:read, jcr:write,
                           jcr:readAccessControl, jcr:modifyAccessControl,
                           jcr:lockManagement, jcr:versionManagement,
                           jcr:nodeTypeManagement, jcr:retentionManagement, and
                           jcr:lifecycleManagement.
See the javax.jcr.security.AccessControlManager API for methods to determine the privileges
supported by the repository on any given node and for manually determining whether the session has
particular privileges on any given node.
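The two aggregate privileges in the table, and the rule that removing a node needs privileges on both the node and its parent, can be modeled with plain collections. This is a sketch for illustration only: the PrivilegeAggregates class is hypothetical and works on privilege names as strings, whereas real code would obtain javax.jcr.security.Privilege instances from the AccessControlManager.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Models the aggregate privileges listed in the table above (illustrative, string-based). */
final class PrivilegeAggregates {

    static final Map<String, List<String>> AGGREGATES = Map.of(
        "jcr:write", List.of("jcr:modifyProperties", "jcr:addChildNodes",
                             "jcr:removeNode", "jcr:removeChildNodes"),
        "jcr:all", List.of("jcr:read", "jcr:write", "jcr:readAccessControl",
                           "jcr:modifyAccessControl", "jcr:lockManagement",
                           "jcr:versionManagement", "jcr:nodeTypeManagement",
                           "jcr:retentionManagement", "jcr:lifecycleManagement"));

    /** Recursively expands a privilege into the non-aggregate privileges it contains. */
    static Set<String> expand(String privilege) {
        Set<String> result = new LinkedHashSet<>();
        List<String> parts = AGGREGATES.get(privilege);
        if (parts == null) {
            result.add(privilege); // not an aggregate
        } else {
            for (String part : parts) result.addAll(expand(part));
        }
        return result;
    }

    /** Expands every privilege in a granted set. */
    static Set<String> expandAll(Set<String> granted) {
        Set<String> result = new LinkedHashSet<>();
        for (String privilege : granted) result.addAll(expand(privilege));
        return result;
    }

    /** Removal needs jcr:removeNode on the node and jcr:removeChildNodes on its parent. */
    static boolean canRemove(Set<String> grantedOnNode, Set<String> grantedOnParent) {
        return expandAll(grantedOnNode).contains("jcr:removeNode")
            && expandAll(grantedOnParent).contains("jcr:removeChildNodes");
    }

    public static void main(String[] args) {
        System.out.println("jcr:write expands to " + expand("jcr:write"));
        System.out.println("jcr:all contains " + expand("jcr:all").size() + " simple privileges");
        System.out.println("write on node, read on parent: "
                + canRemove(Set.of("jcr:write"), Set.of("jcr:read")));
    }
}
```

In the standard API the same information is available from Privilege.isAggregate() and Privilege.getAggregatePrivileges(), so application code does not need to hard-code the table.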
Principals
Privileges are assigned to specific principals, which can either represent usernames or the names of groups.
A principal is represented in the API via the java.security.Principal interface, which can be any
implementation. (ModeShape primarily just uses the principal's name.)
An authenticated user is considered a member of a group if the AuthorizationProvider or
AdvancedAuthorizationProvider implementations return true for hasRole(groupName).
Access Control Policies (ACLs)

The privileges granted to a user can be controlled by assigning an access control policy to nodes. Before the
access to a node can be controlled, however, it must have the "mode:accessControllable" mixin. Each
such node has one or more access control policies to which additional access control entries (i.e.,
principal-privileges pairs) can be added.
For example, the following code fragment shows how to define an access control policy on a specific node
(and its descendants):
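import java.security.Principal;
import javax.jcr.Session;
import javax.jcr.security.AccessControlList;
import javax.jcr.security.AccessControlManager;
import javax.jcr.security.AccessControlPolicyIterator;
import javax.jcr.security.Privilege;

String path = "/Cars/Luxury";
String[] privileges = new String[]{Privilege.JCR_READ, Privilege.JCR_WRITE,
                                   Privilege.JCR_MODIFY_ACCESS_CONTROL};
Principal principal = ... /* any implementation, referring to a username or group name */
Session session = ...
AccessControlManager acm = session.getAccessControlManager();

// Convert the privilege strings to Privilege instances ...
Privilege[] permissions = new Privilege[privileges.length];
for (int i = 0; i < privileges.length; i++) {
    permissions[i] = acm.privilegeFromName(privileges[i]);
}

// Use an applicable (not yet set) policy if there is one;
// otherwise edit the policy that is already set on the node ...
AccessControlList acl = null;
AccessControlPolicyIterator it = acm.getApplicablePolicies(path);
if (it.hasNext()) {
    acl = (AccessControlList)it.nextAccessControlPolicy();
} else {
    acl = (AccessControlList)acm.getPolicies(path)[0];
}

// Add the entry, bind the policy to the node, and persist the change ...
acl.addAccessControlEntry(principal, permissions);
acm.setPolicy(path, acl);
session.save();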
From this point on, when a session is created by authenticating as a user with the supplied principal (e.g.,
username or group membership), then that session will be allowed to read, write and modify access controls
on the "/Cars/Luxury" node or its descendants (unless otherwise restricted with access controls). Again,
this presumes that the authenticated session already has the coarse-grained roles for reading and writing
content in this particular workspace.
Creating an access control entry for a principal that does not exist is not useful, but it is not
dangerous, either. Evaluation of access controls requires that the entry match the current session's
username or roles (for groups); other principals are never considered.
See the javax.jcr.security.AccessControlManager API and the JSR-283 specification for more information
about defining and using access control policies.
2.3 Backup and restore

ModeShape contains a backup and restore feature that enables repository administrators to create backups of
an entire repository (even when the repository is in use), and to then restore a repository to the state
reflected by a particular backup. This works regardless of where the repository content is persisted.
There are several reasons why you might want to restore a repository to a previous state, and many are
quite obvious. For example, the application or the process it's running in might stop unexpectedly. Or
perhaps the hardware on which the process is running might fail. Or perhaps the persistent store might have
a catastrophic failure (although surely you're also using the persistent store's backup system, too).
But there are also non-failure related reasons. Backups of a running repository can be used to transfer the
content to a new repository that is perhaps hosted in a different location. It might be possible to manually
transfer the persisted content (e.g., in a database or on the file system), but the process of doing so varies
with different kinds of persistence options. Also, ModeShape can be configured to use a distributed
in-memory data grid that already maintains its own copies for ensuring high availability, and therefore the
data grid might not persist anything to disk. In such cases, the content is stored on the data grid's virtual
heap, and getting access to it without ModeShape may be quite difficult. Or, you may initially configure your
repository to use a particular persistence approach that is suitable given the current needs, but over time the
repository grows and you want to move to a different, more scalable (but perhaps more complex)
persistence approach. Finally, the backup and restore feature can be used to migrate to a new major version
of ModeShape.
In short, you may very well have the need to set the contents of a repository back to an earlier state.
ModeShape's backup and restore feature makes this easy to do.
Getting started
Introducing the RepositoryManager
Creating a backup
Restoring a repository
Migrating from ModeShape 2.8 to 3.0 or 3.1
What's in the backup?
2.3.1 Getting started

Let's walk through the basic process of creating a backup of an existing repository and then restoring the
repository. Both of these steps require an authenticated Session that has administrative privileges. It actually
doesn't matter which workspace the session uses:

javax.jcr.Credentials credentials = ...
String workspaceName = ...
javax.jcr.Session session = repository.login(credentials,workspaceName);
So far, this is basic and standard stuff for any JCR client.
2.3.2 Introducing the RepositoryManager

Each JCR Session instance has its own Workspace object that provides workspace-level functionality and
access to a set of manager interfaces: the VersionManager, NodeTypeManager, ObservationManager,
LockManager, etc. The JSR-333 (aka, JCR 2.1) effort is still incomplete, but has plans to introduce a
RepositoryManager that offers some repository-level functionality. The ModeShape public API has created
such an interface, and accessing it from a standard JCR Session instance is pretty simple:
// Cast the standard JCR session to ModeShape's extended Session interface ...
org.modeshape.jcr.api.Session msSession = (org.modeshape.jcr.api.Session)session;
org.modeshape.jcr.api.RepositoryManager repoMgr =
    msSession.getWorkspace().getRepositoryManager();
The interface is pretty self-explanatory, and defines several methods including two that are related to the
backup and restore feature:
public interface RepositoryManager {

    ...

    /**
     * Begin a backup operation of the entire repository, writing the files
     * associated with the backup to the specified directory on the local
     * file system.
     *
     * The repository must be active when this operation is invoked, and
     * it can continue to be used during backup (e.g., this can be a
     * "live" backup operation), but this is not recommended if the backup
     * will be used as part of a migration to a different version of
     * ModeShape or to a different installation.
     *
     * Multiple backup operations can operate at the same time, so it is
     * the responsibility of the caller to not overload the repository
     * with backup operations.
     *
     * @param backupDirectory the directory on the local file system into
     *        which all backup files will be written; this directory
     *        need not exist, but the process must have write privilege
     *        for this directory
     * @return the problems that occurred during the backup operation
     * @throws AccessDeniedException if the current session does not
     *         have sufficient privileges to perform the backup
     * @throws RepositoryException if the backup cannot be run
     */
    Problems backupRepository( File backupDirectory ) throws RepositoryException;

    /**
     * Begin a restore operation of the entire repository, reading the
     * backup files in the specified directory on the local file system.
     * Upon completion of the restore operation, the repository will be
     * restarted automatically.
     *
     * The repository must be active when this operation is invoked.
     * However, the repository <em>may not</em> be used by any other
     * activities during the restore operation; doing so will likely
     * result in a corrupt repository.
     *
     * It is the responsibility of the caller to ensure that this method
     * is only invoked once; calling multiple times will lead to
     * a corrupt repository.
     *
     * @param backupDirectory the directory on the local file system
     *        in which all backup files exist and were written by a
     *        previous {@link #backupRepository(File) backup operation};
     *        this directory must exist, and the process must have read
     *        privilege for all contents in this directory
     * @return the problems that occurred during the restore operation
     * @throws AccessDeniedException if the current session does not
     *         have sufficient privileges to perform the restore
     * @throws RepositoryException if the restoration cannot be run
     */
    Problems restoreRepository( File backupDirectory ) throws RepositoryException;
}
Next, we'll take a look at each of these two methods.
2.3.3 Creating a backup

The backupRepository(...) method on ModeShape's RepositoryManager interface is used to create a
backup of the entire repository, including all workspaces that existed when the backup was initiated. This
method blocks until the backup is completed, so it is the caller's responsibility to invoke the method
asynchronously if that is desired. When this method is called on a repository that is being actively used, all of
the changes made while the backup process is underway will be included; at some point near the end of the
backup process, however, additional changes will be excluded from the backup. This means that each
backup contains a fully-consistent snapshot of the entire repository as it existed near the time at which the
backup completed.
Here's a code example showing how easy it is to call this method:
org.modeshape.jcr.api.RepositoryManager repoMgr = ...
java.io.File backupDirectory = ...
Problems problems = repoMgr.backupRepository(backupDirectory);
if ( problems.hasProblems() ) {
    System.out.println("Problems backing up the repository:");
    // Report the problems (we'll just print them out) ...
    for ( Problem problem : problems ) {
        System.out.println(problem);
    }
} else {
    System.out.println("The backup was successful");
}
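Because backupRepository(...) blocks, a caller that wants to keep working while the backup runs has to hand the call to another thread. The following is one possible sketch using the JDK's ExecutorService. To stay self-contained it does not call ModeShape: the blockingBackup method is a stand-in for repoMgr.backupRepository(backupDirectory), a list of strings stands in for the returned Problems, and the directory name is made up.

```java
import java.io.File;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Runs a blocking backup call on a background thread (stand-in code, not ModeShape's API). */
final class AsyncBackup {

    /** Stand-in for RepositoryManager.backupRepository(File); returns problem messages. */
    static List<String> blockingBackup(File directory) throws InterruptedException {
        Thread.sleep(50); // simulate a long-running backup
        return List.of(); // an empty list means "no problems"
    }

    /** Submits the blocking call and returns immediately with a handle to the result. */
    static Future<List<String>> startBackup(ExecutorService executor, File directory) {
        Callable<List<String>> task = () -> blockingBackup(directory);
        return executor.submit(task);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            File backupDirectory = new File("backups/nightly"); // hypothetical location
            Future<List<String>> pending = startBackup(executor, backupDirectory);
            // ... the caller is free to do other work here ...
            List<String> problems = pending.get(); // wait for the backup to finish
            System.out.println(problems.isEmpty() ? "The backup was successful"
                                                  : "Problems backing up the repository: " + problems);
        } finally {
            executor.shutdown();
        }
    }
}
```

A single-threaded executor also serializes backup requests, which is one simple way to avoid overloading the repository with concurrent backups.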
Each ModeShape backup is stored on the file system in a directory that contains a series of GZIP-ed files
(each containing representations of approximately 100K nodes) and a subdirectory in which all the large
BINARY values are stored.
It is also the application's responsibility to initiate each backup operation. In other words, there currently is no
way to configure ModeShape to perform backups on a schedule. Doing so would add significant complexity
to ModeShape and the configuration, whereas leaving it to the application lets the application fully control
how and when such backups occur.
2.3.4 Restoring a repository

Once you have a complete backup on disk, you can then restore a repository back to the state captured
within the backup. To do that, simply start a repository (or perhaps a new instance of a repository with a
different configuration) and, before it's used by any applications, load into the new repository all of the
content in the backup. Here's a simple code example that shows how this is done:
org.modeshape.jcr.api.RepositoryManager repoMgr = ...
java.io.File backupDirectory = ...
Problems problems = repoMgr.restoreRepository(backupDirectory);
if ( problems.hasProblems() ) {
    System.out.println("Problems restoring the repository:");
    // Report the problems (we'll just print them out) ...
    for ( Problem problem : problems ) {
        System.out.println(problem);
    }
} else {
    System.out.println("The restoration was successful");
}
Once a restore succeeds, the newly-restored repository will be restarted and will be ready to be used.
2.3.5 Migrating from ModeShape 2.8 to 3.0 or 3.1

As mentioned earlier, backup and restore can be used to migrate from one version of ModeShape to the
next major version of ModeShape. Unfortunately, the backup feature does not exist in any of the 2.x releases, so this
is not currently an option. Instead, content can be migrated via JCR import/export or with a custom
application that walks a 2.x repository and copies content into the 3.x repository.
2.3.6 What's in the backup?

When ModeShape creates a backup in the directory of your choosing, it simply extracts the node
representations and binary values and writes them to files in a directory on the file system. The node
representations are actually Schematic documents, which are in-memory documents that have all the
capability of both JSON and BSON, and can easily be written out in either format without loss of information.
ModeShape performs the following steps when creating a backup:
1. Iterate over all of the Infinispan cache entries, appending a number (e.g., 1000) of node
documents in JSON format to a file. Whenever the maximum number of entries per backup file is
reached, the file is closed, a new file is created, and the appending continues. Note that files are
compressed using GZIP, as the JSON format compresses quite well. By default, 100K nodes will be
exported to a single backup file; if each node required about 200 bytes (compressed), the resulting
files will be about 19 MB in size.
2. Write out each of the binary values to a separate file.
ModeShape uses a naming convention and organization within a single directory so that the restore process can
simply process all of these files and load them into the new repository's Infinispan cache and binary store.
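The first step above (append documents to a compressed file and roll over to a new file once a limit is reached) can be sketched with nothing but the JDK. This is an illustration of the technique only, not ModeShape's backup writer: the RollingBackupWriter class, the one-document-per-line layout and the file naming scheme are all assumptions.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch of rolling, GZIP-compressed backup files. */
final class RollingBackupWriter {

    /**
     * Writes one JSON document per line, starting a new compressed file whenever
     * maxDocsPerFile documents have been written to the current one.
     *
     * @return the files that were created, in order
     */
    static List<Path> write(Iterable<String> jsonDocs, Path dir, int maxDocsPerFile) {
        if (maxDocsPerFile < 1) {
            throw new IllegalArgumentException("maxDocsPerFile must be positive");
        }
        List<Path> files = new ArrayList<>();
        BufferedWriter out = null;
        int inCurrentFile = 0;
        try {
            try {
                for (String doc : jsonDocs) {
                    if (out == null || inCurrentFile == maxDocsPerFile) {
                        if (out != null) out.close();
                        // The naming scheme is an assumption made for this example.
                        Path file = dir.resolve(
                                String.format("documents_%06d.json.gz", files.size() + 1));
                        out = new BufferedWriter(new OutputStreamWriter(
                                new GZIPOutputStream(Files.newOutputStream(file)),
                                StandardCharsets.UTF_8));
                        files.add(file);
                        inCurrentFile = 0;
                    }
                    out.write(doc);
                    out.newLine();
                    inCurrentFile++;
                }
            } finally {
                if (out != null) out.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }
}
```

The size estimate quoted above follows from simple arithmetic: 100,000 nodes at roughly 200 compressed bytes each is 20,000,000 bytes, which is about 19 MB when a megabyte is counted as 1,048,576 bytes.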
2.4 Binary values

A common use for repositories is to manage (among other things) files, so ModeShape 3 is now capable of
handling even extremely large binary values that are larger than available memory. This is because
ModeShape never loads the whole value onto the heap, but instead streams the value to and from the
persistent store. And you can configure where ModeShape stores the binary values independently of where
the rest of the content is stored.
How it works
Extended Binary interface
Importing and Exporting
Implementation design
BinaryValue
BinaryStore
Minimum binary size
Minimum string size
MIME type detection
Text extraction
Garbage collection
BinaryStore implementations
Configuring Binary Stores
Files and Folders
2.4.1 How it works

The key to understanding how ModeShape manages the Binary values is to remember how the JCR API
exposes them. To set a property to a Binary value, the JCR client creates the javax.jcr.Binary instance
from the binary stream:
javax.jcr.Session session = ...

javax.jcr.ValueFactory factory = session.getValueFactory();
// Create the binary value ...
java.io.InputStream stream = ...
javax.jcr.Binary binary = factory.createBinary(stream);
// Use the binary value ...
javax.jcr.Property property = ...
property.setValue(binary);
session.save();
Then, to access the binary content, the JCR client gets the property, gets the binary value(s), and then
obtains the binary value's InputStream:

javax.jcr.Binary binary = property.getValue().getBinary();
java.io.InputStream stream = binary.getStream();
// Use the stream ...
When ModeShape creates the actual javax.jcr.Binary value, it reads the supplied
java.io.InputStream and immediately stores the content in the repository's binary storage area, which
then returns a Binary instance that contains a pointer to the persisted binary content. When needed, that
Binary instance (or another one obtained at a later time) obtains from the binary storage area the
InputStream for the content and simply returns it.
Note that the same Binary value can be read from one property and set on any other properties:
javax.jcr.Session session = ...

// Get the Binary value from one property ...
javax.jcr.Property sourceProperty = ...
javax.jcr.Binary binary = sourceProperty.getValue().getBinary();
// And set it as the value for other properties ...
javax.jcr.Property otherProperty = ...
otherProperty.setValue(binary);
session.save();
This works because the Binary value contains only the pointer to the binary content, so copying or reusing the
Binary objects is very efficient and lightweight. It also works because of what ModeShape uses for the
pointers.
ModeShape stores all binary content by its SHA-1 hash. The SHA-1 cryptographic hash function is not used
for security purposes, but is instead used because the SHA-1 can reliably be determined entirely from the
content itself, and because, for all practical purposes, two binary contents will have the same SHA-1 only if they are identical.
Thus, the SHA-1 hash of some binary content serves as an excellent key for storing and referencing that
content. The pointer we mentioned in the previous paragraph is merely the SHA-1 of the binary content. The
following diagram represents how this works:
Using the SHA-1 hash as the identifier for the binary content also means that ModeShape never needs to
store a given binary content more than once, no matter how many nodes or properties refer to it. It also
means that if your JCR client already knows (or can compute) the SHA-1 of a large value, the JCR client can
use ModeShape-specific APIs to easily determine if that value has already been stored in the repository.
(We'll see an example of this later on.)
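To make the idea of content-addressed storage concrete, here is a small sketch using only the JDK. It is not ModeShape's implementation: the ContentAddressedStore class is hypothetical, and unlike a real binary store it keeps the content in memory rather than streaming it to persistent storage. It does show the two properties described above: the key is computed from the content while the stream is read, and identical content is stored only once.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory store that keys content by its SHA-1; illustration only. */
final class ContentAddressedStore {
    private final Map<String, byte[]> contentByKey = new ConcurrentHashMap<>();

    /** Reads the stream once, hashing as it goes, and returns the hexadecimal SHA-1 key. */
    String store(InputStream stream) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            ByteArrayOutputStream copy = new ByteArrayOutputStream();
            try (DigestInputStream in = new DigestInputStream(stream, sha1)) {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    copy.write(buffer, 0, read);
                }
            }
            String key = toHex(sha1.digest());
            // Identical content always produces the same key, so it is kept only once.
            contentByKey.putIfAbsent(key, copy.toByteArray());
            return key;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    /** True if content with this hexadecimal SHA-1 has already been stored. */
    boolean contains(String hexSha1) {
        return contentByKey.containsKey(hexSha1);
    }

    /** The number of distinct contents held by the store. */
    int size() {
        return contentByKey.size();
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

A client that already knows the hash of a large value can call contains(...) first and skip the upload entirely when the content is already present.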
2.4.2 Extended Binary interface

The ModeShape public API defines the org.modeshape.jcr.api.Binary interface as a simple
extension to the standard javax.jcr.Binary interface. ModeShape's extension adds useful methods to
get the SHA-1 hash (as a byte array and as a hexadecimal string) and the MIME type for the content:
@Immutable
public interface Binary extends javax.jcr.Binary {
/**
* Get the SHA-1 hash of the contents. This hash can be used to determine whether two
* Binary instances contain the same content.
*
* Repeatedly calling this method should generally be efficient, as most implementations
* will compute the hash only once.
*
* @return the hash of the contents as a byte array, or an empty array if the hash could
* not be computed.
* @see #getHexHash()
*/
byte[] getHash();
/**
* Get the hexadecimal form of the SHA-1 hash of the contents. This hash can be used to
* determine whether two Binary instances contain the same content.
*
* Repeatedly calling this method should generally be efficient, as most implementations
* will compute the hash only once.
*
* @return the hexadecimal form of the getHash(), or a null string if the hash could
* not be computed or is not known
* @see #getHash()
*/
String getHexHash();
/**
* Get the MIME type for this binary value.
*
* @return the MIME type, or null if it cannot be determined (e.g., the Binary is empty)
* @throws IOException if there is a problem reading the binary content
* @throws RepositoryException if an error occurs.
*/
String getMimeType() throws IOException, RepositoryException;
/**
*
* @param name the name of the binary value, useful in helping to determine the MIME type
*/
String getMimeType( String name ) throws IOException, RepositoryException;
}
All javax.jcr.Binary values returned by ModeShape will implement this public interface, so feel free to
cast the values to gain access to the additional methods.
2.4.3 Importing and Exporting

When exporting content from a workspace with large Binary values, be sure to export using JCR's System
View format. Only the System View treats properties as child elements. This allows each large value to be
streamed (using buffered streams) into the XML element's content as a Base64-encoded string. Importing
can also take advantage of streaming.
Exporting content using JCR's Document View results in all properties being treated as XML attributes, and
various XML processing libraries treat large attributes poorly (e.g., using values that are in-memory String
objects). Another critical disadvantage of the Document View is that it is unable to represent multi-valued
properties, since attributes can have only one value.
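The reason element content can be streamed is that Base64 encoding works on a small buffer at a time. The sketch below shows that technique with the JDK's own encoder. It is only an illustration and not the exporter's code; the Base64Streaming class is a hypothetical name.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Base64;

/** Hypothetical helper showing Base64 encoding of a stream with bounded memory. */
final class Base64Streaming {

    /**
     * Copies the input to the output as Base64 text, holding at most one buffer
     * of the value in memory at a time. The output stream is closed when done.
     */
    static void encode(InputStream in, OutputStream out) {
        // The wrapping stream encodes on the fly; closing it writes the final padding.
        try (OutputStream encoder = Base64.getEncoder().wrap(out)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                encoder.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An XML attribute, by contrast, is usually handed to the application as a single String, which is why very large values are a poor fit for the Document View.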
2.4.4 Implementation design

This section describes the internal design of how ModeShape stores binary values, and is typically useful to
either understand the nuances of the various configuration choices or to implement custom binary stores.
None of the interfaces described in this section are part of the public API, and should never be
directly used by JCR client applications.
BinaryValue
In addition to the ModeShape-specific org.modeshape.jcr.api.Binary extension, ModeShape also
defines an org.modeshape.jcr.value.BinaryValue interface that adds several other features required
to properly persist and manage Binary values. These other features are part of ModeShape's internal
design and are therefore not appropriate for inclusion in the public API. Specifically, BinaryValue instances
are themselves immutable, they have an immutable BinaryKey that is a comparable representation of the
SHA-1 hash, they are comparable with each other (based upon their keys), they can be serialized, and the
getSize() method does not throw an exception like the standard method:
@Immutable
public interface BinaryValue extends Comparable<BinaryValue>, Serializable,
org.modeshape.jcr.api.Binary {
/**
* Get the length of this binary data.
*
* Note that this method, unlike the standard {@link javax.jcr.Binary#getSize()} method,
* does not throw an exception.
*
* @return the number of bytes in this binary data
*/
@Override
public long getSize();
/**
* Get the key for the binary value.
*
* @return the key; never null
*/
public BinaryKey getKey();
}
BinaryStore
The ModeShape-specific BinaryStore interface is thus defined to use the internal BinaryValue
interface:
@ThreadSafe
public interface BinaryStore {
/**
* Initialize the store and get ready for use.
*/
public void start();
/**
* Shuts down the store.
*/
public void shutdown();
/**
* Get the minimum number of bytes that a binary value must contain before it can
* be stored in the binary store.
* @return the minimum number of bytes for a stored binary value; never negative
*/
long getMinimumBinarySizeInBytes();
/**
* Set the minimum number of bytes that a binary value must contain before it can
* be stored in the binary store.
* @param minSizeInBytes the minimum number of bytes for a stored binary value; never
* negative
*/
void setMinimumBinarySizeInBytes( long minSizeInBytes );
/**
* Set the text extractor that can be used for extracting text from binary content.
* @param textExtractor the text extractor
*/
void setTextExtractor( TextExtractor textExtractor );
/**
* Set the MIME type detector that can be used for determining the MIME type for binary
* content.
* @param mimeTypeDetector the detector; never null
*/
void setMimeTypeDetector( MimeTypeDetector mimeTypeDetector );
/**
* Store the binary value and return the JCR representation. Note that if the binary
* content in the supplied stream is already persisted in the store, the store may
* simply return the binary value referencing the existing content.
*
* @param stream the stream containing the binary content to be stored; may not be null
* @return the binary value representing the stored binary value; never null
* @throws BinaryStoreException if there is a problem storing the content
*/
BinaryValue storeValue( InputStream stream ) throws BinaryStoreException;
/**
* Get an InputStream to the binary content with the supplied key.
*
* @param key the key to the binary content; never null
* @return the input stream through which the content can be read
* @throws BinaryStoreException if there is a problem reading the content from the store
*/
InputStream getInputStream( BinaryKey key ) throws BinaryStoreException;
/**
* Mark the supplied binary keys as unused, but keep them in quarantine until needed again
* (at which point they're removed from quarantine) or until
* removeValuesUnusedLongerThan(long, TimeUnit) is called. This method ignores any keys for
* values not stored within this store.
*
* Note that the implementation must *never* block.
*
* @param keys the keys for the binary values that are no longer needed
* @throws BinaryStoreException if there is a problem marking any of the supplied
* binary values as unused
*/
void markAsUnused( Iterable<BinaryKey> keys ) throws BinaryStoreException;
/**
* Remove binary values that have been unused for at least the specified amount of time.
*
* Note that the implementation must *never* block.
*
* @param minimumAge the minimum time that a binary value has been unused before it can be
* removed; must be non-negative
* @param unit the time unit for the minimum age; may not be null
* @throws BinaryStoreException if there is a problem removing the unused values
*/
void removeValuesUnusedLongerThan( long minimumAge,
TimeUnit unit ) throws BinaryStoreException;
/**
* Get the text that can be extracted from this binary content.
*
* @param binary the binary content; may not be null
* @return the extracted text, or null if none could be extracted
* @throws BinaryStoreException if the binary content could not be accessed
*/
String getText( BinaryValue binary ) throws BinaryStoreException;
/**
*
* @param binary the binary content; may not be null
* @param name the name of the content, useful for determining the MIME type;
* may be null if not known
*/
String getMimeType( BinaryValue binary,
String name ) throws IOException, RepositoryException;
/**
* Obtain an iterable implementation containing all of the store's binary keys. The
* resulting iterator may be lazy, in the sense that it may determine additional
* {@link BinaryKey}s only as the iterator is used.
*
* @return the iterable set of binary keys; never null
* @throws BinaryStoreException if anything unexpected happens.
*/
Iterable<BinaryKey> getAllBinaryKeys() throws BinaryStoreException;
}
Each BinaryStore implementation must provide a no-arg constructor, and its member fields can be
configured via the repository configuration. Note that the BinaryStore implementation must also
implement several setter methods, which the repository calls when the BinaryStore is initialized and may
call again at any time after that (for example, when the repository configuration changes).
Minimum binary size

When the BinaryStore is initialized, the repository will use the setMinimumBinarySizeInBytes(...)
method to specify the minimum size of the BinaryValues that must be persisted within the
BinaryStore. Any binary content smaller than this can be represented with InMemoryBinaryValue
instances (meaning it will be persisted with the property where it's used) or persisted in the BinaryStore.
Note that if the repository's configuration changes, the repository may set a different minimum size threshold.
Minimum string size

The repository can also use the BinaryStore to store large string values. Any strings larger than the
threshold set in the repository configuration will be stored in the BinaryStore and referenced in the node.
Note that there is nothing to configure in the BinaryStore itself.
MIME type detection

When the BinaryStore is initialized, the repository will use the setMimeTypeDetector(...) method to
give the BinaryStore a MimeTypeDetector instance it can use to determine the MIME type for any
binary content. The BinaryStore is free to determine the MIME type at any time, including when the binary
content is stored or only when the MIME type is needed (via the getMimeType(...) method). The
BinaryStore is also free to persist this information, since binary content for a given SHA-1 never changes.
Note that if the repository's configuration changes, the repository may set a different MIME type detector.
Text extraction
When the BinaryStore is initialized, the repository will use the setTextExtractor(...) method to
give the BinaryStore a TextExtractor instance it can use to extract the content's full-text search terms.
The BinaryStore is free to extract these terms at any time, including when the binary content is stored or
only when the terms are requested (via the getText(...) method). The BinaryStore is also free to
persist this information, since binary content for a given SHA-1 never changes. Note that if the repository's
configuration changes, the repository may set a different text extractor.
To extract content from a binary value, ModeShape relies on third-party libraries for the extraction
functionality. ModeShape comes with one built-in extractor which uses Apache Tika. See Built-in text
extractors to see how you can configure this extractor.
Garbage collection
There are a number of ways in which the BinaryStore may contain binary content (keyed by the SHA-1)
that is no longer used or referenced. The first is when a JCR client or the repository removes the last
Property containing the Binary. A second case is when a JCR client uses a Session to create a
javax.jcr.Binary value and clears the transient state (before the Session's transient state is saved).
Neither of these poses a problem, since the minimum requirement is that the BinaryStore contain at least
the content that is referenced in the repository content. However, all unused binary content in the
BinaryStore takes up storage space, so ModeShape defines a way for the repository and the
BinaryStore to recover that unused storage.
The repository periodically runs a multi-phase garbage collection process to identify those binaries that are
no longer referenced by repository content. When such binaries are discovered, the repository calls the
BinaryStore's markAsUnused(...) method. The BinaryStore then quarantines the binaries; if any
quarantined binaries are used again, the BinaryStore can remove them from quarantine. The repository
then periodically calls the BinaryStore's removeValuesUnusedLongerThan(...) method to purge all
binaries that have been quarantined for at least the specified period of time.
The quarantine approach means that when BinaryValues are removed, there is a period of time during which they
can be reused without the expensive removal and re-adding of the binary content.
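The quarantine bookkeeping described above can be modeled in a few lines. The sketch below is a hypothetical in-memory model and not a real BinaryStore: the QuarantineTracker class, its use of String keys instead of BinaryKey, the markAsUsed(...) method and the injectable clock are all assumptions made so that the example is self-contained.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical in-memory model of quarantine bookkeeping; not a real BinaryStore. */
final class QuarantineTracker {
    private final Map<String, Long> unusedSinceMillis = new ConcurrentHashMap<>();
    private final LongSupplier clockMillis;

    QuarantineTracker(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    /** Quarantine the keys, remembering when each one first became unused. */
    void markAsUnused(Iterable<String> keys) {
        long now = clockMillis.getAsLong();
        for (String key : keys) {
            unusedSinceMillis.putIfAbsent(key, now);
        }
    }

    /** The value is referenced again, so take it out of quarantine. */
    void markAsUsed(String key) {
        unusedSinceMillis.remove(key);
    }

    /** Purge every key quarantined for at least the given time; returns how many were purged. */
    int removeValuesUnusedLongerThan(long minimumAge, TimeUnit unit) {
        long cutoff = clockMillis.getAsLong() - unit.toMillis(minimumAge);
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = unusedSinceMillis.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() <= cutoff) {
                it.remove(); // a real store would also delete the stored content here
                removed++;
            }
        }
        return removed;
    }

    boolean isQuarantined(String key) {
        return unusedSinceMillis.containsKey(key);
    }
}
```

In production code the clock would simply be System::currentTimeMillis; passing it in makes the aging behavior easy to test.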
BinaryStore implementations
There are currently several implementations of BinaryStore:
org.modeshape.jcr.value.binary.FileSystemBinaryStore - Stores each binary in a file
on the file system, in a hierarchy of directories based upon the SHA-1 hash. The store uses
Java's native OS file locks to prevent other processes from concurrently writing the files, and it also
uses an internal set of locks to prevent multiple threads from simultaneously writing to the persisted
files. This store exposes buffered FileInputStream instances that directly access the underlying
files.
org.modeshape.jcr.value.binary.InfinispanBinaryStore - Stores binary values within
Infinispan, allowing the binary values to be chunked and distributed across the data grid (while the
binary metadata is replicated across the grid). This option works really well for clustered topologies,
since all processes in the cluster can access the same store. Two different caches are used: one for
binary value metadata and one for the chunked values. Because the metadata for each value is very
small (roughly 120 bytes), the metadata cache can be replicated, whereas the value cache can be
replicated or distributed. Added in 3.1.0.Final
org.modeshape.jcr.value.binary.MongodbBinaryStore - Stores binary values within a
MongoDB instance, where the binary values are chunked and stored inside the database. It uses
a local cache of binary values (backed by the file system store). Added in 3.1.0.Final
org.modeshape.jcr.value.binary.DatabaseBinaryStore - Stores binary values within a
JDBC database, where the binary values are stored as BLOBs in the underlying database. Added in
3.1.0.Final
org.modeshape.jcr.value.binary.CassandraBinaryStore - Stores binary values within a
Cassandra database, where the binary values are stored as BLOBs in the underlying database.
Added in 3.4.0.Final
org.modeshape.jcr.value.binary.TransientBinaryStore - A customization of the
FileSystemBinaryStore that uses the System's temporary directory (as defined by
java.io.tmpdir). Useful for testing or transient repositories only.
org.modeshape.jcr.value.binary.CompositeBinaryStore - A binary store which is able to
aggregate several binary stores of the type: file, infinispan, database or custom. Each nested binary
store must have a unique name, under which it is aggregated by the composite store. When creating
binary values, this name acts as a hint to binary value factory based on which a created value will go
in one store or another. To create binary values for this type of store, you must use the
org.modeshape.jcr.api.ValueFactory interface and the public Binary createBinary(
InputStream value, String hint ) method. Added in 3.3.0.Final
We would like to have other options, including storage in S3 and Hadoop. But it's also possible for
developers using ModeShape to write their own implementations.
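For anyone writing a file-based store of their own, a directory hierarchy derived from the hash might look like the following sketch. The two-level layout shown here is an assumption chosen for illustration and is not necessarily the layout that FileSystemBinaryStore uses; the HashedPaths class is hypothetical.

```java
import java.nio.file.Path;

/** Hypothetical helper that maps a SHA-1 key to a file inside nested directories. */
final class HashedPaths {

    /**
     * Maps a 40-character hexadecimal SHA-1 to root/ab/cd/abcd..., so that no
     * single directory has to hold every file. The layout is an assumption.
     */
    static Path pathFor(Path root, String hexSha1) {
        if (!hexSha1.matches("[0-9a-f]{40}")) {
            throw new IllegalArgumentException("Not a lowercase hexadecimal SHA-1: " + hexSha1);
        }
        return root.resolve(hexSha1.substring(0, 2))
                   .resolve(hexSha1.substring(2, 4))
                   .resolve(hexSha1);
    }
}
```

Because the path is a pure function of the key, checking whether a value is already stored is a simple file-existence test.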
2.4.5 Configuring Binary Stores

If no explicit binary store configuration is present, the TransientBinaryStore implementation will be
used by default. As explained above, this is not really suitable outside a testing context, as any binaries will
be lost between restarts.
To explicitly configure a Binary Store in the repository JSON configuration file, add a binaryStorage
section to the main storage section.
For example:
"storage" : {
    .........
    "binaryStorage" : {
        "type" : "file",
        "directory": "target/persistent_repository/binaries"
    }
}
will configure a FileSystemBinaryStore while
"storage" : {
    .........
    "binaryStorage" : {
        "type" : "database",
        "driverClass" : "org.h2.Driver",
        "url" : "jdbc:h2:mem:target/test/binary-store-db;DB_CLOSE_DELAY=-1",
        "username" : "sa"
    }
}
will configure a DatabaseBinaryStore.

The valid values for the type attribute are: file, database, transient, cache, composite and
custom.
Regardless of the type, all binary stores support the following attributes:
minimumBinarySizeInBytes - the minimum size (in bytes) above which binary values will be stored in
the store. Any smaller binary value will be stored together with the other node information.
minimumStringSize - the minimum length of a string above which all strings are stored in the
binary store (as an optimization).
Besides these, each binary store type has its own list of custom attributes it supports. For more information
about each possible value see the repository schema.
2.4.6 Files and Folders

A very simple way of adding binary content into a repository is uploading files and folders.
The JCR specification defines node types for folders and files (nt:folder and nt:file), which means that a
natural folder/file hierarchy would use a nt:folder/nt:file/jcr:content->jcr:data hierarchy.
ModeShape provides via the modeshape-jcr-api artifact a utility for creating such a hierarchy:
org.modeshape.jcr.api.JcrTools#uploadFile. See the implementation for more information.
2.5 Clustering
You can create a ModeShape repository that stands alone and is self-contained, or you can create a cluster
of ModeShape repositories that all work together to ensure all content is accessible to each of the
repositories.
When you create a ModeShape cluster, then a client talking to any of the processes in the cluster will see
exactly the same content and the same events. In fact, from a client perspective, there is no difference
between talking to a repository that is clustered versus not-clustered.
ModeShape can be clustered in a variety of ways, but the biggest decision will be where ModeShape is to
store all of its content. Much of this flexibility comes from the power and flexibility of Infinispan, which can
use a variety of topologies:
Local
Replicated
Invalidation
Distributed
Remote
How to
2.5.1 Local
In a local mode, ModeShape is not clustered at all. This is the default, so if you don't tell both ModeShape
and Infinispan to cluster, each process will happily operate without communicating or sharing any content.
Updates on one process will not be visible to any of the other processes.
Local topology
Note that in the local, non-clustered topology data must be persisted to disk or some other system.
Otherwise, if the ModeShape process terminates, all data will be lost.
2.5.2 Replicated
The simplest clustering topology is to replicate all content across every member of the cluster.
This means that each cluster member has its own storage for content, binaries, and indexes - nothing is
shared. However, ModeShape (and Infinispan) processes in the cluster communicate to ensure that locks
are acquired as necessary and that committed changes in each member are replicated to all other members
of the cluster.
Replicated cluster topology with shared storage

The advantage of this topology is that each member of the cluster has a complete set of content, so all reads
can be satisfied with locally-held data. Once a node is brought into memory and/or modified, then that
change is immediately propagated to the other nodes in the cluster. Should that same node be needed again
shortly thereafter, all processes have the latest change. This works great for small- to medium-sized
repositories, even when the available memory on each process is not large enough to hold all of the nodes
and binary values at one time. Additionally, because repositories share nothing (they all have their own
cache, own indexes, etc.), it is simple to add or remove cluster instances.
Most of the time ModeShape should be configured so that all of the members share a persistent store. As
long as that persistent store is transactional and capable of coordinating multiple concurrent operations (e.g.,
a relational database), then all of the data will be persisted in the store and the entire cluster can be
shut down with no data loss.
However, because all members of a replicated cluster have copies of all of the data, it is possible to not use
a persistent store. This will typically be faster, but it does mean that you must ensure that there will always
be at least one (probably several) members of the cluster running at all times. And, you have to be sure that
every process has enough memory to hold all of the nodes and binary values, so this is likely an option only
in a limited number of use cases.
Replicated cluster topology with no storage

Either of the replicated topologies works well for repositories with fairly large amounts of content, and with
relatively few members of the cluster. Typically replication is used when you want clustering for
fault-tolerance purposes, to handle larger workloads of clients, or when the hardware is not terribly powerful.
2.5.3 Invalidation
Using ModeShape with invalidation clustering is very similar to using replicated. The exception is that when
a node is changed or added on one process, a replicated cluster will send that updated or new node to all other
processes in the cluster, making it very efficient should that same node be needed shortly thereafter on any
of the processes in the cluster. However, some application scenarios will rarely need to access the same
node again, and the replication of the changes from one process to all other processes in the cluster is really
just unnecessary overhead. In these situations, invalidation mode may be much better, since changing a
node on one process will simply notify the other processes to evict any cached (and now out-of-date)
representation of that node from memory. Should that same node be needed on a process, it simply
reads that representation from persistent storage.
Invalidation cluster topology with shared storage
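The difference between the two modes is easiest to see in a toy model. The sketch below is not Infinispan code and ignores locking, transactions and networking entirely; the ToyCluster and Member classes are hypothetical. It only shows what happens to the caches of the other members when one member writes a node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy model contrasting replication with invalidation. */
final class ToyCluster {
    enum Mode { REPLICATION, INVALIDATION }

    /** One process: a private in-memory cache in front of the shared store. */
    final class Member {
        final Map<String, String> cache = new HashMap<>();

        void write(String key, String value) {
            sharedStore.put(key, value);
            cache.put(key, value);
            for (Member peer : members) {
                if (peer == this) continue;
                if (mode == Mode.REPLICATION) {
                    peer.cache.put(key, value); // ship the new value to every peer
                } else {
                    peer.cache.remove(key);     // only tell peers to evict a stale copy
                }
            }
        }

        String read(String key) {
            // On a cache miss, fall back to the shared persistent store.
            return cache.computeIfAbsent(key, sharedStore::get);
        }
    }

    private final Mode mode;
    private final Map<String, String> sharedStore = new HashMap<>();
    private final List<Member> members = new ArrayList<>();

    ToyCluster(Mode mode) {
        this.mode = mode;
    }

    /** Adds a new member to the cluster and returns it. */
    Member join() {
        Member member = new Member();
        members.add(member);
        return member;
    }
}
```

With invalidation, a write costs only a small eviction message per peer, and the price is paid later by whichever peer next reads the node. With replication, the write itself carries the cost of sending the full value everywhere.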
2.5.4 Distributed
With larger cluster sizes, however, it is not as efficient for every member in the cluster to have a complete
copy of all of the data. Additionally, the overhead of coordination of locks and inter-process communication
starts to grow. This is when the distributed cluster topology becomes very advantageous.
In a distributed cluster, each piece of data is owned/managed by more than two members but fewer than the
total size of the cluster. In other words, each bit of data is distributed across enough members so that no
data will be lost if members catastrophically fail. And because of this, you can choose to not use persistent
storage but to instead rely upon the multiple copies of the in-memory data, especially if the cluster is hosted
in multiple data centers (or sites). In fact, a distributed cluster can have a very large number of members.
Distributed cluster topology

In this topology, you will lose data if you lose or shut down more than n processes in the cluster, where n is
the number of duplicates/copies of each node that the cluster maintains. Generally, n is chosen based upon
the maximum number of processes you can lose at any one time. Remember that if you lose several, you
can still bring them back up or even start additional processes, and Infinispan will reshuffle the data amongst
the cluster to ensure there are again n copies of all nodes.
In this scenario, when a client requests some node or binary value, ModeShape (via Infinispan) looks to see
which member owns the node and forwards the request to that member. (Each ModeShape repository instance
maintains a cache of nodes, so subsequent reads of the same node will be very quick.)
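How a cluster decides which members own a given node can be illustrated with a deliberately simple scheme: hash the key to a starting position in the member list and take the next few members. This is only a sketch of the idea. Infinispan uses its own consistent hashing implementation, and the OwnerSelector class and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of choosing the members that hold copies of a key. */
final class OwnerSelector {

    /**
     * Picks the requested number of distinct owners for a key by hashing the key
     * to a starting position and walking the member list from there.
     */
    static List<String> ownersOf(String key, List<String> members, int copies) {
        if (copies < 1 || copies > members.size()) {
            throw new IllegalArgumentException("copies must be between 1 and the cluster size");
        }
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int start = Math.floorMod(key.hashCode(), members.size());
        List<String> owners = new ArrayList<>(copies);
        for (int i = 0; i < copies; i++) {
            owners.add(members.get((start + i) % members.size()));
        }
        return owners;
    }
}
```

Every member can run the same calculation and arrive at the same answer, which is what allows any process to forward a request directly to an owner without consulting a central directory.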
Of course, you can choose to use a shared persistent store with a distributed cache:
Distributed cluster topology with persistence

Here, your cluster is relying upon the shared persistent store to maintain persistence, while relying upon the
distributed nature of Infinispan to maintain all of the nodes in-memory somewhere on the cluster. Often, if a
process needs a node but does not have it in-memory, it can more quickly obtain that node from another
process that has it in-memory than it can read it from persistent storage.
2.5.5 Remote
The final topology is to cluster ModeShape as normal but to configure Infinispan to use a remote data grid.
The benefit here is that the data grid is a self-contained and separately-managed system, and all of the
specifics of the Infinispan configuration can be hidden by the data grid. Additionally, the data grid could itself
be replicated or distributed across one or multiple physical sites.
Cluster topology with remote (data grid) storage

Because of differences in the remote and local Infinispan interfaces, the only way to get this to work is to use
a local cache with a remote cache store.
2.5.6 How to
Read on to learn how to cluster an embedded repository or how to cluster a repository in EAP.
2.6 Configuration
There are three ways to configure ModeShape; which one applies depends on how you're deploying your application.
1. ModeShape and JBoss AS7 - When ModeShape is installed into JBoss AS7, it is configured using
the AS7 configuration and tools (e.g., the command line interface, or CLI). For more details, see
Configuring ModeShape in EAP.
2. ModeShape via JCA - When your application is to be deployed to an environment that supports the
Java Connector Architecture (JCA), then you can deploy ModeShape repositories via ModeShape's
JCA adapter (available in ModeShape 3.1 or later). See ModeShape's JCA Adapter for more details.
3. Embedded ModeShape - In all other cases, ModeShape runs within your application, whether your
application is a regular Java SE application or a web application deployed to a web container. This
means that your application needs to create a single ModeShape engine, deploy a JSON
configuration file for each of the repositories needed by your application, and shut down the engine
when your application is shut down. See ModeShape in Java applications for more details.
2.6.1 Infinispan Configuration

In all cases, you still need to configure Infinispan. There are a few things to keep in mind:
Minimally, the cache used by a repository needs to be transactional, since ModeShape internally
uses transactions and works with client-initiated or container-managed JTA transactions.
Applications that may be concurrently updating the same nodes should use Infinispan configured to
use pessimistic locking with the READ_COMMITTED isolation level. By default Infinispan will
use optimistic locking; this is more efficient for applications that don't update the same nodes, but
concurrently updating the same nodes with optimistic locking may very well cause some updates to
be lost. If you're not sure, use pessimistic locking.
The following are some sample Infinispan configuration snippets using a FileCacheStore:
Infinispan Embedded Pessimistic

<?xml version="1.0" encoding="UTF-8"?>
<infinispan
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:infinispan:config:5.2
                        http://www.infinispan.org/schemas/infinispan-config-5.2.xsd"
    xmlns="urn:infinispan:config:5.2">
    <global>
        <globalJmxStatistics enabled="false" allowDuplicateDomains="true"/>
    </global>
    <namedCache name="persistentRepository">
        <locking isolationLevel="READ_COMMITTED"/>
        <transaction
            transactionManagerLookupClass="org.infinispan.transaction.lookup.GenericTransactionManagerLookup"
            transactionMode="TRANSACTIONAL"
            lockingMode="PESSIMISTIC"/>
        <loaders
            passivation="false"
            shared="false"
            preload="false">
            <loader
                class="org.infinispan.loaders.file.FileCacheStore"
                fetchPersistentState="false"
                purgeOnStartup="false">
                <properties>
                    <property name="location" value="target/persistent_repository/store"/>
                </properties>
            </loader>
        </loaders>
    </namedCache>
</infinispan>
AS7/EAP will use REPEATABLE_READ as the default isolation level. If an application is
concurrently changing the same node, it must make sure to change this default to
READ_COMMITTED.
Infinispan EAP/AS7 Pessimistic Locking

<local-cache name="sample">
    <locking isolationLevel="READ_COMMITTED"/>
    <transaction mode="NON_XA" locking="PESSIMISTIC"/>
    <file-store passivation="false" path="modeshape/store/sample"
                relative-to="jboss.server.data.dir" purge="false"/>
</local-cache>
2.7 Content grid

2.8 Federation
Federation is available starting with ModeShape 3.1.
Usually a ModeShape repository owns all of its own data. It stores all of the information about all nodes
within an Infinispan cache, and the repository has a single binary store used to persist all BINARY values.
Sometimes it would be nice if a ModeShape repository could include some data that is actually owned and
managed by an external system. Clients could access internal data (owned by ModeShape) and external
data (owned by an external system) in exactly the same way, using the JCR API. ModeShape might cache
this external data (for performance reasons), but it would never store any of this external data.
The ability to access external and internal data in exactly the same way as if it were stored in one place is
what we call federation. This page describes how ModeShape federation works, the important concepts
used in federation, and how your applications can use federation.
Concepts and terminology
External system and sources
Connectors
Internal nodes
External nodes
Federated nodes
Projection
FederationManager
How it works
Navigation
Lookup by identifier
Caching
Querying
Paging connectors
Current connectors
2.8.1 Concepts and terminology

The following diagram shows a conventional repository that owns all of its data, including all nodes,
properties, and binary values. This is usually what people think of when they think of a database. By default,
ModeShape operates in this way, where all content is stored in a distributed Infinispan data grid (which may
use replication/distribution to maintain multiple copies of the data, and/or it might use cache stores to persist
the content on the file system, in a database, in the cloud, etc.) and where binaries can be stored separately
(in a variety of places, including Infinispan).
Conventional repository
In repositories like this, all of the nodes are treated the same way, since they're all owned by ModeShape.
However, if federation is enabled on the repository, then other kinds of nodes appear and other concepts
become important:
Federated repository
The next sections talk about these concepts.
External system and sources

An external system is a system outside of ModeShape that owns its own data and that ModeShape interacts
with to access (and optionally update) that data. The external system might be a data store, or it might be a
service that dynamically produces data. Examples of external systems are Oracle 11i, Cassandra,
MongoDB, Git, SVN, SAP, file systems, CMIS, RPM repositories, and JCR repositories.
Whereas an external system is a kind of software system, we use the term external source to describe an
addressable instance or installation of the external system. For example, external sources might include a
particular database instance, a particular Git repository, a particular file system on a specific machine, or a
particular instance of a CMIS repository.
In the diagram above, two external sources are shown and labeled "External Source A" and "External Source
B". (But the diagram does not define what kind of system they are.)
Connectors
A ModeShape connector is the software used to interact with a specific kind of external system. A connector
consists of compiled Java classes and resources, and is usually packaged as a JAR with dependencies on
3rd party libraries. ModeShape defines a connector SPI (or Service Provider Interface) which the connector
must implement. Generally connectors can read and update data in the external system, although a
connector implementation may support only read operations.
To be useful, however, a connector must be instantiated and that instance configured to talk to a specific
external source. Then that connector instance's job is to create a single, virtual tree of nodes that represents
the data in the external source. Note that the connector doesn't create the entire tree up front; instead, the
connector creates the nodes in that virtual tree only when ModeShape asks for them. Thus, the potential tree
of nodes for a given source might be massive, but only the nodes being used will be materialized.
The diagram of the federated repository shown above includes two connector instances, each of which is
configured to talk to one of the external sources.
Internal nodes
An internal node is any node within a ModeShape repository that is owned by ModeShape and stored within
the Infinispan cache. In a regular repository (without federation), all nodes are internal nodes.
Internal nodes are shown in the federated repository diagram above as the rust-colored and purple nodes.
External nodes
An external node is any node within a federated ModeShape repository that is not owned by ModeShape but
instead is dynamically generated to represent some portion of data in an external source. ModeShape
clients view internal and external nodes in exactly the same way, but internally ModeShape handles internal
and external nodes in very different ways.
External nodes are shown in the federated repository diagram above as the green, yellow, and blue colored
nodes.
Federated nodes
A federated node is simply an internal node that contains some children that are external nodes. In other
words, only federated nodes can have internal nodes and external nodes as children, whereas internal
nodes can only have other internal nodes as children and external nodes can only have other external nodes
as children.
Federated nodes are shown in the federated repository diagram above as the purple nodes.
Projection
A projection is a portion of the repository (really a subgraph) whose nodes are all external nodes that are
representations of some of the data in an external source. The nodes are dynamically generated (by the
connector's logic) as needed, and can optionally be cached for a configurable amount of time.
The federated repository diagram above shows three projections, labeled "Projection 1", "Projection 2", and
"Projection 3". Strictly speaking, projections do not have a name, so the labels are merely for discussion
purposes. Note how projections 1 and 2 both project external nodes from "External Source A", whereas
projection 3 only projects the external nodes from "External Source B". We often will talk about an external
source as having one or more projections; thus "External Source A" has two projections ("Projection 1" and
"Projection 2"), while "External Source B" has only one projection ("Projection 3").
Each projection maps a specific subtree of the virtual tree (created by a connector talking to an external
source) underneath a specific federated node. A simple path is used to identify the subtree of external
nodes, and a simple path is used to identify the federated node. ModeShape uses a projection expression
that is a string with these two paths:
<workspace-name>':' <path-to-federated-node> '=>' <path-in-external-source-of-node>
where
<workspace-name> is the name of the ModeShape workspace where the projection is to be placed
<path-to-federated-node> is a regular absolute path to the existing internal node under which
the external nodes are to appear
<path-in-external-source-of-node> is a regular absolute path in the virtual tree created by the
connector of the node whose children are to appear as children of the federated node.
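The format above can be illustrated with a small parser. This is only a sketch: the ProjectionExpression class is hypothetical and is not part of the ModeShape API, and the workspace name and paths used in main are made up. It assumes both paths are absolute (start with '/') and that neither contains the '=>' token.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the ModeShape API. */
final class ProjectionExpression {
    // <workspace-name> ':' <path-to-federated-node> '=>' <path-in-external-source-of-node>
    private static final Pattern FORMAT =
        Pattern.compile("^\\s*([^:]+?)\\s*:\\s*(/.*?)\\s*=>\\s*(/.*?)\\s*$");

    final String workspaceName;
    final String federatedNodePath;
    final String externalNodePath;

    private ProjectionExpression(String workspaceName, String federatedNodePath, String externalNodePath) {
        this.workspaceName = workspaceName;
        this.federatedNodePath = federatedNodePath;
        this.externalNodePath = externalNodePath;
    }

    /** Splits the expression into its three parts, or fails if it does not match the format. */
    static ProjectionExpression parse(String expression) {
        Matcher m = FORMAT.matcher(expression);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a projection expression: " + expression);
        }
        return new ProjectionExpression(m.group(1), m.group(2), m.group(3));
    }

    public static void main(String[] args) {
        // Workspace name and paths are made up for this example.
        ProjectionExpression p = parse("default:/files => /home/docs");
        System.out.println(p.workspaceName + " | " + p.federatedNodePath + " | " + p.externalNodePath);
    }
}
```

Only the first colon separates the workspace name from the path, so namespaced path segments such as jcr:content remain intact.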
Projections can be defined within a repository's configuration (making them available immediately upon
startup of the repository) or programmatically added or removed by client applications using the
FederationManager interface.
FederationManager
The ModeShape public API includes the org.modeshape.jcr.api.federation.FederationManager
interface that defines several methods for programmatically creating and removing projections. Note that at
this time it is not possible to programmatically create, modify, or remove external sources, so these must be
defined within the repository configuration.
2.8.2 How it works

Federation is intended to be completely transparent to clients. There is no apparent difference between
internal nodes, federated nodes, and external nodes. Some operations might not be permitted on external
nodes (e.g., if the connector is read-only), but that's also true of internal nodes (though the reason why the
operation is not permitted may be different).
Navigation
As clients navigate the nodes in the repository, they typically ask for one (or multiple) children of a particular
node. Clients repeat this process until they access the node(s) they're looking for.
ModeShape performs these operations differently depending upon the kind of node:
If the parent is an internal node, then all children will also be internal. Therefore, to find a particular
child by name, ModeShape obtains the parent's child reference to obtain the child's node key, and
then looks up the node with that key in the Infinispan cache. This is the "conventional" behavior, and
this incurs no overhead even when the repository is configured to use federation.
If the parent is a federated node, then the process is very similar to internal nodes, except that the
internal and external child references are managed separately. ModeShape then looks at the child's
node key to determine (from the key itself) if the child exists in the Infinispan cache or in an external
source. If in an external source, ModeShape then calls to the connector to ask for the representation
of the requested node.
If the parent is an external node, then ModeShape obtains the parent's child reference and looks up
the node with that key in the same connector. The connector then generates a representation of the
requested node.
Lookup by identifier
All nodes (both internal and external) can be accessed by Session.getNodeByIdentifier(String),
where the identifier is the same string returned by calling the getIdentifier() method on the node.
ModeShape can tell from the identifier whether it is for an external node, and if so it will look up the node in
the connector.
Per the JCR specification, clients should treat these identifiers as opaque. In fact, ModeShape
identifiers follow a fairly complex pattern that will likely be difficult to reverse engineer, and which
may change at any time.
Caching
ModeShape actually uses an in-memory LIRS cache of the nodes. So although the navigation and lookup
steps mentioned above don't discuss using the LIRS cache, ModeShape always consults this cache when it
needs to find a node with a particular node key. If found in the cache, the node will simply be used. If the
cache does not contain the node, then it will consult the Infinispan cache or the connector to obtain (and
cache) the node.
Normally, nodes in the LIRS cache are evicted after a certain (but configurable) time. However, external
nodes can have an additional internal property that specifies the maximum time that the node can be in the
cache. Or, an external source can be configured with a global time to live value. Either way, the LIRS cache
ensures that the nodes are evicted at the appropriate time.
Of course, a node is also evicted from the cache if the node has been changed and persisted (e.g., via
Session.save() or user transaction commit), even if that change was made on a different process in the
cluster.
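The lookup order described above (in-memory cache first, then the Infinispan cache or connector) can be sketched as follows. This is not ModeShape's implementation: the NodeCacheSketch class is hypothetical, the backing store is represented by a plain function, and for brevity the sketch uses a simple LRU policy built on LinkedHashMap rather than LIRS, with no time-based eviction.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of a cache-first node lookup; uses LRU, not LIRS. */
final class NodeCacheSketch<K, V> {
    private final Map<K, V> cache;
    private final Function<K, V> backingStore; // stands in for Infinispan or a connector
    int loads = 0;                             // how often the backing store was consulted

    NodeCacheSketch(final int maxEntries, Function<K, V> backingStore) {
        this.backingStore = backingStore;
        // An access-ordered LinkedHashMap drops the least recently used entry when full.
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached node if present; otherwise loads it and caches the result. */
    V get(K key) {
        V node = cache.get(key);
        if (node == null) {
            node = backingStore.apply(key);
            loads++;
            if (node != null) cache.put(key, node);
        }
        return node;
    }

    /** Called when a node has been changed and persisted, so the next read reloads it. */
    void evict(K key) {
        cache.remove(key);
    }
}
```

A second get for the same key is served from memory, while a get that follows evict goes back to the backing store.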
Querying
A connector decides which external nodes are to be indexed.
The connector instance can be configured with a "queryable" boolean parameter that states
whether any of the content is to be queryable. This defaults to true.
The connector can mark any or all nodes as not queryable.
Thus, even though a connector implementation may be written such that some or all of the external nodes
can be queried, a repository configuration can configure an instance of that connector and override the
behavior so that no nodes are queryable.
If a connector is implemented by marking all nodes as not queryable, then configuring an instance
of that connector with queryable=true has no effect.
Any nodes that are queryable will be included in the index, as long as ModeShape is notified of new nodes.
By default, external nodes are not automatically indexed, so to index them simply use ModeShape's API for
reindexing.
Once indexed, the nodes can be queried just like any other nodes.
Paging connectors
A connector works by creating a node representation of external data, and that node contains the references
to the node's children. These references are relatively small (just the ID and name of the child), and for many
connectors this is sufficient and fast enough. However, when the number of children under a node starts to
increase, building the list of child references for a parent node can become noticeable and even
burdensome, especially when few (if any) of the child references may ultimately be resolved into nodes
because no client actually uses those references.
A pageable connector is one that wants to expose the children of nodes in a "page by page" fashion, where
the parent node only contains the first page of child references and subsequent pages are loaded only if
needed. This turns out to be quite effective, since when clients navigate a specific path (or ask for a specific
child of a parent by its name) ModeShape doesn't need to use the child references in a node's document
and can instead simply have the connector resolve such (relative or absolute external) paths into an
identifier and then ask for the document with that ID.
Therefore, the only time the child references are needed is when clients iterate over the children of a node.
A pageable connector will only be asked for as many pages as needed to handle the client's iteration,
making it very efficient for exposing a node structure that can contain nodes with numerous children.
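The page-by-page behavior can be sketched with a lazy iterator. The PagedChildren class and its PageLoader interface are hypothetical stand-ins, not the ModeShape connector SPI; the sketch only shows that a page is requested when the iteration actually reaches it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical illustration of iterating child references one page at a time. */
final class PagedChildren implements Iterator<String> {
    /** Stand-in for a pageable connector: returns one page of child names, empty past the end. */
    interface PageLoader {
        List<String> loadPage(int pageIndex);
    }

    private final PageLoader loader;
    private Iterator<String> current = Collections.emptyIterator();
    private int nextPage = 0;
    private boolean exhausted = false;
    int pagesLoaded = 0; // number of pages requested so far

    PagedChildren(PageLoader loader) {
        this.loader = loader;
    }

    @Override
    public boolean hasNext() {
        // Request the next page only when the current one has been used up.
        while (!current.hasNext() && !exhausted) {
            List<String> page = loader.loadPage(nextPage++);
            pagesLoaded++;
            if (page.isEmpty()) exhausted = true;
            else current = page.iterator();
        }
        return current.hasNext();
    }

    @Override
    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        return current.next();
    }

    /** Builds a loader over a fixed list, split into pages of the given size. */
    static PageLoader fixedPages(final List<String> all, final int pageSize) {
        return pageIndex -> {
            int from = pageIndex * pageSize;
            if (from >= all.size()) return new ArrayList<String>();
            return all.subList(from, Math.min(from + pageSize, all.size()));
        };
    }
}
```

A caller that stops after the first few children never triggers the loading of later pages.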
2.8.3 Current connectors

ModeShape currently provides several connectors out-of-the-box, including:
2.9 Initial Content

To help set up a simple, out-of-the-box repository pre-populated with some content, ModeShape provides a
way to configure such content using a simple XML format. This content can be imported either in a specific
workspace, or imported by default in all predefined or new workspaces.
Initial content is imported only the first time a repository starts up into the predefined workspaces or
when a new workspace is created, if that workspace was configured as such.
The initial content feature is intended to allow the import of a simple structure and is not intended
for large volumes of data or complex data structures. There are other, more powerful mechanisms
like backup & restore or JCR XML import/export that may be better suited to those cases.
2.9.1 XML Format

Each initial content XML must define a single root node called jcr:root under the namespace
http://www.jcp.org/jcr/1.0. This represents the root node of a workspace and all content is
imported below it.
Example
<jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0">
    <folder jcr:mixinTypes="mix:created, mix:lastModified" jcr:primaryType="nt:folder">
        <file1 jcr:primaryType="nt:file">
            <jcr:content/>
        </file1>
        <file2 jcr:primaryType="nt:file">
            <jcr:content/>
        </file2>
    </folder>
</jcr:root>
By default, each node has the same name as the XML element that defines it, and its properties are the
attributes of that element. Besides any number of custom properties, the JCR properties jcr:name,
jcr:primaryType and jcr:mixinTypes are supported, allowing a node to have a custom name, type
and/or mixins. If not specified, the node type of the created node defaults to nt:unstructured.

<jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0">
    <Cars>
        <Hybrid>
            <car jcr:name="Toyota Prius" maker="Toyota" model="Prius"/>
            <car jcr:name="Toyota Highlander" maker="Toyota" model="Highlander"/>
            <car jcr:name="Nissan Altima" maker="Nissan" model="Altima"/>
        </Hybrid>
        <Sports>
            <car jcr:name="Aston Martin DB9" maker="Aston Martin" model="DB9"/>
            <car jcr:name="Infiniti G37" maker="Infiniti" model="G37"/>
        </Sports>
    </Cars>
</jcr:root>
It is also possible to override the name of the nodes by defining the jcr:name attribute, which will then be
used instead of the XML element's name.
2.9.2 Configuring Initial Content

The configuration necessary for a repository to make use of the initial content is the following:
{
    "name" : "Repository with initial content",
    "storage" : {
        "transactionManagerLookup" : "org.infinispan.transaction.lookup.DummyTransactionManagerLookup"
    },
    "workspaces" : {
        "predefined" : ["ws1", "ws2"],
        "default" : "default",
        "allowCreation" : true,
        "initialContent" : {
            "ws1" : "xmlImport/docWithMixins.xml",
            "ws2" : "xmlImport/docWithCustomType.xml",
            "default" : "xmlImport/docWithoutNamespaces.xml",
            "ws4" : "",
            "ws5" : "xmlImport/docWithCustomType.xml",
            "*" : "xmlImport/docWithMixins.xml"
        }
    }
}
An initialContent object must be defined inside the workspaces object, with the following content:
each attribute name inside the initialContent object, with the exception of the * string, is
treated as the name of a workspace and takes precedence over anything else. This includes entries
whose value is the empty string, which can be used to explicitly configure a workspace without any
initial content when a default is defined (see below)
the * character is interpreted as "default content", which means that any predefined or newly created
workspaces that aren't configured explicitly will make use of this content
the value of each attribute must be a simple string (including the empty string) which represents the
URL of an XML file located in the runtime classpath
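The precedence rules above can be summarized in a few lines of code. This is a sketch only: InitialContentResolver is a hypothetical class, not how ModeShape reads its configuration, and "ws9" is a made-up workspace name standing for any workspace that has no explicit entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of the initialContent precedence rules. */
final class InitialContentResolver {
    /** Returns the content file for the workspace, or null if nothing should be imported. */
    static String resolve(Map<String, String> initialContent, String workspaceName) {
        String file = initialContent.containsKey(workspaceName)
            ? initialContent.get(workspaceName) // an explicit entry wins, even when it is empty
            : initialContent.get("*");          // otherwise fall back to the default, if any
        return (file == null || file.isEmpty()) ? null : file;
    }

    /** The initialContent object from the sample configuration above. */
    static Map<String, String> sample() {
        Map<String, String> content = new LinkedHashMap<>();
        content.put("ws1", "xmlImport/docWithMixins.xml");
        content.put("ws2", "xmlImport/docWithCustomType.xml");
        content.put("default", "xmlImport/docWithoutNamespaces.xml");
        content.put("ws4", "");
        content.put("ws5", "xmlImport/docWithCustomType.xml");
        content.put("*", "xmlImport/docWithMixins.xml");
        return content;
    }

    public static void main(String[] args) {
        Map<String, String> content = sample();
        System.out.println("ws2 -> " + resolve(content, "ws2")); // explicit entry
        System.out.println("ws4 -> " + resolve(content, "ws4")); // explicitly no content
        System.out.println("ws9 -> " + resolve(content, "ws9")); // falls back to "*"
    }
}
```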
2.10 Large numbers of child nodes

ModeShape 3 has been designed to efficiently handle a single node having a very large number (>100K) of
child nodes. It does this by segmenting the parent's list of child references into multiple blocks, where each
block is small enough to deal with.
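As a rough illustration of segmenting, the sketch below splits a list of child references into blocks of at most a target size. The ChildBlocks class is hypothetical, and the even split is an assumption made for clarity; how ModeShape actually chooses and rebalances its blocks is not described here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of splitting child references into fixed-size blocks. */
final class ChildBlocks {
    /** Splits the children into consecutive blocks holding at most 'target' entries each. */
    static <T> List<List<T>> segment(List<T> children, int target) {
        if (target <= 0) throw new IllegalArgumentException("target must be positive");
        List<List<T>> blocks = new ArrayList<>();
        for (int from = 0; from < children.size(); from += target) {
            int to = Math.min(from + target, children.size());
            blocks.add(new ArrayList<>(children.subList(from, to)));
        }
        return blocks;
    }

    public static void main(String[] args) {
        List<Integer> children = new ArrayList<>();
        for (int i = 0; i < 2500; i++) children.add(i);
        // With a target of 1000, 2500 children end up in blocks of 1000, 1000 and 500.
        for (List<Integer> block : segment(children, 1000)) {
            System.out.println(block.size());
        }
    }
}
```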
ModeShape actually performs this optimization in the background rather than do it during the Session's
save() operation. As a consequence, the actual number of child references stored in any block might vary
significantly from the "optimal" value. And while ModeShape is capable of transparently handling any size
blocks, performance when dealing with very large numbers of child nodes will improve when the block
sizes are optimized.
The segmenting function is not enabled by default for a repository, meaning that ModeShape will store all
of a parent's child references in a single block. To enable it, you need to add it to the repository configuration:
JSON
"storage" : {
    "documentOptimization" : {
        "childCountTarget" : 1000,
        "childCountTolerance" : 10,
        "threadPool" : "modeshape-opt",
        "initialTime" : "00:00",
        "intervalInHours" : 24
    }
}
or:
EAP Configuration
<repository name="sample"
            document-optimization-child-count-target="1000"
            document-optimization-child-count-tolerance="10"
            document-optimization-initial-time="02:00"
            document-optimization-interval="24"
            document-optimization-thread-pool="modeshape-opt"
/>
where the first 2 attributes control the desired number of children per segment and the variance tolerance,
while the last 3 control the details of the thread-pool that spawns the actual threads performing the
optimization process.
ModeShape actually performs really well while using a single block for storing child references,
even for moderate numbers of children (~10K).
2.10.1 Accessing by path

Navigating to a node by using its path is perhaps one of the most common access patterns in JCR. This
uses the 'Node.getNode(String)' method that takes a relative path, and essentially boils down to finding
a particular child node with the supplied name and same-name-sibling index. ModeShape internally indexes
the children in each block by name, so finding nodes by name (and SNS index) is as fast as possible, even
if multiple blocks need to be accessed.
2.10.2 Iterating
Another common access pattern is to iterate over some or all of a parent node's children, using the
'Node.getNodes()' and 'Node.getNodes(String)' methods. The resulting NodeIterator will
transparently access the children in one block at a time, and will continue with all blocks until the last child
reference is found or until the caller halts the iteration.
2.10.3 Accessing by identifier

Another common access pattern is to find a node by identifier, using the
'Session.getNodeByIdentifier(String)' method. ModeShape handles this request by directly finding
the node by its identifier, and needs to access the parent's (or ancestors') child references only when
the node's name or path is requested by the caller (via the 'Node.getName()' or 'Node.getPath()'
methods).
2.10.4 Additional performance considerations

See
http://modeshape.wordpress.com/2014/08/14/improving-performance-with-large-numbers-of-child-nodes
2.11 MIME types

2.12 Monitoring
ModeShape has a lot of moving parts, and with 3.7.2 we've made it much easier for your application to
monitor the repository to understand how it's being used and what work is going on in the background.
Public API
Metrics
Windows and statistics
Histories
RepositoryMonitor
Examples
Active sessions during the last hour
Query durations during the last day
Worst performing queries during the last day
Event queue backlog during the last hour
2.12.1 Public API

ModeShape now includes as part of its public API a set of interfaces that your application can use to monitor
the activities and health of your ModeShape repository. We did this because the standard JCR API doesn't
cover monitoring at all, and we thought it's useful enough to make it available.
Metrics
ModeShape can capture a number of different measurements, called metrics, and these are broken into two
categories: duration-based metrics (how long something takes) and simple value metrics.
Duration metrics are represented by the org.modeshape.jcr.api.monitor.DurationMetric
enumeration, and include:
Metric                 Description
Query execution time   The amount of time required to execute a query.
Session duration       The length of time that a session is used before being closed.
Sequencer duration     The length of time required to sequence a node, produce the output, and save the changes to the workspace.
Value metrics are represented by the org.modeshape.jcr.api.monitor.ValueMetric enumeration,
and include:
Metric                 Style         Description
Active sessions        continuous    The number of active sessions.
Active queries         continuous    The number of active queries.
Workspace count        continuous    The number of workspaces.
Session-scoped locks   continuous    The number of session-scoped locks held by clients.
Open-scoped locks      continuous    The number of open-scoped locks held by clients.
Listener count         continuous    The number of listeners registered with active sessions.
Event queue size       continuous    The number of events that are enqueued for processing and sending to listeners.
Event count            incremental   The number of events that have been sent to at least one listener.
Changed nodes          incremental   The number of nodes that were created, updated, or deleted.
Session saves          incremental   The number of Session.save() calls.
Sequencer queue size   continuous    The number of sequencing operations that are enqueued.
Sequenced nodes        incremental   The number of nodes sequenced.
Values for each of these metrics are captured every 5 seconds, where the continuous metrics are simply
recorded as is (the values continue from one measurement to the next), while the incremental metrics
represent distinct perturbations (or increments) from 0.
Windows and statistics

As mentioned above, ModeShape measures the values for each metric every 5 seconds. But it would take
vast amounts of space to keep all these measurements around for long periods of time. Instead, ModeShape
calculates the statistics for various intervals, and then rolls up the statistics into different time windows.
The statistics are straightforward:
Statistic            Data type   Description
Count                int         The number of samples.
Maximum              long        The maximum value from the samples.
Minimum              long        The minimum value from the samples.
Mean                 double      The mean (or average) value from the samples.
Variance             double      The average of the squared differences from the mean.
Standard Deviation   double      A measure of how spread out the samples are and is the square root of the variance
and are represented in the API by the org.modeshape.jcr.api.monitor.Statistics interface (with
getter methods for each statistic). The statistics were chosen because multiple Statistics objects can
easily be rolled-up into a single Statistics object.
The rollup process is pretty simple. For each metric:
The value is captured every 5 seconds, and recorded as a Statistics instance with a single sample.
This is repeated 9 more times.
After 60 seconds, the 10 Statistics objects recorded in the previous step are rolled-up into a
single Statistics object for this minute. This is repeated 59 more times.
After 60 minutes, the 60 Statistics objects recorded in the previous step are rolled-up into a single
Statistics object for this hour. This is repeated 23 more times.
After 24 hours, the 24 Statistics objects recorded in the previous step are rolled-up into a single
Statistics object for the day. This is repeated 6 more times.
After 7 days, the 7 Statistics objects recorded in the previous step are rolled-up into a single
Statistics object for the week. This is repeated 51 more times.
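To show why these particular statistics roll up easily, the following sketch combines several summaries into one without access to the original samples. StatsSketch is a hypothetical class, not the ModeShape Statistics interface, and the combination formula (a count-weighted mean plus the standard pooled population variance) is an assumption about one reasonable way to do it rather than a description of ModeShape's internals.

```java
/** Hypothetical illustration of rolling up summaries without the raw samples. */
final class StatsSketch {
    final int count;
    final long min;
    final long max;
    final double mean;
    final double variance; // population variance: average squared difference from the mean

    StatsSketch(int count, long min, long max, double mean, double variance) {
        this.count = count;
        this.min = min;
        this.max = max;
        this.mean = mean;
        this.variance = variance;
    }

    /** A summary of a single captured value. */
    static StatsSketch ofSample(long value) {
        return new StatsSketch(1, value, value, value, 0.0);
    }

    double standardDeviation() {
        return Math.sqrt(variance);
    }

    /** Combines the parts into one summary; null or empty parts are ignored. */
    static StatsSketch rollUp(StatsSketch... parts) {
        int n = 0;
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        double sum = 0.0;
        for (StatsSketch p : parts) {
            if (p == null || p.count == 0) continue;
            n += p.count;
            min = Math.min(min, p.min);
            max = Math.max(max, p.max);
            sum += p.mean * p.count;
        }
        if (n == 0) return new StatsSketch(0, 0, 0, 0.0, 0.0);
        double mean = sum / n;
        // Each part contributes its own spread plus the offset of its mean from the overall mean.
        double squares = 0.0;
        for (StatsSketch p : parts) {
            if (p == null || p.count == 0) continue;
            double delta = p.mean - mean;
            squares += p.count * (p.variance + delta * delta);
        }
        return new StatsSketch(n, min, max, mean, squares / n);
    }
}
```

Rolling up in stages gives the same result as rolling up all samples at once, which is what allows the minute, hour, day, and week summaries to be built from one another.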
Each of these periods represents a window in time during which the Statistics are captured:
Window timeframe   Description
60 seconds         A Statistics for each of the ten 5-second intervals during the last minute.
60 minutes         A Statistics for each minute during the last hour.
24 hours           A Statistics for each hour during the last day.
7 days             A Statistics for each day during the last week.
52 weeks           A Statistics for each week during the last year.
The org.modeshape.jcr.api.monitor.Window enumeration is used to represent each of these
windows in time.
Histories
The set of Statistics objects for a particular metric during a Window is called the history of the metric,
which is represented by the org.modeshape.jcr.api.monitor.History interface:
public interface History {

    /**
     * Get the kind of window.
     *
     * @return the window type; never null
     */
    public Window getWindow();

    /**
     * Get the total duration of this history window.
     *
     * @param unit the desired time unit; if null, then {@link TimeUnit#SECONDS} is used
     * @return the duration
     */
    public long getTotalDuration( TimeUnit unit );

    /**
     * Get the timestamp (including time zone information) at which this history window starts.
     *
     * @return the time at which this window starts
     */
    public DateTime getStartTime();

    /**
     * Get the timestamp (including time zone information) at which this history window ends.
     *
     * @return the time at which this window ends
     */
    public DateTime getEndTime();

    /**
     * Get the statistics that make up the history.
     *
     * @return the statistics; never null, but the array may contain null if the window is
     *         longer than the lifetime of the repository
     */
    Statistics[] getStats();
}
The org.modeshape.jcr.api.value.DateTime interface is an immutable representation of
an instant in time. It includes timezone information and methods for converting or obtaining the
various representations and/or parts of the instant. It is based upon initial work by the JSR-310
effort, and is far superior to the mutable and difficult-to-use java.util.Calendar class.
RepositoryMonitor
The org.modeshape.jcr.api.monitor.RepositoryMonitor interface can then be used to get the
available metrics and windows, as well as obtaining the history for a given metric and window:
public interface RepositoryMonitor {

/**
* Get the ValueMetric enumerations that are available for use by the caller
* with {{getHistory(ValueMetric, Window)}}.
*
* @return the immutable set of ValueMetric instances; never null but possibly
*
empty if the caller has no permissions to see any value metrics
*/
Set<ValueMetric> getAvailableValueMetrics();
/**
* Get the DurationMetric enumerations that are available for use by the caller
* with {{getHistory(DurationMetric, Window)}}.
*
* @return the immutable set of DurationMetric instances; never null but possibly
*
*/
Set<DurationMetric> getAvailableDurationMetrics();
/**
* Get the Window enumerations that are available for use by the caller with
* {{getHistory(DurationMetric, Window)}} and {{getHistory(ValueMetric, Window)}}.
*
* @return the immutable set of DurationMetric instances; never null but possibly
*
*/
Set<Window> getAvailableWindows();
/**
* Get the statics for the specified value metric during the given window in time.
* The oldest statistics will be first, while the newest statistics will be last.
*
* @param metric the value metric; may not be null
* @param windowInTime the window specifying which statistics are to be returned;
*
may not be null
* @return the history of the metrics; never null but possibly empty if there are
*
no statistics being captures for this repository
* @throws AccessDeniedException if the session does not have privileges to monitor the
repository
* @throws RepositoryException if there is an error obtaining the history
*/
public History getHistory( ValueMetric metric,
Window windowInTime )
throws AccessDeniedException, RepositoryException;
/**
* Get the statics for the specified duration metric during the given window in time.
* The oldest statistics will be first, while the newest statistics will be last.
*
Page 122 of 424
ModeShape 3
* @param metric the duration metric; may not be null
* @param windowInTime the window specifying which statistics are to be returned;
*
may not be null
* @return the history of the metrics; never null but possibly empty if there are
*
no statistics being captures for this repository
repository
*/
public History getHistory( DurationMetric metric,
Window windowInTime )
/**
* Get the longest-running activities recorded for the specified metric.
* The results contain the duration records in order of increasing duration,
* with the activity with the longest duration appearing last in the array.
*
* @param metric the duration metric; may not be null
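* @return the activities with the longest durations; never null but possibly
*         empty if no such activities were performed
* @throws AccessDeniedException if the session does not have privileges to monitor the
*         repository
* @throws RepositoryException if there is an error obtaining the history
*/
public DurationActivity[] getLongestRunning( DurationMetric metric )
    throws AccessDeniedException, RepositoryException;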
And finally, your application can get the RepositoryMonitor instance from the Session's workspace,
using ModeShape's org.modeshape.jcr.api.Workspace interface that extends the standard
javax.jcr.Workspace interface:

org.modeshape.jcr.api.Workspace workspace =
(org.modeshape.jcr.api.Workspace)session.getWorkspace();
RepositoryMonitor monitor = workspace.getRepositoryManager().getRepositoryMonitor();
2.12.2 Examples
The following examples show a few ways of accessing the history of different metrics using different
windows.
Active sessions during the last hour

This example shows how to get the history containing the number of active sessions during each minute of
the last hour:

History history = monitor.getHistory(ValueMetric.SESSION_COUNT,Window.PREVIOUS_60_MINUTES);
// Use the history information to build a graph and determine the axes labels ...
int duration = history.getTotalDuration(TimeUnit.MINUTES); // will be '60'
DateTime started = history.getStartTime();
DateTime ended = history.getEndTime();
Statistics[] stats = history.getStats(); // will contain 60 elements
Here, each Statistics object represents the number of active sessions that existed during each minute.
If, for example, all the sessions were closed in the second-to-last minute, then the second-to-last
Statistics object will reflect some of them closing, while the last Statistics object will have average,
maximum, and minimum values of 0.
Query durations during the last day

This example shows how to obtain the statistics for the durations of queries executed during the last 24
hours:

History history =
monitor.getHistory(DurationMetric.QUERY_EXECUTION_TIME,Window.PREVIOUS_24_HOURS);
int duration = history.getTotalDuration(TimeUnit.MINUTES); // will be '1440' (or 24 x 60 )
Each Statistics object will represent the number, average, maximum, minimum, variance, and standard
deviation for the queries that were executed during an hour of the last 24 hours.
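As a rough illustration of what those six values mean for a single window, the following self-contained sketch computes them from a handful of query durations. The WindowStats class and its field names are hypothetical and are not part of ModeShape's API. It also assumes a population variance, which may differ from how ModeShape computes its statistics.

```java
/** Hypothetical stand-in for the values summarized by one Statistics object. */
final class WindowStats {
    final int count;
    final double mean;
    final double min;
    final double max;
    final double variance;
    final double standardDeviation;

    private WindowStats(int count, double mean, double min, double max, double variance) {
        this.count = count;
        this.mean = mean;
        this.min = min;
        this.max = max;
        this.variance = variance;
        this.standardDeviation = Math.sqrt(variance);
    }

    /** Summarize the durations (in milliseconds) recorded during a single window. */
    static WindowStats of(long... durationsMillis) {
        int n = durationsMillis.length;
        if (n == 0) {
            return new WindowStats(0, 0.0, 0.0, 0.0, 0.0); // nothing was recorded in this window
        }
        double sum = 0.0;
        double min = durationsMillis[0];
        double max = durationsMillis[0];
        for (long d : durationsMillis) {
            sum += d;
            min = Math.min(min, d);
            max = Math.max(max, d);
        }
        double mean = sum / n;
        double squaredError = 0.0;
        for (long d : durationsMillis) {
            squaredError += (d - mean) * (d - mean);
        }
        // Assumption: population variance (divide by n, not n - 1).
        return new WindowStats(n, mean, min, max, squaredError / n);
    }
}
```

For the 24-hour window above, one such summary would exist for each hour, ordered from oldest to newest.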
Worst performing queries during the last day

Just as we can obtain the statistics for the queries that were submitted during the last 24 hours, we can also
get information about the longest-running queries:

// Get the 'DurationActivity' object for each long-running query, where the longest is last ...
DurationActivity[] longestQueries =
monitor.getLongestRunning(DurationMetric.QUERY_EXECUTION_TIME);
for ( DurationActivity queryActivity : longestQueries ) {
long duration = queryActivity.getDuration(TimeUnit.MILLISECONDS);
Map<String,String> payload = queryActivity.getPayload();
String query = payload.get("query");
}
Event queue backlog during the last hour

This example shows how to get the history containing the number of events in the event queue during each
minute of the last hour:

History history = monitor.getHistory(ValueMetric.EVENT_QUEUE_SIZE,Window.PREVIOUS_60_MINUTES);
int duration = history.getTotalDuration(TimeUnit.MINUTES); // will be '60'
Here, each Statistics object represents the number of events that are in the queue during each minute.
If, for example, the number of events is increasing during each minute, then ModeShape is falling behind
in notifying the listeners. This likely will happen when sessions are making frequent changes, while
registered listeners are taking too long to process the event.
Listeners should not take too long to process an event, since a single thread is used to notify all
listeners. If your listeners might take a long time, consider having them enqueue the work into a separate
java.util.concurrent.Executor, so that the actual work is performed on separate threads.
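A minimal sketch of that hand-off pattern is shown here, using only the JDK. The OffloadingListener class and its onEvent(String) method are hypothetical: a real listener would implement javax.jcr.observation.EventListener and receive an EventIterator, but the idea of returning quickly from the notification thread is the same.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical listener shape; not the javax.jcr.observation.EventListener interface. */
final class OffloadingListener {
    private final ExecutorService workers;
    private final AtomicInteger processed = new AtomicInteger();

    OffloadingListener(ExecutorService workers) {
        this.workers = workers;
    }

    /** Runs on the single notification thread, so it only enqueues the work and returns. */
    void onEvent(final String changedPath) {
        workers.execute(() -> process(changedPath));
    }

    /** Runs on a worker thread; stands in for slow processing of the change. */
    private void process(String changedPath) {
        processed.incrementAndGet();
    }

    int processedCount() {
        return processed.get();
    }

    /** Delivers the given number of events, waits for the workers, and returns how many were handled. */
    static int demo(int events) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        OffloadingListener listener = new OffloadingListener(pool);
        for (int i = 0; i < events; i++) {
            listener.onEvent("/node" + i);
        }
        pool.shutdown(); // previously submitted tasks still run to completion
        try {
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return listener.processedCount();
    }
}
```

A fixed-size pool is used only to keep the sketch short; the right executor depends on how much work each event triggers.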
Also, be careful if the listener needs to look up content using a session. Generally speaking, it's not
good practice for a listener to reuse the same session on which it's registered, since all listeners
will share the same session. ModeShape 3.x is thread-safe, but any changes made by one listener
will be visible to other listeners.
2.13 Public API

2.14 Query and search
The JCR API defines a way to query a repository for content that meets user-defined criteria. The JCR 2.0
API actually makes it possible for implementations to support multiple query languages, and the specification
requires support for two languages: JCR-SQL2 and JCR-JQOM. JCR 1.0 defined two other languages (XPath
and JCR-SQL), though these languages were deprecated in JCR 2.0.
Choosing a query language
Creating queries
Executing queries
JCR-SQL and JCR-SQL2 extensions
Query Object Model extensions
Additional join types
Set operations with UNION, INTERSECT, and EXCEPT
Non-correlated subqueries
Removing duplicate rows
Limit and offset results
Depth constraints
Path constraints
Criteria on references from a node
Range criteria with BETWEEN
Set criteria with IN and NOT IN
Arithmetic operands
Search and text extraction
2.14.1 Choosing a query language

At this time, ModeShape supports five query languages:
JCR-SQL2
JCR-SQL
XPath
JCR-JQOM (programmatic API)
full-text search (a language that reuses the full-text search expression grammar used in the second
parameter of the CONTAINS(...) function of the JCR-SQL2 language)
So which language should you choose?
You might think you should pick one based upon how quickly it can be executed, but that's not the case with
ModeShape. As we'll see below, ModeShape plans, optimizes, and executes all the Query instances in
exactly the same way, regardless of how they were created.
With ModeShape, none of the languages are more or less efficient than the others. So the best reason to
pick one language over another is based upon the expressiveness of the language. In other words, pick the
language for each query that most easily expresses your application's needs! Sure, the JCR-SQL2
language is by far the most expressive, and is technically a superset of JCR-SQL. But sometimes it will be
easier to specify path-oriented criteria using XPath. Or sometimes you only need to do full-text search, in
which case ModeShape's full-text search language is far easier.
Not all JCR implementations execute their queries in the same way. Some (including Jackrabbit)
have completely different execution paths for different languages, meaning queries in some
languages are simply faster than equivalent queries expressed in other languages.
2.14.2 Creating queries

There are two ways to create a JCR Query object. The first is by supplying a query expression and the
name of the query language, and this can be done with the standard JCR API:
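// Obtain the query manager for the session ...
javax.jcr.query.QueryManager queryManager = session.getWorkspace().getQueryManager();
// Create a query object ...
String language = ... // e.g. javax.jcr.query.Query.JCR_SQL2
String expression = ... // the query expression in that language
javax.jcr.query.Query query = queryManager.createQuery(expression,language);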
Before returning the Query, ModeShape finds a parser for the language given by the language parameter,
and uses this parser to create a language-independent object representation of the query. (Note that any
grammatical errors in the expression result in an immediate exception.) This object representation is what
JCR 2.0 calls the "Query Object Model", or QOM. After parsing, ModeShape embeds the QOM into the
Query object.
The second approach for creating a Query object is to programmatically build up the query using the
QueryObjectModelFactory. Again, this uses the standard JCR API. Here's a simple example:
javax.jcr.query.qom.QueryObjectModelFactory factory = queryManager.getQOMFactory();
// Create the parts of a query object ...
javax.jcr.query.qom.Source selector = factory.selector(...);
javax.jcr.query.qom.Constraint constraints = ...
javax.jcr.query.qom.Column[] columns = ...
javax.jcr.query.qom.Ordering[] orderings = ...
javax.jcr.query.qom.QueryObjectModel model =
factory.createQuery(selector,constraints,orderings,columns);
// The model is a query ...
javax.jcr.query.Query query = model;
Of course, the QueryObjectModelFactory can create lots of variations of selectors, joins, constraints, and
orderings. ModeShape fully supports this style of creating queries, and it even offers some very useful
extensions (described below).
2.14.3 Executing queries

As we mentioned above, all ModeShape Query objects contain the object representation of the query,
called the query object model. No matter which query language is used or whether the query was created
programmatically, ModeShape uses the same kind of model objects to represent every single query.
So when the JCR client executes the query:
javax.jcr.query.Query query = ...
javax.jcr.query.QueryResult result = query.execute();

ModeShape then takes the query's object model and runs it through a series of steps to plan, validate,
optimize, and finally execute the query:
1. Planning - in this step, ModeShape converts the language-independent query object model into a
canonical relational query plan that outlines the various relational operations that need to be
performed for this query. The query plan forms a tree, with each leaf node representing an access
query against the indexes. However, this plan isn't quite ready to be used.
2. Validation - not all queries that are well-formed can be executed, so ModeShape then validates the
canonical query plan to make sure that all named selectors exist, all named properties exist on the
selectors, that all aliases are properly used, and that all identifiers are resolvable. If the query fails
validation, an exception is thrown immediately.
3. Optimization - the canonical plan should mirror the actual query model, but it may not be the most
simple or efficient plan. ModeShape runs the canonical plan through a rule-based optimizer to
produce an optimum and executable plan. For example, one rule rewrites right outer joins as left outer
joins. Another rule looks for identity joins (e.g., ISSAMENODE join criteria or equi-join criteria involving
node identifiers), and if possible removes the join altogether (replacing it with additional criteria) or
copies criteria on one side of the join to the other. Another rule removes parts of the plan that (based
upon criteria) will never return any rows. Yet another rule determines the best algorithm for joining
tuples. Overall, there are about a dozen such rules, and all are intended to make the query plans
more easily and efficiently executed.
4. Execution - the optimized plan is then executed: each access query in the plan is issued and the
resulting tuples processed and combined to form the result set's tuples.
Now that you know more about how ModeShape actually works, you can understand why
ModeShape can achieve such good query performance regardless of the language you choose to
use.
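To make the optimization step a little more concrete, here is a small sketch of one rule mentioned above: rewriting a right outer join as a left outer join by swapping its inputs. The JoinPlan class is hypothetical and heavily simplified. It is not ModeShape's plan representation, and it only shows the shape of a rule-based rewrite.

```java
/** Hypothetical, heavily simplified plan node; not ModeShape's actual plan classes. */
final class JoinPlan {
    final String joinType; // for example "INNER", "LEFT OUTER" or "RIGHT OUTER"
    final String left;     // name of the left-hand source
    final String right;    // name of the right-hand source

    JoinPlan(String joinType, String left, String right) {
        this.joinType = joinType;
        this.left = left;
        this.right = right;
    }

    /** Rule: "A RIGHT OUTER JOIN B" yields the same rows as "B LEFT OUTER JOIN A". */
    static JoinPlan rewriteRightOuter(JoinPlan plan) {
        if (!"RIGHT OUTER".equals(plan.joinType)) {
            return plan; // the rule does not apply, so the plan is left untouched
        }
        return new JoinPlan("LEFT OUTER", plan.right, plan.left);
    }
}
```

After this rewrite, later rules and the executor only need to handle inner and left outer joins.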
2.14.4 JCR-SQL and JCR-SQL2 extensions

ModeShape adds several features to its support of the standard JCR-SQL and JCR-SQL2 grammars. These
extensions include support for:
1. Additional join types with FULL OUTER JOIN and CROSS JOIN
2. UNION, INTERSECT, and EXCEPT set operations
3. Non-correlated subqueries in the WHERE clause; multiple subqueries can be used in a single query,
and they can even be nested
4. Removing duplicate rows with SELECT DISTINCT ...
5. Limit the number of rows returned with LIMIT count
6. Skip initial rows with OFFSET number
7. Constrain the depth of a node with DEPTH(selectorName)
8. Constrain the path of a node with PATH(selectorName)
9. Constrain the references from a node with REFERENCE(selectorName.property) and
REFERENCE(selectorName)
10. Ranges of criteria values using BETWEEN lower AND upper and optionally specifying whether to
exclude the lower and/or upper values
11. Set criteria to specify multiple criteria values using IN and NOT IN
12. Use simple arithmetic in criteria and ORDER BY clauses, such as SCORE(type1)*3 +
SCORE(type2)
13. Use pseudo-columns to include the path, score, node name, node local name, and node depth in
result columns or in criteria
More detail of the particular extensions can be found in the JCR-SQL2 grammar.
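Simply use these extensions within your JCR-SQL or JCR-SQL2 query expression strings, and use the
standard JCR API to obtain a Query:
javax.jcr.query.QueryManager queryManager = session.getWorkspace().getQueryManager();
String language = javax.jcr.query.Query.JCR_SQL2; // or javax.jcr.query.Query.SQL for JCR-SQL
String expression = ... // USE THE EXTENSIONS HERE
javax.jcr.query.Query query = queryManager.createQuery(expression,language);
// And use the query ...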
2.14.5 Query Object Model extensions

The extensions in the JCR-SQL and JCR-SQL2 languages can also be used when building queries
programmatically using the JCR Query Object Model API. ModeShape defines as part of its public API the
org.modeshape.jcr.api.query.qom.QueryObjectModelFactory interface that extends the
standard javax.jcr.query.qom.QueryObjectModelFactory interface, and which contains methods
providing ways to construct a QOM with the extended features.
Additional join types

The standard javax.jcr.query.qom.QueryObjectModelFactory interface uses a String to specify
the join type:
package javax.jcr.query.qom;
public interface QueryObjectModelFactory {
...
/**
* Performs a join between two node-tuple sources.
*
* The query is invalid if 'left' is the same source as 'right'.
*
* @param left the left node-tuple source; non-null
* @param right the right node-tuple source; non-null
* @param joinType either QueryObjectModelConstants.JCR_JOIN_TYPE_INNER,
*        QueryObjectModelConstants.JCR_JOIN_TYPE_LEFT_OUTER, or
*        QueryObjectModelConstants.JCR_JOIN_TYPE_RIGHT_OUTER.
* @param joinCondition the join condition; non-null
* @return the join; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later,
*         on {@link #createQuery}), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public Join join( Source left,
Source right,
String joinType,
JoinCondition joinCondition ) throws InvalidQueryException,
RepositoryException;
...
}
In addition to the three standard constants, ModeShape supports two additional constant values:
javax.jcr.query.qom.QueryObjectModelConstants.JCR_JOIN_TYPE_INNER
javax.jcr.query.qom.QueryObjectModelConstants.JCR_JOIN_TYPE_LEFT_OUTER
javax.jcr.query.qom.QueryObjectModelConstants.JCR_JOIN_TYPE_RIGHT_OUTER
org.modeshape.jcr.api.query.qom.QueryObjectModelConstants.JCR_JOIN_TYPE_CROSS
org.modeshape.jcr.api.query.qom.QueryObjectModelConstants.JCR_JOIN_TYPE_FULL_OUTER
Set operations with UNION, INTERSECT, and EXCEPT

Creating a set query is very similar to creating a normal SELECT type query, but instead the following
methods on org.modeshape.jcr.api.query.qom.QueryObjectModelFactory are used:
package org.modeshape.jcr.api.query.qom;
public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
/**
* Creates a query with one or more selectors.
*
* @param source the node-tuple source; non-null
* @param constraint the constraint, or null if none
* @param orderings zero or more orderings; null is equivalent to a zero-length array
* @param columns the columns; null is equivalent to a zero-length array
* @param limit the limit; null is equivalent to having no limit
* @param isDistinct true if the query should return distinct values; or false if no
*        duplicate removal should be performed
* @return the select query; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test and the parameters given fail that
*         test. See the individual QOM factory methods for the validity criteria
*         of each query element.
* @throws RepositoryException if another error occurs.
*/
public SelectQuery select( Source source,
                           Constraint constraint,
                           Ordering[] orderings,
                           Column[] columns,
                           Limit limit,
                           boolean isDistinct ) throws InvalidQueryException,
                                                       RepositoryException;
/**
* Creates a query command that effectively appends the results of the right-hand query
* to those of the left-hand query.
*
* @param left the query command that represents left-side of the set operation;
*        non-null and must have columns that are equivalent and union-able to those
*        of the right-side query
* @param right the query command that represents right-side of the set operation;
*        non-null and must have columns that are equivalent and union-able to those
*        of the left-side query
* @param all true if duplicate rows in the left- and right-hand side results should
*        be included, or false if duplicate rows should be eliminated
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test and the parameters given fail
*         that test. See the individual QOM factory methods for the validity criteria
*         of each query element.
* @throws RepositoryException if another error occurs.
*/
public SetQuery union( QueryCommand left,
                       QueryCommand right,
                       Limit limit,
                       boolean all ) throws InvalidQueryException, RepositoryException;
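/**
* Creates a query command that returns all rows that are both in the result of the
* left-hand query and in the result of the right-hand query.
*
* @param left the query command that represents left-side of the set operation;
*        non-null and must have columns that are equivalent and union-able to those
*        of the right-side query
* @param right the query command that represents right-side of the set operation;
*        non-null and must have columns that are equivalent and union-able to those
*        of the left-side query
* @param all true if duplicate rows in the left- and right-hand side results should
*        be included, or false if duplicate rows should be eliminated
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test and the parameters given fail
*         that test. See the individual QOM factory methods for the validity criteria
*         of each query element.
* @throws RepositoryException if another error occurs.
*/
public SetQuery intersect( QueryCommand left,
                           QueryCommand right,
                           Limit limit,
                           boolean all ) throws InvalidQueryException, RepositoryException;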
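/**
* Creates a query command that returns all rows that are in the result of the left-hand
* query but not in the result of the right-hand query.
*
* @param left the query command that represents left-side of the set operation;
*        non-null and must have columns that are equivalent and union-able to those
*        of the right-side query
* @param right the query command that represents right-side of the set operation;
*        non-null and must have columns that are equivalent and union-able to those
*        of the left-side query
* @param all true if duplicate rows in the left- and right-hand side results should
*        be included, or false if duplicate rows should be eliminated
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test and the parameters given fail
*         that test. See the individual QOM factory methods for the validity criteria
*         of each query element.
* @throws RepositoryException if another error occurs.
*/
public SetQuery except( QueryCommand left,
                        QueryCommand right,
                        Limit limit,
                        boolean all ) throws InvalidQueryException, RepositoryException;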
...
}
Note that the select(...) method returns a SelectQuery while the union(...), intersect(...)
and except(...) methods return a SetQuery. The SelectQuery and SetQuery interfaces are defined
by ModeShape and both extend ModeShape's QueryCommand interface. This interface is then used in the
methods to create SetQuery.
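The SetQuery object is not executable. To create the corresponding javax.jcr.query.Query object, pass the
SetQuery to the following method on
org.modeshape.jcr.api.query.qom.QueryObjectModelFactory:
public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
/**
* Creates a set query.
*
* @param command set query; non-null
* @return the executable query; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test and the parameters given fail that
*         test. See the individual QOM factory methods for the validity criteria
*         of each query element.
* @throws RepositoryException if another error occurs.
*/
public SetQueryObjectModel createQuery( SetQuery command ) throws InvalidQueryException,
                                                                  RepositoryException;
...
}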
The resulting SetQueryObjectModel extends javax.jcr.query.Query and SetQuery and can be
executed and treated similarly to the standard javax.jcr.query.qom.QueryObjectModel (that also
extends javax.jcr.query.Query).
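The row-level meaning of the three set operations can be sketched with plain Java collections. This is only an illustration of the semantics, with each row modeled as a String; the SetOperations class is hypothetical, it handles the all flag only for union, and it says nothing about how ModeShape evaluates set queries internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of UNION, INTERSECT and EXCEPT over rows modeled as strings. */
final class SetOperations {

    /** Appends the right-hand rows to the left-hand rows; duplicates are kept only if 'all' is true. */
    static List<String> union(List<String> left, List<String> right, boolean all) {
        List<String> rows = new ArrayList<>(left);
        rows.addAll(right);
        if (all) {
            return rows;
        }
        Set<String> distinct = new LinkedHashSet<>(rows); // keeps first-seen order
        return new ArrayList<>(distinct);
    }

    /** Rows that appear in both results (duplicates eliminated). */
    static List<String> intersect(List<String> left, List<String> right) {
        Set<String> rows = new LinkedHashSet<>(left);
        rows.retainAll(right);
        return new ArrayList<>(rows);
    }

    /** Rows that appear in the left result but not in the right result (duplicates eliminated). */
    static List<String> except(List<String> left, List<String> right) {
        Set<String> rows = new LinkedHashSet<>(left);
        rows.removeAll(right);
        return new ArrayList<>(rows);
    }
}
```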
Non-correlated subqueries
ModeShape defines a Subquery interface that extends the standard
javax.jcr.query.qom.StaticOperand interface, and thus can be used on the right-hand side of any
Criteria:
public interface Subquery extends StaticOperand {

/**
* Gets the {@link QueryCommand} that makes up the subquery.
*
* @return the query command; non-null
*/
public QueryCommand getQuery();
}
Subqueries can be created by passing a QueryCommand into this
org.modeshape.jcr.api.query.qom.QueryObjectModelFactory method:
public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
/**
* Creates a subquery that can be used as a {@link StaticOperand} in another query.
*
* @param subqueryCommand the query command that is to be used as the subquery
* @return the constraint; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later,
*         on {@link #createQuery}), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public Subquery subquery( QueryCommand subqueryCommand ) throws InvalidQueryException,
                                                                RepositoryException;
...
}
The resulting Subquery is a StaticOperand that can then be used to create a Criteria.
Removing duplicate rows

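The org.modeshape.jcr.api.query.qom.QueryObjectModelFactory interface includes a variation of
the standard QueryObjectModelFactory.select(...) method with an additional isDistinct flag that
controls whether duplicate rows should be removed:
public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
public SelectQuery select( Source source,
                           Constraint constraint,
                           Ordering[] orderings,
                           Column[] columns,
                           Limit limit,
                           boolean isDistinct ) throws InvalidQueryException,
                                                       RepositoryException;
...
}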
Limit and offset results

ModeShape defines a Limit interface as a top-level object that can be used to create queries that limit the
number of rows and/or skip a number of initial rows:
public interface Limit {

/**
* Get the number of rows skipped before the results begin.
*
* @return the offset; always 0 or a positive number
*/
public int getOffset();
/**
* Get the maximum number of rows that are to be returned.
*
* @return the maximum number of rows; always positive, or equal to Integer.MAX_VALUE if
*         there is no limit
*/
public int getRowLimit();
/**
* Determine whether this limit clause is necessary.
*
* @return true if the number of rows is not limited and there is no offset, or false
*         otherwise
*/
public boolean isUnlimited();
/**
* Determine whether this limit clause defines an offset.
*
* @return true if there is an offset, or false if there is no offset
*/
public boolean isOffset();
}
These range constraints can be constructed using this
org.modeshape.jcr.api.query.qom.QueryObjectModelFactory method:
public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
/**
* Evaluates to a limit on the maximum number of tuples in the results and the
* number of rows that are skipped before the first tuple in the results.
*
* @param rowLimit the maximum number of rows; must be a positive number, or
*        Integer.MAX_VALUE if there is to be a non-zero offset but no limit
* @param offset the number of rows to skip before beginning the results; must be 0 or a
*        positive number
* @return the operand; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later, on
*         createQuery), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public Limit limit( int rowLimit,
                    int offset ) throws InvalidQueryException, RepositoryException;
...
}
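The Limit objects can then be used when creating queries using a variation of the standard
QueryObjectModelFactory.select(...) defined in the
org.modeshape.jcr.api.query.qom.QueryObjectModelFactory interface:
public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
public SelectQuery select( Source source,
                           Constraint constraint,
                           Ordering[] orderings,
                           Column[] columns,
                           Limit limit,
                           boolean isDistinct ) throws InvalidQueryException,
                                                       RepositoryException;
...
}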
Similarly, the Limit objects can be passed to the ModeShape-specific except(...), union(...),
intersect(...) methods, too.
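The effect of a row limit and an offset on a result can be sketched as follows. The RowWindow class is a hypothetical helper that works on an in-memory list. It follows the conventions described for Limit above (a positive row limit, Integer.MAX_VALUE for "no limit", and a non-negative offset), but it is not how ModeShape applies limits to query results.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper showing what a row limit and offset do to an ordered result. */
final class RowWindow {

    /**
     * Skips 'offset' rows and then returns at most 'rowLimit' rows.
     * Integer.MAX_VALUE as the row limit means "no limit".
     */
    static <T> List<T> apply(List<T> rows, int rowLimit, int offset) {
        if (rowLimit <= 0 || offset < 0) {
            throw new IllegalArgumentException("rowLimit must be positive and offset must not be negative");
        }
        int from = Math.min(offset, rows.size());
        // Use long arithmetic so that Integer.MAX_VALUE as the limit cannot overflow.
        long requestedEnd = (long) from + rowLimit;
        int to = (int) Math.min(requestedEnd, (long) rows.size());
        return new ArrayList<>(rows.subList(from, to));
    }
}
```

An offset beyond the end of the result simply yields an empty list in this sketch.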
Depth constraints
ModeShape defines a NodeDepth interface that extends the standard
javax.jcr.query.qom.DynamicOperand interface, and thus can be used as part of a WHERE clause to
constrain the depth of the nodes accessed by a selector:
public interface NodeDepth extends javax.jcr.query.qom.DynamicOperand {

/**
* Get the selector symbol upon which this operand applies.
*
* @return the selector name used by this operand; never null
*/
public String getSelectorName();
}

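public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
/**
* Evaluates to a LONG value equal to the depth of a node in the specified selector.
*
* The query is invalid if selector is not the name of a selector in the query.
*
* @param selectorName the selector name; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later, on
*         createQuery), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public NodeDepth nodeDepth( String selectorName ) throws InvalidQueryException,
                                                         RepositoryException;
...
}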
Path constraints
ModeShape defines a NodePath interface that extends the standard
javax.jcr.query.qom.DynamicOperand interface, and thus can be used as part of a WHERE clause to
constrain the path of nodes accessed by a selector:
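public interface NodePath extends javax.jcr.query.qom.DynamicOperand {

/**
* Get the selector symbol upon which this operand applies.
*
* @return the selector name used by this operand; never null
*/
public String getSelectorName();
}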

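public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
/**
* Evaluates to a PATH value equal to the prefix-qualified path of a node in the specified
* selector.
*
* The query is invalid if selector is not the name of a selector in the query.
*
* @param selectorName the selector name; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later, on
*         createQuery), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public NodePath nodePath( String selectorName ) throws InvalidQueryException,
                                                       RepositoryException;
...
}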
Criteria on references from a node

ModeShape defines a ReferenceValue interface that extends the standard
javax.jcr.query.qom.DynamicOperand interface, and thus can be used as part of a WHERE or ORDER
BY clause:
public interface ReferenceValue extends DynamicOperand {

...
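/**
* Get the selector symbol upon which this operand applies.
*
* @return the selector name used by this operand; never null
*/
public String getSelectorName();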
/**
* Get the name of the one reference property.
*
* @return the property name; or null if this operand applies to any reference property
*/
public String getPropertyName();
}
These reference value operands allow a query to easily place constraints on a particular REFERENCE
property or (more importantly) any REFERENCE properties on the nodes. The former is a simpler
alternative to using a regular comparison constraint with the REFERENCE property on one side and the
"jcr:uuid" property on the other. The latter effectively means "where the node references (with any
property) some other nodes", and this is something that standard JCR-SQL2 cannot represent.
They are created using these org.modeshape.jcr.api.query.qom.QueryObjectModelFactory
methods:
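public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
/**
* Creates a dynamic operand that evaluates to the REFERENCE value of any property
* on the specified selector.
*
* The query is invalid if:
* - selector is not the name of a selector in the query, or
* - property is not a syntactically valid JCR name.
*
* @param selectorName the selector name; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later, on
*         createQuery), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public ReferenceValue referenceValue( String selectorName ) throws InvalidQueryException,
                                                                   RepositoryException;
/**
* Creates a dynamic operand that evaluates to the REFERENCE value of the specified
* property on the specified selector.
*
* The query is invalid if:
* - selector is not the name of a selector in the query, or
* - property is not a syntactically valid JCR name.
*
* @param selectorName the selector name; non-null
* @param propertyName the reference property name; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later, on
*         createQuery), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public ReferenceValue referenceValue( String selectorName,
                                      String propertyName ) throws InvalidQueryException,
                                                                   RepositoryException;
...
}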
Range criteria with BETWEEN

ModeShape defines a Between interface that extends the standard
javax.jcr.query.qom.Constraint interface, and thus can be used as part of a WHERE clause:
public interface Between extends Constraint {

/**
* Get the dynamic operand specification.
*
* @return the dynamic operand; never null
*/
public DynamicOperand getOperand();
/**
* Get the lower bound operand.
*
* @return the lower bound; never null
*/
public StaticOperand getLowerBound();
/**
* Get the upper bound operand.
*
* @return the upper bound; never null
*/
public StaticOperand getUpperBound();
/**
* Return whether the lower bound is to be included in the results.
*
* @return true if the {@link #getLowerBound() lower bound} is to be included, or false
*         otherwise
*/
public boolean isLowerBoundIncluded();
/**
* Return whether the upper bound is to be included in the results.
*
* @return true if the {@link #getUpperBound() upper bound} is to be included, or false
*         otherwise
*/
public boolean isUpperBoundIncluded();
}

public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
/**
* Tests that the value (or values) defined by the supplied dynamic operand are
* within a specified range. The range is specified by a lower and upper bound,
* and whether each of the boundary values is included in the range.
*
* @param operand the dynamic operand describing the values that are to be constrained
* @param lowerBound the lower bound of the range
* @param upperBound the upper bound of the range
* @param includeLowerBound true if the lower boundary value is to be included
* @param includeUpperBound true if the upper boundary value is to be included
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later, on
*         createQuery), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public Between between( DynamicOperand operand,
                        StaticOperand lowerBound,
                        StaticOperand upperBound,
                        boolean includeLowerBound,
                        boolean includeUpperBound ) throws InvalidQueryException,
                                                           RepositoryException;
...
}
To create a NOT BETWEEN ... criteria, simply create the Between criteria object, and then pass that into
the standard QueryObjectModelFactory.not(Criteria) method.
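The role of the two inclusion flags, and of negating the result for NOT BETWEEN, can be illustrated with a small stand-alone sketch. The RangeCheck class is hypothetical and operates on plain Comparable values rather than on QOM operands.

```java
/** Hypothetical illustration of BETWEEN semantics with optional exclusion of either bound. */
final class RangeCheck {

    static <T extends Comparable<T>> boolean between(T value, T lower, T upper,
                                                     boolean includeLower, boolean includeUpper) {
        int vsLower = value.compareTo(lower);
        int vsUpper = value.compareTo(upper);
        boolean aboveLower = includeLower ? vsLower >= 0 : vsLower > 0;
        boolean belowUpper = includeUpper ? vsUpper <= 0 : vsUpper < 0;
        return aboveLower && belowUpper;
    }

    /** NOT BETWEEN is simply the negation of the BETWEEN test. */
    static <T extends Comparable<T>> boolean notBetween(T value, T lower, T upper,
                                                        boolean includeLower, boolean includeUpper) {
        return !between(value, lower, upper, includeLower, includeUpper);
    }
}
```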
Set criteria with IN and NOT IN

ModeShape defines a SetCriteria interface that extends the standard
javax.jcr.query.qom.Constraint interface, and thus can be used as part of a WHERE clause:
public interface SetCriteria extends Constraint {

/**
* Get the dynamic operand specification for the left-hand side of the set criteria.
*
* @return the dynamic operand; never null
*/
public DynamicOperand getOperand();
/**
* Get the static operands for this set criteria.
*
* @return the static operand; never null and never empty
*/
public Collection<? extends StaticOperand> getValues();
}
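These set constraints can be constructed using this
org.modeshape.jcr.api.query.qom.QueryObjectModelFactory method:
public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
/**
* Tests that the value (or values) defined by the supplied dynamic operand are
* found within the specified set of values.
*
* @param operand the dynamic operand describing the values that are to be constrained
* @param values the static operand values; may not be null or empty
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later, on
*         createQuery), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public SetCriteria in( DynamicOperand operand,
                       StaticOperand... values ) throws InvalidQueryException,
                                                        RepositoryException;
...
}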
To create a NOT IN criteria, simply create the IN criteria to get a SetCriteria object, and then pass that
into the standard QueryObjectModelFactory.not(Criteria) method.
Arithmetic operands
ModeShape defines an ArithmeticOperand interface that extends the
javax.jcr.query.qom.DynamicOperand, and thus can be used anywhere a DynamicOperand can be
used.
public interface ArithmeticOperand extends DynamicOperand {

/**
* Get the operator for this binary operand.
*
* @return the operator; never null
*/
public String getOperator();
/**
* Get the left-hand operand.
*
* @return the left-hand operand; never null
*/
public DynamicOperand getLeft();
/**
* Get the right-hand operand.
*
* @return the right-hand operand; never null
*/
public DynamicOperand getRight();
}
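These can be constructed using additional
org.modeshape.jcr.api.query.qom.QueryObjectModelFactory methods:
public interface QueryObjectModelFactory extends javax.jcr.query.qom.QueryObjectModelFactory {
...
/**
* Create an arithmetic dynamic operand that adds the numeric values of the two supplied
* operands.
*
* @param left the left-hand-side operand; not null
* @param right the right-hand-side operand; not null
* @return the dynamic operand; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later, on
*         createQuery), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public ArithmeticOperand add( DynamicOperand left,
                              DynamicOperand right ) throws InvalidQueryException,
                                                            RepositoryException;
/**
* Create an arithmetic dynamic operand that subtracts the numeric value of the second
* operand from the numeric value of the first.
*
* @param left the left-hand-side operand; not null
* @param right the right-hand-side operand; not null
* @return the dynamic operand; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later, on
*         createQuery), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public ArithmeticOperand subtract( DynamicOperand left,
                                   DynamicOperand right ) throws InvalidQueryException,
                                                                 RepositoryException;
/**
* Create an arithmetic dynamic operand that multiplies the numeric value of the first
* operand by the numeric value of the second.
*
* @param left the left-hand-side operand; not null
* @param right the right-hand-side operand; not null
* @return the dynamic operand; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later, on
*         createQuery), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public ArithmeticOperand multiply( DynamicOperand left,
                                   DynamicOperand right ) throws InvalidQueryException,
                                                                 RepositoryException;
/**
* Create an arithmetic dynamic operand that divides the numeric value of the first operand
* by the numeric value of the second.
*
* @param left the left-hand-side operand; not null
* @param right the right-hand-side operand; not null
* @return the dynamic operand; non-null
* @throws InvalidQueryException if a particular validity test is possible on this method,
*         the implementation chooses to perform that test (and not leave it until later, on
*         createQuery), and the parameters given fail that test
* @throws RepositoryException if the operation otherwise fails
*/
public ArithmeticOperand divide( DynamicOperand left,
                                 DynamicOperand right ) throws InvalidQueryException,
                                                               RepositoryException;
...
}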
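Because each arithmetic operand is itself a dynamic operand, operands can be nested to build expressions such as SCORE(type1)*3 + SCORE(type2). The following sketch mimics that composition with a tiny expression tree evaluated against one result row. All names here (ArithmeticSketch, Operand, column, constant) are hypothetical and only model the idea; they are not ModeShape or JCR types.

```java
import java.util.Map;

/** Hypothetical expression tree that mirrors how binary arithmetic operands compose. */
final class ArithmeticSketch {

    /** An operand evaluated against one row, modeled as column name to numeric value. */
    interface Operand {
        double evaluate(Map<String, Double> row);
    }

    /** Reads a numeric column from the row (assumed to be present). */
    static Operand column(final String name) {
        return row -> row.get(name);
    }

    /** A literal value; included only to keep the example expression readable. */
    static Operand constant(final double value) {
        return row -> value;
    }

    static Operand add(Operand left, Operand right) {
        return row -> left.evaluate(row) + right.evaluate(row);
    }

    static Operand subtract(Operand left, Operand right) {
        return row -> left.evaluate(row) - right.evaluate(row);
    }

    static Operand multiply(Operand left, Operand right) {
        return row -> left.evaluate(row) * right.evaluate(row);
    }

    static Operand divide(Operand left, Operand right) {
        return row -> left.evaluate(row) / right.evaluate(row);
    }
}
```

With this sketch, add(multiply(column("score1"), constant(3)), column("score2")) plays the role of the expression above, with the two column names standing in for the two scores.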
2.14.6 Search and text extraction

The full-text search language and JCR-SQL2's full-text search constraint both have the ability to find nodes
using a simpler search-engine-like expression with wildcards and phrases.
One can pretty easily imagine how ModeShape performs these matches against a node's name and
properties containing STRING, LONG, DATE, DOUBLE, DECIMAL, NAME, and PATH values. But what
about BINARY values? In order to determine whether the search-engine-like search expressions match,
doesn't ModeShape have to determine what text is contained within each BINARY value?
The short answer is that yes, ModeShape can only match against the BINARY value if it can extract the text
from that value. And this is where text extraction comes into play.
A text extractor is a component that knows how to extract searchable text from a BINARY value. Each text
extractor describes whether it can process files of a particular MIME type, and if it can, then ModeShape
will (when necessary) call the extractor to obtain the searchable text for a supplied BINARY value.
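The selection of an extractor by MIME type can be sketched as follows. The Extractor interface and ExtractorRegistry class are hypothetical and deliberately minimal. ModeShape's real text extractor API is different, and this sketch only shows the "ask each extractor whether it supports the MIME type, then call it" idea.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical registry that picks a text extractor based on the MIME type of a binary value. */
final class ExtractorRegistry {

    /** Hypothetical extractor contract; not ModeShape's actual interface. */
    interface Extractor {
        boolean supportsMimeType(String mimeType);

        String extractText(byte[] binaryContent);
    }

    private final List<Extractor> extractors = new ArrayList<>();

    void register(Extractor extractor) {
        extractors.add(extractor);
    }

    /** Returns the searchable text, or null if no registered extractor handles the MIME type. */
    String searchableText(String mimeType, byte[] binaryContent) {
        for (Extractor extractor : extractors) {
            if (extractor.supportsMimeType(mimeType)) {
                return extractor.extractText(binaryContent);
            }
        }
        return null; // the binary value cannot take part in full-text matching
    }

    /** A trivial extractor for plain text, assumed here to be UTF-8 encoded. */
    static Extractor plainText() {
        return new Extractor() {
            public boolean supportsMimeType(String mimeType) {
                return "text/plain".equals(mimeType);
            }

            public String extractText(byte[] binaryContent) {
                return new String(binaryContent, StandardCharsets.UTF_8);
            }
        };
    }
}
```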
2.15 Registering custom node types

As described in Defining custom node types, the JCR 2.0 specification defines the Compact Node Definition
(CND) format to easily and compactly specify node type definitions, but uses this format only within the
specification. Instead, the only standard API for registering custom node types is via the standard
programmatic API.
ModeShape fully supports this standard API, but it also defines a non-standard API for reading node type
definitions from either CND files or the older Jackrabbit XML format. This non-standard API is described in
this section.
2.15.1 Registering using the standard API

2.15.2 Registering using CND files
ModeShape defines in its public API an org.modeshape.jcr.api.nodetype.NodeTypeManager interface
that extends the standard javax.jcr.nodetype.NodeTypeManager interface:
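import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import javax.jcr.RepositoryException;
import javax.jcr.UnsupportedRepositoryOperationException;
import javax.jcr.nodetype.InvalidNodeTypeDefinitionException;
import javax.jcr.nodetype.NodeTypeExistsException;

public interface NodeTypeManager extends javax.jcr.nodetype.NodeTypeManager {

    /**
     * Read the supplied stream containing node type definitions in the standard JCR 2.0 Compact
     * Node Definition (CND) format or non-standard Jackrabbit XML format, and register the node
     * types with this repository.
     *
     * @param stream the stream containing the node type definitions in CND format
     * @param allowUpdate a boolean stating whether existing node type definitions should be
     *        modified/updated
     * @throws IOException if there is a problem reading from the supplied stream
     * @throws InvalidNodeTypeDefinitionException if the <code>NodeTypeDefinition</code> is invalid.
     * @throws NodeTypeExistsException if <code>allowUpdate</code> is <code>false</code> and the
     *         <code>NodeTypeDefinition</code> specifies a node type name that is already registered.
     * @throws UnsupportedRepositoryOperationException if this implementation does not support
     *         node type registration.
     * @throws RepositoryException if another error occurs.
     */
    void registerNodeTypes( InputStream stream,
                            boolean allowUpdate )
        throws IOException, InvalidNodeTypeDefinitionException, NodeTypeExistsException,
        UnsupportedRepositoryOperationException, RepositoryException;

    /**
     * Read the supplied file containing node type definitions in the standard JCR 2.0 Compact
     * Node Definition (CND) format or non-standard Jackrabbit XML format, and register the node
     * types with this repository.
     *
     * @param file the file containing the node types
     * @param allowUpdate a boolean stating whether existing node type definitions should be
     *        modified/updated
     * @throws IOException if there is a problem reading from the supplied file
     * @throws RepositoryException if the node type definitions are invalid or cannot be registered
     */
    void registerNodeTypes( File file,
                            boolean allowUpdate ) throws IOException, RepositoryException;

    /**
     * Read the node type definitions at the supplied URL, in the standard JCR 2.0 Compact
     * Node Definition (CND) format or non-standard Jackrabbit XML format, and register the node
     * types with this repository.
     *
     * @param url the URL that can be resolved to the file containing the node type definitions
     *        in CND format
     * @param allowUpdate a boolean stating whether existing node type definitions should be
     *        modified/updated
     * @throws IOException if there is a problem reading from the supplied URL
     * @throws RepositoryException if the node type definitions are invalid or cannot be registered
     */
    void registerNodeTypes( URL url,
                            boolean allowUpdate ) throws IOException, RepositoryException;
}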
Simply cast the NodeTypeManager instance obtained from the Workspace.getNodeTypeManager() method:

Workspace workspace = session.getWorkspace();
org.modeshape.jcr.api.nodetype.NodeTypeManager nodeTypeMgr =
    (org.modeshape.jcr.api.nodetype.NodeTypeManager) workspace.getNodeTypeManager();
// Then register the node types in one or more CND files
// using a Java File object ...
File myCndFile = ...
nodeTypeMgr.registerNodeTypes(myCndFile,true);
// or a URL that is resolvable to a CND file ...
URL myCndUrl = ...
nodeTypeMgr.registerNodeTypes(myCndUrl,true);
// or an InputStream to the content of a CND file ...
InputStream myCndStream = ...
nodeTypeMgr.registerNodeTypes(myCndStream,true);
Alternatively, you can cast the result of the Session.getWorkspace() method to
org.modeshape.jcr.api.Workspace, which overrides the getNodeTypeManager() method to return
org.modeshape.jcr.api.nodetype.NodeTypeManager:

org.modeshape.jcr.api.Workspace workspace = (org.modeshape.jcr.api.Workspace)
session.getWorkspace();
org.modeshape.jcr.api.nodetype.NodeTypeManager nodeTypeMgr = workspace.getNodeTypeManager();
// Then register the node types in one or more CND files ...
Registering CND files via configuration

In addition to using the ModeShape API as described above, it is possible to configure a repository to import,
at startup, one or more CND files using the following format:
{
"name" : "Repository with node types",
"storage" : {
"transactionManagerLookup" :
"org.infinispan.transaction.lookup.DummyTransactionManagerLookup"
},
"workspaces" : {
"predefined" : ["ws1", "ws2"],
"allowCreation" : true
},
"node-types" : ["cnd/cars.cnd", "cnd/aircraft.cnd"]
}
where the node-types attribute accepts an array of strings, representing paths to CND files, accessible at
runtime.
If CND files are configured to be imported at repository startup, then on every startup they will
overwrite any previously registered node types that have the same names.
2.15.3 Jackrabbit XML format

ModeShape also supports the older non-standard Jackrabbit format for defining node types, but only to
make it easier for people to switch from Jackrabbit to ModeShape. Jackrabbit 2.x no longer uses this
format, and Jackrabbit 1.x only used this XML format for built-in node types and discouraged users from
modifying it. However, some users of Jackrabbit 1.x still added their custom node types to this file.
Use the standard CND format wherever possible, and use this non-standard XML format only if
you're trying to switch from Jackrabbit to ModeShape (with as little work as possible). Once you've
decided to use ModeShape, convert your XML files to CND files.
The DTD for the non-standard XML files can be found here.
2.16 Sequencing
Sequencers
Automatic Sequencers
Manual Sequencers
Built-in sequencers
Custom sequencers
Configuring an automatic sequencer
Input path
Output paths
Workspaces in input and output paths
Example path expression
Waiting for automatic sequencing
Many repositories are used (at least in part) to manage files and other artifacts, including service definitions,
policy files, images, media, documents, presentations, application components, reusable libraries,
configuration files, application installations, database schemas, management scripts, and so on. Most JCR
repository implementations will store those files and maybe index them for searching.
But ModeShape does more. ModeShape sequencers can automatically unlock the structured information
buried within all of those files, and this useful content derived from your files is then stored back in the
repository where your client applications can search, access, and analyze it using the JCR API. Sequencing
is performed in the background, so the client application does not have to wait for (or even know about) the
sequencing operations.
The following diagram shows conceptually how these automatic sequencers do this.
As of ModeShape 3.6.0.Final, your applications can use a session to explicitly invoke a sequencer on a
specified property. We call these manual sequencers. Any generated output is included in the session's
transient state, so nothing is persisted until the application calls session.save().
Prior to ModeShape 3.6.0.Final, ModeShape only had support for automatic sequencers.
2.16.1 Sequencers
Sequencers are just POJOs that implement a specific interface, and when they are called they simply
process the supplied input, extract meaningful information, and produce an output structure of nodes that
somehow represents that meaningful information. This derived information can take almost any form, and it
typically varies for each sequencer. For example, ModeShape comes with an image sequencer that extracts
the simple metadata from different kinds of image files (e.g., JPEG, GIF, PNG, etc.). Another example is the
Compact Node Definition (CND) sequencer that processes the CND files to extract and produce a structured
representation of the node type definitions, property definitions, and child node definitions contained within
the file. A third example is a sequencer that works on XML Schema Documents, which might parse the XSD
content and generate nodes that mirror the various elements, attributes, and types defined within the
schema document.
Sequencers allow a ModeShape repository to help you extract more meaning from the artifacts you are
already managing, and make it much easier for applications to find and use all that valuable information,
all without your applications doing anything extra.
Each repository can be configured with any number of sequencers. Each one includes a name, the POJO
class name, an optional classpath (for environments with multiple named classloaders), and any number of
POJO-specific fields. Upon startup, ModeShape creates each sequencer by instantiating the POJO and
setting all of the fields, then initializing the sequencer so it can register any namespaces or node type
definitions.
There are two kinds of sequencers, automatic and manual.
Automatic Sequencers
An automatic sequencer has a path expression that dictates which content in the repository the sequencer is
to operate upon. These path expressions are really patterns and look somewhat like simple regular
expressions. When persisted content in the repository changes, ModeShape automatically looks to see
which (if any) sequencers might be able to run on the changed content. If any of the sequencers do match,
ModeShape automatically calls them by supplying the changed content. At that point, the sequencer then
processes the supplied content and generates the output, and ModeShape then saves that generated output
to the repository.
To use an automatic sequencer, simply add or change content in the repository that matches the
sequencer's path expression. For example, if an XSD sequencer is configured for nodes with paths like
"/files//*.xsd", then simply upload a file into that location and save it. ModeShape will detect that
the XSD sequencer should be called, and will do the rest. The generated content will magically appear in the
repository.
Manual Sequencers
A manual sequencer is simply a sequencer that is configured without path expressions. Because no path
expressions are provided, ModeShape cannot determine when/where these sequencers should be applied.
Instead, manual sequencers are intended to be called by client applications.
For example, consider that a session just uploaded a file at "/files/schemas/Customers.xsd", and this
node has a primary type of "nt:file". (This means the file's content is stored in the "jcr:data" property
of the "jcr:content" child node.) The session has not yet saved any of this information, so it is still in the
session's transient state. The following code shows how an XSD sequencer configured with name "XSD
Sequencer" is manually invoked to place the generated content directly under the
"/files/schemas/Customers.xsd" node (and adjacent to the "jcr:content" node):
Node fileNode = session.getNode("/files/schemas/Customers.xsd");

Property content = fileNode.getProperty("jcr:content/jcr:data");
Node output = fileNode; // could be anywhere!
boolean success = session.sequence("XSD Sequencer", content, output);
The sequence(...) method returns true if the sequencer generated output, or "false" if the sequencer
couldn't use the input and instead did nothing.
Remember that when sequence(...) returns, any generated output is only in the session's
transient state, and session.save() must be called to persist this state.
2.16.2 Built-in sequencers

ModeShape comes with sequencer implementations for a variety of file types:
Input files                            Derives
XML Documents                          A node is created for each XML element, properties are created for each XML
                                       attribute, and each declared namespace is registered in the workspace.
XML Schema Documents (XSDs)            A node structure that represents the structure and semantics of the XSD,
                                       including the attribute declarations, element declarations, simple type
                                       definitions, complex type definitions, import statements, include statements,
                                       attribute group declarations, annotations, other components, and even
                                       attributes with a non-schema namespace.
WSDL 1.1 files                         A node structure that represents the WSDL file's messages, port types,
                                       bindings, services, types (including embedded XML Schemas), documentation,
                                       and extension elements (including HTTP, SOAP and MIME bindings).
ZIP files                              Extracts the files and folders contained in the archive file, representing
                                       them as nt:file and nt:folder nodes. The resulting files will be candidates
                                       for further sequencing.
Delimited and fixed-width text files   A simple node structure reflecting the rows of data fields.
DDL files                              A node structure that represents the parsed data definition statements from
                                       SQL-92, Oracle, Derby, and PostgreSQL. The resulting structure is largely the
                                       same for all dialects, though some dialects have non-standard additions to
                                       their grammar that result in dialect-specific additions to the graph
                                       structure.
Teiid relational models                A rich node structure containing all the objects defined in the models,
                                       including the catalogs/schemas, tables, views, columns, primary keys, foreign
                                       keys, indexes, procedures, procedure results, extension properties, and data
                                       source information. The structure will also contain the select, update,
                                       insert and delete transformations in the case of virtual models.
Teiid virtual databases                A node structure that mirrors the relational model files, XSDs, and
                                       additional metadata. The resulting relational model files will be candidates
                                       for further sequencing.
Java files                             A node structure representing the Java source or class file's package
                                       structure, class declarations, class and member attribute declarations, class
                                       and member method declarations with signature (but not implementation logic),
                                       enumerations with each enumeration literal value, annotations, and JavaDoc
                                       information for all of the above.
Image files                            A node containing the image metadata, including file format, image size,
                                       number of bits per pixel, number of images, comments, and physical dimensions.
MP3 files                              A node that contains the metadata for the audio files, including the track's
                                       title, author, album name, year, and comment.
Please see the Built-in sequencers section of the documentation for more detail on all of these sequencers,
including how to configure them and the structure of the output they generate.
2.16.3 Custom sequencers

As mentioned earlier, a sequencer is actually just a plain old Java object (POJO). Creating a sequencer is
pretty straightforward: create a Java class that extends a single abstract class, package it up for use, and
then configure your repository to use it. We walk you through all these steps in the Custom sequencers
section of the documentation.
2.16.4 Configuring an automatic sequencer

Each sequencer must be configured to describe the areas or types of content that the sequencer is capable
of handling. This is done by specifying these patterns using path expressions that identify the nodes (or node
patterns) that should be sequenced and where to store the output generated by the sequencer.
A path expression consists of two parts: a selection criterion (or input path) and an output path:
inputPath => outputPath
Input path
The inputPath part defines an expression for the path of a node that is to be sequenced. Input paths consist
of '/' separated segments, where each segment represents a pattern for a single node's name (including the
same-name-sibling indexes) and '@' signifies a property name.
Let's first look at some simple examples:
Input Path     Description
/a/b           Match node "b" that is a child of the top level node "a". Neither node may have any
               same-name-siblings.
/a/*           Match any child node of the top level node "a".
/a/*.txt       Match any child node of the top level node "a" that also has a name ending in ".txt".
/a/b/@c        Match the property "c" of node "/a/b".
/a/b[2]        The second child named "b" below the top level node "a".
/a/b[2,3,4]    The second, third or fourth child named "b" below the top level node "a".
/a/b[*]        Any (and every) child named "b" below the top level node "a".
//a/b          Any node named "b" that exists below a node named "a", regardless of where node "a" occurs.
               Again, neither node may have any same-name-siblings.
With these simple examples, you can probably discern the most important rules. First, the '*' is a wildcard
character that matches any character or sequence of characters in a node's name (or index if appearing in
between square brackets), and can be used in conjunction with other characters (e.g., "*.txt").
Second, square brackets (i.e., '[' and ']') are used to match a node's same-name-sibling index. You can put
a single non-negative number or a comma-separated list of non-negative numbers. Use '0' to match a node
that has no same-name-siblings, or any positive number to match the specific same-name-sibling.
Third, combining two delimiters (e.g., "//") matches any sequence of nodes, regardless of what their names
are or how many nodes there are. This is often used with other patterns to identify nodes at any level that
match those patterns. Three or more sequential slash characters are treated as two.
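To make the first and third rules concrete, here is a minimal sketch that translates a very small subset of the input path syntax into a java.util.regex pattern. It is not how ModeShape evaluates path expressions: the SimpleInputPath class is a hypothetical name, and the sketch deliberately handles only literal names, the '*' wildcard and "//", ignoring square brackets, parentheses, '@' and same-name-sibling indexes.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; covers literal names, '*' and '//'.
class SimpleInputPath {

    static Pattern toRegex(String inputPath) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < inputPath.length()) {
            char c = inputPath.charAt(i);
            if (c == '/') {
                int start = i;
                while (i < inputPath.length() && inputPath.charAt(i) == '/') {
                    i++;
                }
                // Two or more slashes: any number of intermediate nodes (three or more act as two).
                regex.append(i - start >= 2 ? "(?:/[^/]+)*/" : "/");
            } else if (c == '*') {
                // Wildcard: any run of characters within a single node name.
                regex.append("[^/]*");
                i++;
            } else {
                // Everything else is matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
                i++;
            }
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String inputPath, String nodePath) {
        return toRegex(inputPath).matcher(nodePath).matches();
    }
}
```

With this sketch, SimpleInputPath.matches("/files//*.xsd", "/files/schemas/Customers.xsd") is true, while SimpleInputPath.matches("/a/*", "/a/x/y") is false because '*' never crosses a '/' separator.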
Many input paths can be created using just these simple rules. However, input paths can be more
complicated. Here are some more examples:
Input Path                         Description
/a/(b|c|d)                         Match children of the top level node "a" that are named "b", "c" or "d".
                                   None of the nodes may have same-name-sibling indexes.
/a/b[c/d]                          Match node "b" child of the top level node "a", when node "b" has a child
                                   named "c", and "c" has a child named "d". Node "b" is the selected node,
                                   while nodes "c" and "d" are used as criteria but are not selected.
/a(/(b|c|d|)/e)[f/g/@something]    Match node "/a/b/e", "/a/c/e", "/a/d/e", or "/a/e" when they also have a child
                                   "f" that itself has a child "g" with property "something". None of the nodes
                                   may have same-name-sibling indexes.
These examples show a few more advanced rules. Parentheses (i.e., '(' and ')') can be used to define a set
of options for names, as shown in the first and third examples. Whatever part of the selected node's path appears
between the parentheses is captured for use within the output path, similar to regular expressions. Thus, the
first input path in the previous table would match node "/a/b", and "b" would be captured and could be used
within the output path using "$1", where the number used in the output path identifies the parentheses. Here
are some examples of what's captured by the parentheses and available for use in the output path:
Input Path                         $1                                    $2                         $3
/a/(b|c|d)                         "b" or "c" or "d"                     n/a                        n/a
/a/b[c/d]                          n/a                                   n/a                        n/a
/a(/(b|c|d|)/e)[f/g/@something]    "/b/e" or "/c/e" or "/d/e" or "/e"    "b" or "c" or "d" or ""    n/a
Square brackets can also be used to specify criteria on a node's properties or children. Whatever appears in
between the square brackets does not appear in the selected node. This distinction between the selected
path and the changed path becomes important when writing custom sequencers.
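Because the capture rule is described as similar to regular expressions, it can be illustrated with java.util.regex directly. The sketch below is an analogy rather than ModeShape's implementation: CaptureGroupSketch and resolveOutput are hypothetical names, "/output/$1" is a made-up output path, and the input is treated as a plain Java regex, which happens to work for the first example, "/a/(b|c|d)", but not for ModeShape-specific syntax such as square-bracket criteria.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class CaptureGroupSketch {

    // Matches nodePath against inputRegex and expands $1, $2, ... in outputTemplate
    // with the text captured by the corresponding pair of parentheses.
    static Optional<String> resolveOutput(String inputRegex, String outputTemplate, String nodePath) {
        Matcher matcher = Pattern.compile(inputRegex).matcher(nodePath);
        if (!matcher.matches()) {
            return Optional.empty();
        }
        String result = outputTemplate;
        // Replace higher-numbered groups first so that "$10" is not mistaken for "$1" followed by "0".
        for (int group = matcher.groupCount(); group >= 1; group--) {
            String captured = matcher.group(group);
            result = result.replace("$" + group, captured == null ? "" : captured);
        }
        return Optional.of(result);
    }
}
```

For instance, resolving the template "/output/$1" for the node "/a/c" against "/a/(b|c|d)" yields "/output/c", whereas the node "/a/x" does not match and yields an empty result.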
Output paths
The outputPath part of a path expression defines where the content derived by the sequencer should be
stored.
Typically, this points to a location in a different part of the repository, but it can actually be left off if the
sequenced output is to be placed directly under the selected node. The output path can also use any of the
capture groups used in the input path.
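As a small illustration of the two-part structure, the following sketch splits a path expression at the "=>" separator. It assumes that leaving off the output path means omitting the "=> outputPath" part altogether, which the text implies but does not spell out; the PathExpressionSketch class and the "/output/$1" path are hypothetical and not part of ModeShape.

```java
import java.util.Optional;

// Hypothetical holder for the two parts of a path expression; illustration only.
class PathExpressionSketch {
    final String inputPath;
    final Optional<String> outputPath; // empty: place output directly under the selected node

    private PathExpressionSketch(String inputPath, Optional<String> outputPath) {
        this.inputPath = inputPath;
        this.outputPath = outputPath;
    }

    static PathExpressionSketch parse(String expression) {
        int arrow = expression.indexOf("=>");
        if (arrow < 0) {
            // No separator: the whole expression is the input path.
            return new PathExpressionSketch(expression.trim(), Optional.empty());
        }
        String input = expression.substring(0, arrow).trim();
        String output = expression.substring(arrow + 2).trim();
        return new PathExpressionSketch(input,
                                        output.isEmpty() ? Optional.empty() : Optional.of(output));
    }
}
```

Parsing "/a/(b|c|d) => /output/$1" gives the input path "/a/(b|c|d)" and the output path "/output/$1", while parsing "/a/b" alone gives an empty output path.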
Workspaces in input and output paths

So far, we've talked about how input paths and output paths are independent of the workspace. However,
there are times when it's desirable to configure sequencers to only work against content in a specific
workspace. In these cases, it is possible to specify the workspace names before the path. For example:
Input Path             Description
default:/a/(b|c|d)     Match nodes in the "default" workspace that are children of the top level node "a"
                       and named "b", "c" or "d". None of the nodes may have same-name-sibling indexes.
:/a/(b|c|d)            Match nodes in any workspace that are children of the top level node "a" and named
                       "b", "c" or "d". None of the nodes may have same-name-sibling indexes. (This is
                       equivalent to the path "/a/(b|c|d)".)
Again, the rules are pretty straightforward. You can leave off the workspace name, or you can prepend the
path with "workspaceNamePattern:", where "workspaceNamePattern" is a regular-expression pattern
used to match the applicable workspace names. A blank pattern implies any match, and is a shorthand
notation for the ".*" regular expression. Note that the repository names may not include forward slashes
(e.g., '/') or colons (e.g., ':').
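The "workspaceNamePattern:" prefix rule can be sketched as follows. This is an illustration under stated assumptions, not ModeShape's parser: WorkspaceScopedPath is a hypothetical class, and the sketch assumes the prefix ends at the first colon that appears before the first '/' of the path, so that colons in names such as "jcr:data" later in the path are left alone.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class WorkspaceScopedPath {
    final Pattern workspacePattern;
    final String path;

    private WorkspaceScopedPath(Pattern workspacePattern, String path) {
        this.workspacePattern = workspacePattern;
        this.path = path;
    }

    static WorkspaceScopedPath parse(String expression) {
        int colon = expression.indexOf(':');
        int slash = expression.indexOf('/');
        // A prefix exists only when a colon appears before the path's first slash.
        boolean hasPrefix = colon >= 0 && (slash < 0 || colon < slash);
        String prefix = hasPrefix ? expression.substring(0, colon) : "";
        String path = hasPrefix ? expression.substring(colon + 1) : expression;
        // A blank (or missing) pattern is shorthand for ".*", i.e. any workspace.
        return new WorkspaceScopedPath(Pattern.compile(prefix.isEmpty() ? ".*" : prefix), path);
    }

    boolean appliesTo(String workspaceName) {
        return workspacePattern.matcher(workspaceName).matches();
    }
}
```

Under these assumptions, "default:/a/(b|c|d)" applies only to the workspace named "default", while ":/a/(b|c|d)" and "/a/(b|c|d)" apply to every workspace.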
Example path expression

Let's look at an example sequencer path expression:
default://(*.(jpg|jpeg|gif|bmp|pcx|png)[*])[/jcr:content@jcr:data] => meta:/images/$1
This matches a changed "jcr:data" property on a node named "jcr:content[1]" that is a child of a node
whose name ends with ".jpg", ".jpeg", ".gif", ".bmp", ".pcx", or ".png" (and that may have any
same-name-sibling index) appearing at any level in the "default" workspace. Note how the selected path
captures the filename (the segment containing the file extension), including any same-name-sibling index.
This filename is then used in the output path, which places the sequenced content under the
"/images" node in the "meta" workspace.
So, consider a PNG image file stored in the "default" workspace in a repository configured with an
image sequencer and the aforementioned path expression, where the file is stored at
"/jsmith/photos/2011/08/09/reunion.png" using the standard "nt:file" pattern. This means that
an "nt:file" node named "reunion.png" is created at the designated path, and a child node named
"jcr:content" is created with a primary type of "nt:resource" and a "jcr:data" binary property (in
which the image file's content is stored).
When the session is saved with these changes, ModeShape discovers that the
/jsmith/photos/2011/08/09/reunion.png/jcr:content/jcr:data
property satisfies the criteria of the sequencer, and calls the sequencer's execute(...) method with the
selected node, input node, input property and output node of "/images" in the "meta" workspace. When the
execute(...) method completes successfully, the session with the changes in the "meta" workspace is
saved and the content is immediately available to all other sessions using that workspace.
2.16.5 Waiting for automatic sequencing

When your application creates or uploads content that will kick off a sequencing operation, the sequencing is
actually done asynchronously. If you want to be notified when the sequencing is complete, you can use
ModeShape's observation feature to register a listener for the sequencing event.
The first step is to create a class that implements "javax.jcr.observation.EventListener".
Normally this is pretty easy, but in our case we want to block until the listener is notified via a separate
thread. An easy way to do this is to use a java.util.concurrent.CountDownLatch, and to count
down the latch as soon as we get our event. (If we carefully register the listener using criteria for only the
sequencing output we're interested in, we'll know we'll only receive one event.)
Here's our implementation that captures from the first event whether the sequencing was successful and the
path of the output node, and then counts down the latch:
import java.util.concurrent.CountDownLatch;

public class SequencingListener implements javax.jcr.observation.EventListener {

    private final CountDownLatch latch;
    private volatile String sequencedNodePath;
    private volatile boolean successfulSequencing;

    public SequencingListener( CountDownLatch latch ) {
        this.latch = latch;
    }

    @Override
    public void onEvent( javax.jcr.observation.EventIterator events ) {
        // Only the first event is of interest; ignore any later ones
        if ( sequencedNodePath != null ) return;
        try {
            javax.jcr.observation.Event event = events.nextEvent();
            this.sequencedNodePath = event.getPath();
            this.successfulSequencing = event.getType() ==
                org.modeshape.jcr.api.observation.Event.Sequencing.NODE_SEQUENCED;
            // Release the thread that is waiting on the latch
            latch.countDown();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public boolean isSequencingSuccessful() {
        return this.successfulSequencing;
    }

    public String getSequencedNodePath() {
        return sequencedNodePath;
    }
}
We could then register this using the public API:
ObservationManager observationManager = session.getWorkspace().getObservationManager();
String outputPath = ... // the path at or below which the output is to be placed
// Listen for sequencing completion or failure events, via the ALL type ...
int eventTypes = org.modeshape.jcr.api.observation.Event.Sequencing.ALL;
boolean isDeep = true; // true if outputPath is an ancestor of the sequencer output, false if identical
String[] uuids = null; // Don't care about UUIDs of nodes for sequencing events
String[] nodeTypes = null; // Don't care about node types of output nodes for sequencing events
boolean noLocal = false; // We do want events for sequencing that happens locally (as well as remotely)
// Now create a listener implementation that will be called when the event is here ...
CountDownLatch latch = new CountDownLatch(1);
SequencingListener listener = new SequencingListener(latch);
observationManager.addEventListener(listener, eventTypes, outputPath, isDeep,
                                    uuids, nodeTypes, noLocal);
// Now, block until the latch is decremented (by the listener) or our max wait time is exceeded;
// await(...) returns false if the wait timed out
boolean notified = latch.await(15, TimeUnit.SECONDS);
if ( notified && listener.isSequencingSuccessful() ) {
    // Grab the output produced by the sequencer ...
} else {
    // Handle the failure or timeout ...
}
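One detail worth making explicit is that CountDownLatch.await(timeout, unit) returns false when the timeout elapses before the latch reaches zero, so "no event arrived in time" can be told apart from "sequencing failed". The sketch below shows that pattern with the JDK alone; LatchWaitSketch, ResultHolder and Outcome are hypothetical names standing in for the listener and are not ModeShape types.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a listener plus the code that waits on it; illustration only.
class LatchWaitSketch {

    enum Outcome { SUCCEEDED, FAILED, TIMED_OUT, INTERRUPTED }

    static final class ResultHolder {
        private final CountDownLatch latch = new CountDownLatch(1);
        private volatile boolean success;

        // Called from the notifying thread (the role played by onEvent in a listener).
        void complete(boolean success) {
            this.success = success;
            latch.countDown();
        }

        // Called from the waiting thread.
        Outcome await(long timeout, TimeUnit unit) {
            try {
                // false means the timeout elapsed before complete(...) was called
                if (!latch.await(timeout, unit)) {
                    return Outcome.TIMED_OUT;
                }
                return success ? Outcome.SUCCEEDED : Outcome.FAILED;
            } catch (InterruptedException e) {
                // Preserve the interrupt status for callers further up the stack
                Thread.currentThread().interrupt();
                return Outcome.INTERRUPTED;
            }
        }
    }
}
```

An application could then report a timeout differently from a sequencing failure, for example by retrying the wait or by logging that sequencing is still in progress.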
3 Using ModeShape
This page is a placeholder to document various aspects of using ModeShape, including
deployment, configuration, and general usage. These topics will be reorganized.
Deploying to web and app servers

RepositoryFactory and configuration files
RepositoryFactory and JNDI
Lookup Repository in JNDI
Deploying to JBoss AS7
Deploying to other web or app servers
Lookup Repositories in JNDI
3.1 Deploying to web and app servers

ModeShape can pretty easily be deployed to servlet containers and application servers. There are two
options:
Use the RepositoryFactory mechanism and specify in the URL the location of the repository
configuration
Use a JNDI ObjectFactory to automatically create the javax.jcr.Repository instance and
register it in JNDI, then access it in JNDI
3.1.1 RepositoryFactory and configuration files

3.1.2 RepositoryFactory and JNDI
3.1.3 Lookup Repository in JNDI
One of the more popular ways to find a Repository instance is to use JNDI, though this only works in
environments like web servers or application servers that contain a JNDI implementation. It also assumes
that a Repository instance has already been registered in JNDI; how this is done is specific to the
environment.
Deploying to JBoss AS7
When ModeShape is installed and deployed into a JBoss AS7 installation, ModeShape will automatically
register each deployed repository in JNDI, which by default is at "/jcr/repositoryName" (where
"repositoryName" is the name of the repository). Simply use the JBoss AS7 administration tools to
configure and deploy as many repositories as needed, including customizing the JNDI location of each
repository.
Then in your application, simply look up each Repository instance from JNDI:

javax.jcr.Repository repository = (javax.jcr.Repository) envCtx.lookup("jcr/myrepo");
Deploying to other web or app servers
There are several ways of deploying ModeShape and registering the Repository instances in JNDI. One is
to simply create a web application that sets up ModeShape and its repositories, where each repository
configuration file specifies the location in JNDI where ModeShape should register that repository.
However, many servlet containers and application servers provide a way to configure a JNDI
ObjectFactory that will create the necessary objects as soon as a client uses JNDI to look up the object
at a particular location; after that, the objects will be registered in JNDI and found. ModeShape provides an
ObjectFactory implementation, so simply configure your server to use it, and deploy your applications to
look in JNDI for ModeShape's Repository instances.
Although each server configures the ObjectFactory instances differently, they all basically define the
following:
The name of the ObjectFactory implementation class. For ModeShape, this is
"org.modeshape.jcr.JndiRepositoryFactory".
Custom properties. For ModeShape, there are two for each repository:
The "configFile" property that specifies the location for the JSON repository configuration
file.
The "repositoryName" property that specifies the name of the repository. This value must
match the name in the configuration file.
Note that although the "repositoryName" can be found in the configuration file, specifying it does
allow the factory to quickly find the named repository if it is already deployed, without having to read the
configuration file.
Here's an example of a fragment of the conf/context.xml for Tomcat that registers the repository named
"My Repository" in JNDI at "jcr/myrepo", using the JSON configuration file for that repository:
<Resource name="jcr/myrepo"
auth="Container"
factory="org.modeshape.jcr.JndiRepositoryFactory"
repositoryName="My Repository"
configFile="/resource/path/to/repository-config.json" />
Simply provide a similar fragment for each repository that is to be registered in JNDI.
Then in your application, simply look up the Repository instance from the correct JNDI location:

javax.jcr.Repository repository = (javax.jcr.Repository) envCtx.lookup("jcr/myrepo");
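The lookup above uses an envCtx variable that is not shown. A minimal sketch of how it might be obtained follows, assuming (as is conventional in Tomcat) that resources declared in context.xml are bound under the "java:comp/env" context; JndiLookupSketch is a hypothetical helper, and only standard javax.naming calls are used.

```java
import java.util.Optional;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

// Hypothetical helper for illustration only.
class JndiLookupSketch {

    // Looks up a name relative to java:comp/env. Returns empty when no JNDI provider
    // is available or nothing is bound at that name.
    static <T> Optional<T> lookup(String name, Class<T> type) {
        try {
            Context initialContext = new InitialContext();
            // Assumption: the container binds application resources under java:comp/env
            Context envCtx = (Context) initialContext.lookup("java:comp/env");
            return Optional.ofNullable(type.cast(envCtx.lookup(name)));
        } catch (NamingException e) {
            return Optional.empty();
        }
    }
}
```

In a deployed web application, a call such as JndiLookupSketch.lookup("jcr/myrepo", javax.jcr.Repository.class) would then return the repository registered by the fragment above, or an empty result if the resource is not configured.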
3.1.4 Lookup Repositories in JNDI

This approach requires using a ModeShape-specific extension to the standard JCR API. The
org.modeshape.jcr.api.Repositories interface represents a collection of named repository
instances. The ModeShape engine implements this interface, so you can register the ModeShape engine in
JNDI at a particular location, and have your application use JNDI to find the Repositories instance and
get the Repository instance by name.
You might be wondering why you'd want to use this mechanism, since it relies upon non-standard APIs.
Well, this mechanism was designed for one particular use case that can't really use any of the other
(preferred) mechanisms: a web application that works with any available repository, but needs to know which
repositories are available.
Consider a web application that serves as a web-based front end for a bunch of repositories. Typically, the
repository name is part of the URL, and so this application works regardless of which or how many
repositories are deployed as long as the client includes the name of a running repository in the URL. But
how does the client know which repository is available? Normally, the client would just ask the web service
for the list of repositories. But that's not possible using any of the aforementioned mechanisms, and that is
why we created the Repositories interface:
public interface Repositories {

    /**
     * Get the names of the available repositories.
     *
     * @return the immutable set of repository names provided by this server; never null
     */
    Set<String> getRepositoryNames();

    /**
     * Return the JCR Repository with the supplied name.
     *
     * @param repositoryName the name of the repository to return; may not be null
     * @return the repository with the given name; never null
     * @throws javax.jcr.RepositoryException if no repository exists with the given name or
     *         there is an error communicating with the repository
     */
    javax.jcr.Repository getRepository( String repositoryName ) throws javax.jcr.RepositoryException;
}
As you can see, these two methods allow the application to discover which repositories are available and to
obtain any of the named repositories. And this will work even when repositories are deployed or undeployed.
When ModeShape is installed and deployed into a JBoss AS7 installation, ModeShape's
Repositories implementation is registered in JNDI at "/jcr". Simply use the JBoss AS7 administration
tools to configure and deploy repositories.
Then in your application, simply look up the Repositories instance from JNDI, and get the names of the
available repositories and/or get a Repository with a particular name.

org.modeshape.jcr.api.Repositories repositories =
(org.modeshape.jcr.api.Repositories) envCtx.lookup("jcr");
// Get the names of the available repositories ...
Set<String> repoNames = repositories.getRepositoryNames();
// Get the repository given a name ...
String repoName = //...
javax.jcr.Repository repo = repositories.getRepository(repoName);

There are several ways of deploying ModeShape and registering the Repositories instance in JNDI. One
is to simply create a web application that sets up ModeShape and registers it in JNDI; as long as this web
app is deployed first, your other web applications will be able to find it.
However, many servlet containers and application servers provide a way to configure a JNDI
ObjectFactory that will create the necessary objects as soon as a client uses JNDI to look up the object
at a particular location. ModeShape provides an ObjectFactory implementation, so simply configure your
server to use it, and deploy your applications to look in JNDI for ModeShape's Repositories instance.
Although each server configures the ObjectFactory instances differently, they all basically define the
following:
The name of the ObjectFactory implementation class. For ModeShape, this is
"org.modeshape.jcr.JndiRepositoryFactory".
Custom properties. For ModeShape, this is the "configFiles" property with a value that is a
comma-separated list of locations for each of the JSON repository configuration files that should be
deployed.
Note that "configFiles" is plural. Be sure to use this property name; don't use "configFile"
unless you want to register a single Repository instance rather than the Repositories object.
Here's an example of a fragment of the conf/context.xml for Tomcat that registers ModeShape's
Repositories implementation in JNDI at "jcr", and deploys two repositories using a different JSON
configuration file for each one:
<Resource name="jcr"
auth="Container"
factory="org.modeshape.jcr.JndiRepositoryFactory"
configFiles="/resource/path/to/first/repository-config.json,
/resource/path/to/second/repository-config.json" />
Then in your application, simply look up the Repositories instance from JNDI, and get the names of the
available repositories and/or get a Repository with a particular name.

org.modeshape.jcr.api.Repositories repositories =
(org.modeshape.jcr.api.Repositories) envCtx.lookup("jcr");
// Get the names of the available repositories ...
Set<String> repoNames = repositories.getRepositoryNames();
// Get the repository given a name ...
String repoName = //...
javax.jcr.Repository repo = repositories.getRepository(repoName);
Unregistering the Repositories object from JNDI will shut down the ModeShape engine, and the
JndiRepositoryFactory will recreate it if needed.
3.2 ModeShape in Java applications

ModeShape makes it easy to use JCR repositories within web and Java EE applications deployed to virtually
any web or application server. ModeShape makes this even easier with JBoss AS7, since ModeShape can
be installed, managed and monitored as a true JBoss AS7 subsystem.
But ModeShape is also small and lightweight enough to embed directly into your own Java SE
applications. The only thing you need to determine is how much control
and management your application will need to have over the ModeShape repositories. On one hand, if your
application needs to just look up and use one or more JCR Repository instances, then it could use the JCR
API as we've seen before. On the other hand, your application may need more control over dynamically
deploying, monitoring, changing the configuration, and undeploying individual repositories. In this case, your
application can use the ModeShape-specific API.
The ModeShape Engine

Using the ModeShape Engine API
Creating the engine and deploying repositories
Using a programmatic Infinispan configuration (advanced)
Modifying the configuration programmatically before deployment (advanced)
Modifying the configuration of a deployed repository (advanced)
Shutting down and undeploying Repositories
Shutting down the engine
Use RepositoryFactory and the JCR API
Pros and cons
Configuring repositories
ModeShape repository configuration files
Variables
Infinispan configuration file
Clustering
Cluster name and JGroups
Storage
Indexes
3.2.1 The ModeShape Engine

ModeShape provides a single component, called the ModeShape engine, that controls and manages your
repositories. The engine is just a simple Java class, called ModeShapeEngine, that your application
instantiates, starts, uses to (dynamically) deploy and undeploy repositories, and stops before
your application shuts down. There are two ways to do this: use the ModeShape-specific API, or use only
the JCR API (even to manage and use multiple repositories in a single application). We'll cover both
approaches, including talking about the pros and cons of each.
Using the ModeShape Engine API

The ModeShape Engine API allows your application to fully control the ModeShape repositories and the
lifecycle of all repositories, and is best-suited for applications that dynamically create and manage multiple
repositories, or that need explicit control over ModeShape's lifecycle.
There are primarily two classes that are involved: ModeShapeEngine and RepositoryConfiguration.
Creating the engine and deploying repositories

The ModeShapeEngine class represents a container for named javax.jcr.Repository instances, and
can:
dynamically deploy new Repository instances
start and stop Repository instances
change the configuration of a Repository, even when it is running and being used
obtain the names of all deployed Repository instances
dynamically undeploy Repository instances
shut down the entire engine while (gracefully or immediately) shutting down any running Repository
instances
The ModeShapeEngine class is thread-safe, so your application can use multiple threads to do any of
these operations. Each ModeShapeEngine instance is completely independent of the others, so your
application can even create multiple ModeShapeEngine instances within the same JVM. However, most
applications will simply need a single instance.
Most ModeShape components are thread-safe and can safely be used concurrently by multiple
threads. This includes the ModeShapeEngine and implementations of javax.jcr.Repository,
javax.jcr.Session, javax.jcr.Node, javax.jcr.Property, and other JCR interfaces.
It also includes immutable classes like RepositoryConfiguration. Remember, however,
that each Session instance can contain transient changes, so do not have multiple threads
share a Session to perform writes. The threads will succeed in making concurrent changes, but
the transient state of the Session will be a combination of all the changes, and calls to
Session.save() can persist unexpected and potentially invalid content.
Your application can create a new ModeShapeEngine with its no-argument constructor:
org.modeshape.jcr.ModeShapeEngine engine = new ModeShapeEngine();
At this point the engine exists in a minimal state and must be started before it can be used. To do this,
call the start() method, which will block while the engine initializes its small internal state:
engine.start();
A new repository is deployed by reading in its JSON configuration document (which we'll learn about later)
and then passing that to the engine:
// Read the configuration from a file, URL, stream, or content String
RepositoryConfiguration config = RepositoryConfiguration.read(...);
javax.jcr.Repository repo = engine.deploy(config);
The deploy(...) method first validates the JSON document to make sure it is structurally correct and
satisfies the schema; any problems result in an exception containing the validation errors. If the configuration
is valid and there isn't already a deployed repository with the same name, the deploy(...) method will
then create and return the Repository instance.
Each repository can also be started and stopped, although unlike the engine a repository will automatically
start when your application attempts to create a Session.
One advantage of using the Engine API is that your application can get the names of the deployed
Repository instances and, given a repository name, can return the running state of the Repository as
well as the Repository instance:
Set<String> names = engine.getRepositoryNames();
for ( String name : names ) {
    State state = engine.getRepositoryState(name);
    if ( State.RUNNING == state ) {
        // do something with this info
    }
    Repository repo = engine.getRepository(name);
}
Note that the Repository doesn't need to be running in order to get it. In fact, each Repository instance
can be started explicitly or will automatically start as soon as the Repository.login(...) method is
called.
Using a programmatic Infinispan configuration (advanced)

ModeShape's repository configuration files will usually reference an Infinispan configuration file. Sometimes,
however, you may want to define the Infinispan configuration programmatically.
To do this, simply use Infinispan's public API to create and/or modify an existing Infinispan configuration for
a cache, and then set up your RepositoryConfiguration to use a special "Environment" object:
RepositoryConfiguration config = ... // Obtain by reading or editing

// Use Infinispan's API to create a cache configuration. You could also use Infinispan's API to
// read in an existing configuration and edit it. But for this example, we'll do something trivial
// like set the cache mode to "synchronous distributed"...
String cacheName = "mycachename";
ConfigurationBuilder configurationBuilder = new ConfigurationBuilder();
configurationBuilder.clustering().cacheMode(CacheMode.DIST_SYNC);
Configuration ispnConfig = configurationBuilder.build();

// Create a local environment that we'll set up to own the external components ModeShape needs ...
LocalEnvironment environment = new LocalEnvironment();
if ( useOurOwnCacheContainer ) {
    // Optionally define our own cache container. This is entirely optional
    String cacheContainerName = "mycontainer";
    CacheContainer cacheContainer = new DefaultCacheManager();
    environment.addCacheContainer(cacheContainerName, cacheContainer);
    environment.defineCache(cacheContainerName, cacheName, ispnConfig);
} else {
    environment.defineCache(cacheName, ispnConfig);
}

// Now obtain a clone of the repository configuration, except that the result should use our
// local environment ...
config = config.with(environment);
The code first defines the name of the Infinispan cache and uses the Infinispan API to build a new
cache Configuration. (Note you should define the whole configuration or read in an existing configuration
to modify.)
It then instantiates a new org.modeshape.jcr.LocalEnvironment object that owns all
non-ModeShape components, and registers the cache configuration and/or cache container:
If you want to define and manage your own cache container, simply instantiate it, register it with the
environment, and then define your cache (the first branch of the if statement); or
If you just want to use a default container, you can simply define your cache configuration (the else branch).
In both cases, be sure that the value of cacheName matches the "cacheName" value in the repository
configuration.
Finally, obtain a copy of your original RepositoryConfiguration that uses your LocalEnvironment
instance.
Modifying the configuration programmatically before deployment (advanced)

Sometimes your application will need to review or modify a repository configuration. If you need to do this
before you deploy the repository, then you can edit the JSON document using ModeShape's editor API.
Here's a very simple example:
// Read in the existing configuration ...
RepositoryConfiguration config = RepositoryConfiguration.read("path/to/config.json");

// Edit the document ...
Editor editor = config.edit();
editor.setString(RepositoryConfiguration.FieldName.JNDI_NAME, "new-jndi-name");

// Create a new configuration with the edited document ...
RepositoryConfiguration newConfig = new RepositoryConfiguration(editor, config.getName());
At this point, you can deploy the new configuration:
javax.jcr.Repository repo = engine.deploy(newConfig);
or even write out the configuration to a JSON file:
OutputStream stream = ...
org.infinispan.schematic.document.Json.write(newConfig.getDocument(), stream);
or write the JSON to a string:
String json = org.infinispan.schematic.document.Json.write(newConfig.getDocument());
Modifying the configuration of a deployed repository (advanced)

Sometimes you want to change the configuration of a repository that is already deployed and
running.
Each Repository instance keeps a reference to its immutable RepositoryConfiguration. But that
configuration can be edited to alter the repository's configuration even if that Repository is running and
being used by JCR clients. Here's the basic workflow for changing the configuration of a deployed
Repository:
String repoName = ...
RepositoryConfiguration deployedConfig = engine.getRepositoryConfiguration(repoName);

// Create an editor ...
Editor editor = deployedConfig.edit();

// Use the editor to modify the JSON configuration (we'll do something trivial here) ...
EditableDocument storage = editor.getOrCreateDocument(FieldName.STORAGE);
EditableDocument binaries = storage.getOrCreateDocument(FieldName.BINARY_STORAGE);
binaries.setNumber(FieldName.MINIMUM_BINARY_SIZE_IN_BYTES, 8096);

// Get the changes made by the editor and validate them ...
Changes changes = editor.getChanges();
Results validationResults = deployedConfig.validate(changes);
if ( validationResults.hasErrors() ) {
    // Report the errors
    System.out.println(validationResults);
} else {
    // Update the deployed repository's configuration with these changes ...
    engine.update(repoName, changes);
}
The example obtains the RepositoryConfiguration for the named repository, obtains an editor for it, and then
manipulates the JSON document to get or create the "storage" nested document, and then
inside that get or create the "binaryStorage" nested document, and inside that set the
"minimumBinarySizeInBytes" field to 8096 bytes (roughly 8K). The example then gets the changes made by our
editor, validates the changes, and either writes out the validation problems or applies the
changes.
The engine.update(...) method call applies the configuration in a consistent and thread-safe
manner. It first obtains an internal lock, grabs the repository's current configuration (which may have
changed since we first read it), applies the changes that were made by the editor, validates the
configuration, updates the running repository with the new valid configuration, and releases the internal lock.
Note that this can all be done even when there are other parts of your application that are still using the
Repository to read and update content.
Of course, some configuration changes are more drastic, like changing the Infinispan cache where a
repository stores all its content. These kinds of changes can still be made, but will not take effect until the
repository is shut down and restarted.
This process may seem complicated, but it means that your application doesn't have to coordinate or
centralize the changes. Instead, multiple threads can safely make changes to the same repository
configuration without having to worry about locking or synchronizing the changes. Of course, if multiple
threads make different changes to the same configuration property, the last one to be applied will win.
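The lock-then-merge behavior described above can be modeled in a few lines of plain Java. The sketch below is not ModeShape's implementation: the class ConfigHolderSketch, its map-based configuration, and its validation rule are hypothetical stand-ins for the engine, the RepositoryConfiguration, and schema validation. It only illustrates why callers need no external synchronization and why the last change applied to a field wins.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model of a thread-safe configuration update; not ModeShape code. */
final class ConfigHolderSketch {

    private final ReentrantLock lock = new ReentrantLock();
    // The current configuration is an immutable snapshot that is swapped as a whole.
    private Map<String, Object> current = Collections.emptyMap();

    /** Returns the current immutable snapshot. */
    Map<String, Object> current() {
        lock.lock();
        try {
            return current;
        } finally {
            lock.unlock();
        }
    }

    /** Applies the changes to whatever configuration is current when the lock is obtained. */
    void update( Map<String, Object> changes ) {
        lock.lock();
        try {
            Map<String, Object> updated = new HashMap<String, Object>(current);
            updated.putAll(changes); // a later change to the same field overwrites an earlier one
            validate(updated); // throws before the snapshot is replaced
            current = Collections.unmodifiableMap(updated);
        } finally {
            lock.unlock();
        }
    }

    // Assumed rule, for illustration only: the binary size threshold must be a non-negative number.
    private static void validate( Map<String, Object> config ) {
        Object size = config.get("minimumBinarySizeInBytes");
        if ( size != null && (!(size instanceof Number) || ((Number)size).longValue() < 0) ) {
            throw new IllegalArgumentException("minimumBinarySizeInBytes must be a non-negative number");
        }
    }
}
```

Because each update is merged into the snapshot that is current at the moment the lock is held, two threads changing different fields both succeed, and a rejected change leaves the previous snapshot in place.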
Shutting down and undeploying Repositories

Repository instances can be shut down and undeployed:
String repoName = ...
Future<Boolean> future = engine.undeploy(repoName);
future.get(); // optional, but blocks until repository is completely shut down and removed
Note that the ModeShapeEngine.undeploy(String) call will undeploy the repository
(meaning no new sessions can be created) and asynchronously shut the repository down (closing all existing
sessions). Because it is asynchronous, the undeploy(...) method returns immediately, but it returns a
java.util.concurrent.Future object that the caller can optionally use to block until the repository has
completely shut down (the future.get() call).
Shutting down the engine

And finally, the entire engine can be shut down:
Future<Boolean> future = engine.shutdown();
if ( future.get() ) { // optional, but blocks until engine is completely shut down or interrupted
    System.out.println("Shut down ModeShape");
}
Once again, the shutdown() method is asynchronous, but it returns a Future so that the caller can block
if needed. There is an alternative form of shutdown that takes a boolean parameter specifying whether the
engine should force the shutdown of all running repositories, or whether the shutdown should fail if there is
at least one running repository:
boolean forceShutdown = false;
Future<Boolean> future = engine.shutdown(forceShutdown);
if ( future.get() ) { // optional, but blocks until engine is completely shut down or interrupted
    System.out.println("Shut down ModeShape.");
} else {
    System.out.println("At least one repository is in use, so shutdown aborted.");
}
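Both undeploy(...) and shutdown() hand back a Future<Boolean>, so an application that does not want to block indefinitely can wait with a timeout instead of calling get() without arguments. The helper below is a minimal sketch using only java.util.concurrent; the class name ShutdownWaitSketch and the choice to treat a timeout, a failure, or an interruption as "not shut down" are assumptions made for illustration, not part of the ModeShape API.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for waiting on an asynchronous undeploy or shutdown. */
final class ShutdownWaitSketch {

    /**
     * Waits up to the given time for the Future to complete.
     *
     * @return true only if the operation completed within the timeout and reported success
     */
    static boolean await( Future<Boolean> future, long timeout, TimeUnit unit ) {
        try {
            return Boolean.TRUE.equals(future.get(timeout, unit));
        } catch ( TimeoutException e ) {
            return false; // still shutting down; the caller may retry or give up
        } catch ( ExecutionException e ) {
            return false; // the shutdown itself failed
        } catch ( InterruptedException e ) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}
```

With the engine from the examples above, a call might look like ShutdownWaitSketch.await(engine.shutdown(), 30, TimeUnit.SECONDS), where the 30-second limit is an arbitrary value chosen for the example.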
3.2.2 Use RepositoryFactory and the JCR API

The simplest approach a Java SE application can take is to use only the JCR 2.0 API. We described in the
Introduction to JCR how an application can use the J2SE Service Loader mechanism and JCR's
RepositoryFactory API to find a JCR Repository instance:
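Map<String,String> parameters = ... // the parameters that identify the repository
javax.jcr.Repository repository = null;
for ( javax.jcr.RepositoryFactory factory : java.util.ServiceLoader.load(javax.jcr.RepositoryFactory.class) ) {
    repository = factory.getRepository(parameters);
    if ( repository != null ) break;
}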
This approach is great if your application is designed to use different JCR implementations and you don't
want to use implementation-specific APIs. You can even load the properties from a file:
java.io.InputStream stream = ... // get the stream to the properties file
java.util.Properties parameters = new java.util.Properties();
parameters.load(stream); // or reader
When embedding ModeShape into an application, the parameters map should contain at a minimum a
single property that defines the URL to the repository's configuration file. Thus the properties file might look
like this:
org.modeshape.jcr.URL = file://path/to/configFile.json
or you can create the parameters programmatically:
Map<String,String> parameters = new HashMap<String,String>();
parameters.put("org.modeshape.jcr.URL","file://path/to/configFile.json");
In addition to the "org.modeshape.jcr.URL" parameter, ModeShape also looks for a
"org.modeshape.jcr.RepositoryName" parameter. Each repository configuration file contains the name
of the repository, so this "RepositoryName" parameter is not required. But providing it allows ModeShape's
RepositoryFactory implementation to see if the named Repository has already been deployed without
having to read the configuration file. If it doesn't find a Repository that was deployed with that name and that
configuration file, the factory will automatically deploy the specified configuration and return the named
repository.
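The properties-file form of these parameters can be tried out with nothing but the JDK. The following sketch parses properties text into the kind of Map that would be handed to RepositoryFactory.getRepository(...); the class name FactoryParametersSketch and the repository name "Simplest" used in the usage note are illustrative assumptions, and the lookup call itself is left out because it needs a JCR implementation on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper that turns properties-file text into RepositoryFactory parameters. */
final class FactoryParametersSketch {

    static Map<String, String> fromPropertiesText( String text ) {
        Properties properties = new Properties();
        try {
            // Properties.load(...) skips the whitespace around the '=' separator
            properties.load(new StringReader(text));
        } catch ( IOException e ) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> parameters = new HashMap<String, String>();
        for ( String key : properties.stringPropertyNames() ) {
            parameters.put(key, properties.getProperty(key));
        }
        return parameters;
    }
}
```

For example, passing the two lines "org.modeshape.jcr.URL = file://path/to/configFile.json" and "org.modeshape.jcr.RepositoryName = Simplest" yields a map with exactly those two keys and the values without the surrounding spaces.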
As a convenience, ModeShape provides two constants you can use in your application when
programmatically creating the parameters to pass to RepositoryFactory:
package org.modeshape.jcr.api;

public interface RepositoryFactory extends javax.jcr.RepositoryFactory {
    public static final String URL = "org.modeshape.jcr.URL";
    public static final String REPOSITORY_NAME = "org.modeshape.jcr.RepositoryName";
    ...
}
Pros and cons

Obviously the most important benefit of your applications using the JCR RepositoryFactory to find the
Repository instances is that it is completely independent of ModeShape (or any other JCR 2.0
implementation). This may be very important if your application needs to work with multiple JCR
implementations.
On the other hand, the JCR 2.0 API doesn't provide a way to manage the Repository instances. For
example, your application may want to shut down a repository after it has finished using it. Or perhaps it
wants to alter the configuration of a repository while it is in use. In these cases, your application may want to
use the ModeShape-specific Engine API.
3.2.3 Configuring repositories

Whether your application uses the JCR 2.0 RepositoryFactory to obtain its repositories or the
ModeShape Engine API to explicitly manage and access its repositories, your application will need to have a
separate ModeShape configuration file for each repository you want to use. You'll also likely want one
Infinispan configuration file for each ModeShape Engine instance, defining the caches used by all of the
repositories in that engine.
ModeShape repository configuration files

Each ModeShape repository is configured with a separate and independent JSON file that adheres to our
JSON Schema. Every field within the configuration has a sensible default, so actually the following is a
completely valid configuration:
Simplest.json
{ }
Note that the name of the repository is derived from the filename. It is more idiomatic, however, to at least
specify the repository name:
Simplest.json
{
"name" : "Simplest"
}
When deployed, this configuration specifies a non-clustered repository named "Simplest" that stores the
content, binary values, and query indexes on the local file system under a local directory named "Simplest".
Of course, you're very likely going to want to expressly state the various configuration fields for your
repositories.
Here's a far more complete example for a repository named "DataRepository" that uses most of the
available fields:
DataRepository.json
{
    "name" : "DataRepository",
    "transactionMode" : "auto",
    "monitoring" : {
        "enabled" : true
    },
    "workspaces" : {
        "predefined" : ["otherWorkspace"],
        "allowCreation" : true
    },
    "storage" : {
        "cacheName" : "DataRepository",
        "cacheConfiguration" : "infinispan_configuration.xml",
        "transactionManagerLookup" : "org.infinispan.transaction.lookup.GenericTransactionManagerLookup",
        "binaryStorage" : {
            "type" : "file",
            "directory" : "DataRepository/binaries",
            "minimumBinarySizeInBytes" : 4096
        }
    },
    "security" : {
        "anonymous" : {
            "username" : "<anonymous>",
            "roles" : ["readonly","readwrite","admin"],
            "useOnFailedLogin" : false
        },
        "providers" : [
            {
                "name" : "My Custom Security Provider",
                "classname" : "com.example.MyAuthenticationProvider"
            },
            {
                "classname" : "JAAS",
                "policyName" : "modeshape-jcr"
            }
        ]
    },
    "query" : {
        "enabled" : true,
        "rebuildUponStartup" : "if_missing",
        "textExtracting": {
            "threadPool" : "test",
            "extractors" : {
                "customExtractor": {
                    "name" : "MyFileType extractor",
                    "classname" : "com.example.myfile.MyExtractor"
                },
                "tikaExtractor":{
                    "name" : "General content-based extractor",
                    "classname" : "tika"
                }
            }
        },
        "indexStorage" : {
            "type" : "filesystem",
            "location" : "DataRepository/indexes",
            "lockingStrategy" : "native",
            "fileSystemAccessType" : "auto"
        },
        "indexing" : {
            "threadPool" : "modeshape-workers",
            "analyzer" : "org.apache.lucene.analysis.standard.StandardAnalyzer",
            "similarity" : "org.apache.lucene.search.DefaultSimilarity",
            "batchSize" : -1,
            "indexFormat" : "LUCENE_35",
            "readerStrategy" : "shared",
            "mode" : "sync",
            "asyncThreadPoolSize" : 1,
            "asyncMaxQueueSize" : 0,
            "backend" : {
                "type" : "lucene"
            },
            "hibernate.search.custom.overridden.property" : "value"
        }
    },
    "sequencing" : {
        "removeDerivedContentWithOriginal" : true,
        "threadPool" : "modeshape-workers",
        "sequencers" : {
            "ZIP Sequencer" : {
                "description" : "ZIP Files loaded under '/files' and extracted into '/sequenced/zip/$1'",
                "classname" : "ZipSequencer",
                "pathExpressions" : ["default:/files(//)(*.zip[*])/jcr:content[@jcr:data] => default:/sequenced/zip/$1"]
            },
            "Delimited Text File Sequencer" : {
                "classname" : "org.modeshape.sequencer.text.DelimitedTextSequencer",
                "pathExpressions" : [
                    "...)/jcr:content[@jcr:data] => default:/sequenced/text/delimited/$1"
                ],
                "splitPattern" : ","
            }
        }
    },
    "clustering" : {
    }
}
Most of the field values match their defaults, although by default:
the "storage/cacheConfiguration" field is not specified, meaning an Infinispan cache
configuration is dynamically created to store the content in local memory;
the "workspaces/predefined", "query/extractors", and "sequencing/sequencers" fields
are each empty arrays;
the "query/indexing/hibernate.search.*" properties are not defined; and
the "security/providers" field defaults to an empty array, meaning only the anonymous provider
is configured.
Of course, the standard JSON formatting rules apply.
Variables
Variables may appear anywhere within the configuration JSON document's string field values. If a variable is
to be used within a non-string field, simply use a string field within the JSON document. When ModeShape
reads in the JSON document, these variables will be replaced with the system properties of the same name,
and any resulting fields that are expected to be non-string values will be converted into the expected field
type. Any value that cannot be converted will be reported as a problem.
Here's the grammar for the variables:
variable := '${' variableNames [ ':' defaultValue ] '}'
variableNames := variableName [ ',' variableNames ]
variableName := /* any characters except ',' and ':' and '}' */
defaultValue := /* any characters except '}' */
The value of each variableName is used to look up a System property via
System.getProperty(String). Note that the grammar allows specifying multiple variable names within
a single variable and optionally specifying a default value. The logic will process the multiple variable names
from left to right until an existing system property is found; if one is found, it will stop and will not attempt to
find values for the other variables.
For example, here is part of the earlier "DataRepository.json" file, except the "cacheConfiguration"
field value has been changed to include a variable:
DataRepository.json
{
    ...
    "storage" : {
        "cacheName" : "DataRepository",
        "cacheConfiguration" : "${application.home.location}/config/infinispan_configuration.xml",
        "transactionManagerLookup" : "org.infinispan.transaction.lookup.GenericTransactionManagerLookup",
        "binaryStorage" : {
            "type" : "file",
            "directory" : "${application.home.location}/binaries",
            "minimumBinarySizeInBytes" : "${application.min.binary.size:4096}"
        }
    },
    ...
}
Note how the "minimumBinarySizeInBytes" value is a string with the variable name; this works because
ModeShape (in this case) will attempt to autoconvert the variable's replacement and default values to an
integer, which is what the JSON Schema stipulates for the "minimumBinarySizeInBytes" field.
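To make the substitution rules concrete, here is a small self-contained sketch of a resolver that follows the grammar above: it tries each variable name from left to right against the system properties and falls back to the default value. It is an illustration only, not ModeShape's code. The class name VariableSketch is hypothetical, and leaving a variable untouched when nothing resolves and no default is given is an assumption of this sketch rather than documented ModeShape behavior.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of the variable grammar; not ModeShape's implementation. */
final class VariableSketch {

    // '${' variableNames [ ':' defaultValue ] '}'
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    static String resolve( String text ) {
        Matcher matcher = VARIABLE.matcher(text);
        StringBuffer result = new StringBuffer();
        while ( matcher.find() ) {
            String replacement = null;
            // Try the names from left to right and stop at the first system property that exists
            for ( String name : matcher.group(1).split(",") ) {
                if ( name.isEmpty() ) continue; // System.getProperty rejects empty keys
                replacement = System.getProperty(name);
                if ( replacement != null ) break;
            }
            if ( replacement == null ) replacement = matcher.group(2); // the default value, if any
            if ( replacement == null ) replacement = matcher.group(); // ASSUMPTION: leave it unresolved
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

A string field such as "${application.min.binary.size:4096}" resolves to "4096" when the system property is not set, and the result can then be converted with Integer.parseInt(...), which mirrors the conversion to the field type described above.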
Infinispan configuration file

Most of the time you'll probably want to explicitly define an Infinispan configuration file for your repository or
repositories. Infinispan provides a configuration reference that documents the structure of their XML files.
The following is an example of a configuration file referenced by the "cacheConfiguration" field of our
repository configuration in the previous section, and it defines a single cache named "DataRepository",
matching the "cacheName" field of that repository configuration:
infinispan_configuration.xml
<infinispan>
    <global>
    </global>
    <default>
        <!-- Defines the default behavior for all caches, including those created dynamically (e.g.,
             when a repository uses a cache that doesn't exist in this configuration). -->
    </default>
    <namedCache name="DataRepository">
        <!-- Our Infinispan cache needs to be transactional. However, we'll also configure it to
             use pessimistic locking, which is required whenever applications will be concurrently
             updating nodes within the same process. If you're not sure, use pessimistic locking. -->
        <transaction lockingMode="PESSIMISTIC"/>
        <!-- Define the cache loaders (i.e., cache stores). Passivation is false because we want *all*
             data to be persisted, not just what doesn't fit into memory. Shared is false because there
             are no other caches sharing this file store. We set preload to false for lazy loading;
             may be improved by preloading and configuring eviction.
             We can have multiple cache loaders, which get chained. But we'll define just one. -->
        <loaders passivation="false" shared="false" preload="false">
            <!-- The 'fetchPersistentState' attribute applies when this cache joins the cluster; the
                 value doesn't really matter to us in this case. See the documentation for more options. -->
            <loader class="org.infinispan.loaders.file.FileCacheStore"
                    fetchPersistentState="false"
                    purgeOnStartup="false">
                <properties>
                    <property name="location" value="DataRepository/storage"/>
                </properties>
                <!-- We could use "write-behind", which actually writes to the file system asynchronously,
                     which can improve performance as seen by the JCR client.
                     Plus changes are coalesced, meaning that if multiple changes are enqueued for the
                     same node, only the last one is written. (This is good much of the time, but not
                     always.)
                <async enabled="true" flushLockTimeout="15000" threadPoolSize="5"/>
                -->
            </loader>
        </loaders>
    </namedCache>
</infinispan>
Clustering
Clustering a repository that is embedded into a Java application simply requires ensuring both ModeShape
and Infinispan are clustered properly. Some things to consider are:
1. Where will the persisted content be stored, and will that persisted data be shared? For example, the
repository's binary store should be shared amongst all processes in the cluster (e.g., they all
access/use the same file system, JDBC/MongoDB/Cassandra database, Infinispan cache), but the
Infinispan cache(s) used by a repository can either share the same storage or have their own
independent copy. Be sure to read more about clustering topologies.
2. How will the different processes communicate? This communication is all via JGroups, so a proper
JGroups configuration is essential.
3. How frequently will processes be added and removed from the cluster?
There are three areas of a repository's configuration that are related to clustering.
Using variables in the ModeShape and Infinispan configuration files is a very good practice
because it allows your application to use a single set of configuration files throughout the cluster.
Consider using a variable such as "${cluster-id}" to represent the unique identifier of the
process within the cluster. Just be sure to set the value of each variable in the system properties;
ModeShape does not provide any built-in variables.
Cluster name and JGroups

The "clustering" section of the repository JSON configuration file specifies the name of the cluster and
the JGroups configuration. Here is an example of this section:
"clustering" : {
"clusterName" : "my-repo-cluster",
"channelConfiguration" : "config/jgroups-config.xml"
}
The ModeShape repository will only act in a clustered way if this section is defined.
Each ModeShape repository must be clustered independently of the other repositories deployed to the same
processes, so be sure that the "clusterName" value is unique. Even though there is a default value (e.g.,
"ModeShape-JCR"), it is far better to explicitly set this value.
The "channelConfiguration" field defines the path to the JGroups configuration file. If this field is
absent, then the repository will use the default JGroups configuration, which may or may not work
out-of-the-box on your network. Here is a sample JGroups configuration file we use in some of our tests:
<config xmlns="urn:org:jgroups"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-3.1.xsd">
<UDP
tos="8"
ucast_recv_buf_size="20M"
ucast_send_buf_size="640K"
mcast_recv_buf_size="25M"
mcast_send_buf_size="640K"
loopback="true"
max_bundle_size="64K"
max_bundle_timeout="30"
enable_bundling="true"
enable_diagnostics="true"
thread_naming_pattern="cl"
timer_type="old"
timer.min_threads="4"
timer.max_threads="10"
timer.keep_alive_time="3000"
timer.queue_max_size="500"
thread_pool.enabled="true"
thread_pool.min_threads="2"
thread_pool.max_threads="8"
thread_pool.keep_alive_time="5000"
thread_pool.queue_enabled="true"
thread_pool.queue_max_size="10000"
thread_pool.rejection_policy="discard"
oob_thread_pool.enabled="true"
oob_thread_pool.min_threads="1"
oob_thread_pool.max_threads="8"
oob_thread_pool.keep_alive_time="5000"
oob_thread_pool.queue_enabled="false"
oob_thread_pool.queue_max_size="100"
oob_thread_pool.rejection_policy="Run"/>
<PING timeout="2000"
num_initial_members="20"/>
<MERGE2 max_interval="30000"
min_interval="10000"/>
<FD_SOCK/>
<FD_ALL/>
<VERIFY_SUSPECT timeout="1500" />
<BARRIER />
<pbcast.NAKACK2 xmit_interval="1000"
xmit_table_num_rows="100"
xmit_table_msgs_per_row="2000"
xmit_table_max_compaction_time="30000"
max_msg_batch_size="500"
use_mcast_xmit="false"
discard_delivered_msgs="true"/>
<UNICAST xmit_interval="2000"
xmit_table_num_rows="100"
xmit_table_msgs_per_row="2000"
xmit_table_max_compaction_time="60000"
conn_expiry_timeout="60000"
max_msg_batch_size="500"/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
max_bytes="4M"/>
<pbcast.GMS print_local_addr="true" join_timeout="3000"
view_bundling="true"/>
<UFC max_credits="2M"
min_threshold="0.4"/>
<MFC max_credits="2M"
min_threshold="0.4"/>
<FRAG2 frag_size="60K" />
<RSVP resend_interval="2000" timeout="10000"/>
<pbcast.STATE_TRANSFER />

</config>
A third field in the "clustering" section of the repository's JSON configuration is the "channelProvider"
field, which specifies the fully-qualified name of an
org.modeshape.jcr.clustering.ChannelProvider implementation class. The purpose of this class
is to return a "JChannel" instance, and the default implementation does this by reading the aforementioned
"channelConfiguration" field and setting up JGroups. If you require a different way of configuring and
acquiring the JGroups channel, simply implement your own ChannelProvider and tell the repository about
it with the "channelProvider" field. Most applications don't need to worry about this field.
Storage
The "storage" section of the repository's JSON configuration file defines the Infinispan cache configuration
as well as the binary storage, and both need to be properly configured for clustering. Here's an example:
"storage" : {
"cacheName" : "persistentRepository",
"cacheConfiguration" : "infinispan.xml",
"binaryStorage":{
"type":"file",
"directory":"storage/binaries",
"minimumBinarySizeInBytes":4096
}
}
where the "infinispan.xml" file for our example is:

<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<global>
<globalJmxStatistics enabled="false" allowDuplicateDomains="true"/>
<transport clusterName="my-repo-cluster">
<properties>
<property name="configurationFile" value="config/jgroups-config.xml" />
</properties>
</transport>
</global>
<namedCache name="persistentRepository">
<eviction strategy="LIRS" maxEntries="600"/>

<clustering mode="replication">
<sync />
</clustering>
<transaction
lockingMode="PESSIMISTIC" />
<loaders passivation="false"
shared="false"
preload="false">
<loader class="org.infinispan.loaders.file.FileCacheStore"
fetchPersistentState="true"
purgeOnStartup="true">
<properties>
<property name="location" value="storage/repository_${cluster-id}/store" />
</properties>
</loader>
</loaders>
</namedCache>
</infinispan>
There are a couple of things to note:
1. The "${cluster-id}" variable is used in file system paths in the Infinispan configuration. If it is
set to a unique value for each process via a system or environment variable, then each process will
have its own separate cache storage area on the file system. Since the cache is replicated, each
directory will contain a complete copy of all the content stored in this cache. (Note: if you want the
processes to share the same store, be sure that the cache store implementation supports it. For
example, the FileCacheStore should never be used as a shared store.)
2. All processes store the binaries in the "storage/binaries" directory. Using the file system may not
work under heavy load, so in such cases you may consider using a database or Infinispan for binary
storage.
3. The ModeShape repository and Infinispan cache use the same JGroups configuration. This is
perfectly fine and in fact suggested. Also, it is perfectly acceptable for both to use the same cluster
name, since they send and expect different information through the channel.
Indexes
The "query" section of the ModeShape repository JSON configuration file defines how it is to create and
manage the indexes used for querying. And configuring the indexes properly is an essential part of clustering
ModeShape. Unfortunately, clustering the indexes can be a little tricky.
If your application does not use queries at all, disable the query and indexing system altogether by
setting "enabled" to false. This will make clustering your repository significantly easier.
One option for clustering indexes is to configure one of the processes to be the master index writer and all
others as slaves, and then use a (durable) JMS queue to forward all update requests to the master process.
The indexes will be immediately updated on the master process, and then periodically copied to the slaves
on a schedule that you decide. The advantages of this approach are that it minimizes the number of writes
(since the indexes are updated only once for each change in content), and that additional slave
processes automatically copy the indexes from the master when they are brought up. The disadvantages are
that this is more complicated to set up and maintain, each process needs access to a shared network file
system, and queries on the slaves may show different results than the same query executed at the same time
on the master because the slave indexes are updated only periodically.
Another approach is to have each process maintain its own completely isolated copy of the indexes. This
does tend to increase overall CPU load, since all processes are updating their own indexes for every change
made in the repository. It also makes it a bit more difficult to add or remove processes from the cluster, since
each process' index must be populated either by copying the indexes from another process or by rebuilding
the indexes locally. But this approach is far easier to configure and maintain, makes the processes much
more independent, and makes it more likely that the query results on each of the processes are the same.
Here is an example of the "query" section of the ModeShape repository configuration that uses the latter
technique:
"query":{
"enabled":true,
"indexing" : {
"rebuildOnStartup": {
"when": "if_missing"
}
},
"indexStorage": {
"type":"filesystem",
"location":"storage/repository_${cluster-id}/index",
"lockingStrategy":"simple",
"fileSystemAccessType":"auto"
}
}
This enables indexing and ensures that the indexes are completely rebuilt if they are absent when the process
starts up. The indexes themselves are stored on the file system in a directory that uses the same
"${cluster-id}" variable we used above. This configuration is also very simple and straightforward.
Here is an example of a "query" section for use in the configuration for the repository acting as the master:
"query":{
    "enabled": true,
    "indexing" : {
        "backend" : {
            "type" : "jgroups-master",
            "channelName" : "modeshape-indexing"
        }
    },
    "indexStorage": {
        "type":"filesystem-master",
        "sourceLocation":"storage/clustered/master_indexes/",
        "location": "storage/repository_${cluster-id}/indexes/",
        "refreshInSeconds" : 1
    }
}
and a corresponding example of a "query" section for use in the configuration for the repositories acting as
the slaves:
"query":{
    "enabled": true,
    "rebuildUponStartup":"if_missing",
    "indexing" : {
        "backend" : {
            "type" : "jgroups-slave",
            "channelName" : "modeshape-indexing"
        }
    },
    "indexStorage": {
        "type":"filesystem-slave",
        "sourceLocation":"storage/clustered/master_indexes/",
        "location": "storage/repository_${cluster-id}/indexes/",
        "refreshInSeconds" : 1
    }
}
Note that there are several differences between the master and slave configurations, but again variables
could be used to reduce it down to a single configuration file.
3.3 ModeShape and JBoss AS7 and EAP

JBoss Application Server 7 is a blazingly fast, lightweight, cluster-ready application server that you can run
locally for development, in your company's data center for testing and/or production, or even in the cloud.
And with ModeShape's kit for AS7, you can have AS7 manage your repositories while making it trivially easy
for your applications to use the JCR API.
JBoss Enterprise Application Platform (EAP) is the productized version of JBoss Application Server (AS). It's
the culmination of thousands of hours of QE, bug fixes, and coordination to make sure you can be as productive
as possible building your apps. You can use it free of charge for development, but production requires a
subscription that entitles you to lots of benefits. For more details, see the JBoss app server FAQs.
A few of the benefits of running ModeShape within AS7/EAP6.1 are:
- The application server has a small footprint, starts in just a few seconds, has a lot of tooling, and is of
  course open source. Developing applications with it is a breeze.
- The module classloading system means that your applications only need to see those APIs that are
  necessary, and applications and services can use different versions of the same libraries.
- Repositories are configured centrally within the AS7/EAP configuration, and make use of the existing
  Infinispan configuration and management support.
- Add or remove repositories individually, without impacting the other repositories or their users
- Change some aspects of a repository configuration while it is running and being used
- A single repository can be used by multiple deployed applications and services
- A single deployed application or service can use multiple repositories
- Applications and services only see the JCR API and public ModeShape APIs
3.3.1 JBoss AS7 or EAP?

If you're using a recent version of ModeShape 3, you'll need to use EAP 6.1. The primary reason is that EAP
6.1 ships with a modern and updated version of Infinispan, and we know that ModeShape and EAP 6.1 work
quite well together. AS7.1.1 is quite old and includes an older version of Infinispan that has quite a few
issues, especially around clustering.
The table below shows the compatibility:
Server             ModeShape version
JBoss AS7.1.1      ModeShape 3.0 - 3.1
JBoss EAP 6.1.x    ModeShape 3.2 and later

For more information about why we switched from AS7.1.1 to EAP6.1, please see our blog post.
The JBoss AS project has been renamed to Wildfly, and they are already pushing out some early
releases of Wildfly 8. We are watching this with great interest, and hope to soon add a second
ModeShape kit that can be installed on top of Wildfly.
3.3.2 Getting started

You can start using ModeShape with just a few steps:
1. Install JBoss AS 7.1 or EAP 6.1 and ModeShape
2. Configure your repositories
3. Write and deploy applications that use the JCR API, ModeShape's RESTful API or WebDAV API, or
JDBC API
3.3.3 Installing ModeShape into EAP

ModeShape can be installed into an existing JBoss EAP 6.1. If you already have one, then you also have the
$JBOSS_HOME environment variable set to the installation location.
Download and install EAP6
Download and install ModeShape
Run EAP with the sample configuration in standalone mode
Run EAP with the sample configuration in clustered standalone mode
Run EAP with the sample configuration in domain mode
Next steps
Download and install EAP6

As of ModeShape 3.2, ModeShape supports JBoss EAP 6.1, whereas earlier versions of ModeShape
supported AS7.1.1 (see section 3.3.1 for the reasons). Installation is basically the same for both:
If you don't already have an EAP installation, download the JBoss EAP 6.1.x archive and unzip it into a
known location (e.g., /apps/jboss-eap-6.1.1). Please refer to the EAP documentation Guide for
requirements and installation instructions, and be sure the "$JBOSS_HOME" environment variable is set
correctly to the path where EAP is installed.
If you need to run ModeShape 3.0.x or 3.1, then you will need to download JBoss AS7.1.1. Simply
download the archive and unzip it into a known location (e.g., /apps/jboss-7.1.1). Please refer
to the JBoss AS 7 Getting Started Guide for requirements and installation instructions, and be sure
the "$JBOSS_HOME" environment variable is set correctly to the path where AS7 is installed.
The rest of this documentation will refer to EAP, but AS7.1.1 will apply as well, though anything
specific to AS7.1.1 will be highlighted.
Download and install ModeShape

Download the latest ModeShape EAP kit (latest version is 3.7.2.Final). This kit is a ZIP archive that is
intended to be unzipped directly into the EAP installation. Doing so will not overwrite any of the files in the
standard EAP installation.
The directory structure of the ModeShape EAP kit is as follows:
/docs
    /schema
        modeshape_1_0.xsd
/domain
    /configuration
        domain-modeshape.xml
        modeshape-users.properties
        modeshape-roles.properties
/modules
    /javax/jcr/*
    /org/modeshape/*
    /org/hibernate/*
    /org/infinispan
    /org/apache/*
/standalone
    /configuration/
        standalone-modeshape.xml
        standalone-modeshape-ha.xml
As you can see, the kit installs the "modeshape_1_0.xsd" file into the existing /docs/schema
directory, which is where all of the XML schemas for the application server subsystems are defined. The
modeshape_1_0.xsd schema defines the entire structure of the ModeShape subsystem fragment in the
server's XML configurations.
The kit installs modules for the JCR API (e.g., the javax.jcr packages), for ModeShape and its
components, for the Hibernate Search engine, and for Apache Lucene. The latter two modules do not exist
in the standard EAP/AS7 installation, but ModeShape explicitly avoided using the "main" slot on these
modules in the off chance that you might have already defined such modules in your installation.
The standalone/configuration and domain/configuration folders contain two files that define the
users and roles for the default security domain used by ModeShape: modeshape-users.properties and
modeshape-roles.properties. You can edit these files to add users, or define a different security domain
that assigns to each user the ModeShape roles found in the modeshape-roles.properties file.
The kit also installs several out-of-the-box configuration files:
- standalone/configuration/standalone-modeshape.xml - contains 2 predefined
  repositories, "sample" & "artifacts", and should be used when running a single EAP/AS7
  instance.
- standalone/configuration/standalone-modeshape-ha.xml - contains the same "sample"
  & "artifacts" repositories but configured in clustered mode. This configuration should be used
  when using multiple AS7/EAP nodes in replicated clustered mode.
- domain/configuration/domain-modeshape.xml - contains the same configurations as in the
  standalone files, split up in several profiles: ha & full-ha contain the clustered repositories while full &
  default contain the standalone repositories.
Finally, the kit also deploys the "modeshape-rest", "modeshape-webdav" and "modeshape-cmis" web
applications via the above configuration files. The web applications allow users to interact with ModeShape
via the REST, WebDAV, and CMIS APIs, respectively. If you don't want any of these applications to be
deployed when the server starts up, comment out or remove the following configuration fragments:
<subsystem xmlns="urn:jboss:domain:modeshape:1.0">

<webapp name="modeshape-rest.war"/>
<webapp name="modeshape-webdav.war"/>
<webapp name="modeshape-cmis.war"/>
In addition to the default repository roles (admin, readwrite, and readonly), the role connect is
required if a user wants to access any of the web applications described above that are deployed in
AS7/EAP. Also, beware of spaces: do not put spaces before or after the commas that separate role names,
as that will create incorrect role names.
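To see why stray spaces matter, consider the following minimal Java sketch. It assumes, purely for illustration, that a comma-separated role list is split on commas without trimming; the RoleLineDemo class and its splitRoles method are hypothetical and are not the login module's actual parsing code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demonstration; not the code used by the application server.
public class RoleLineDemo {
    /** Splits a comma-separated role list exactly as written, without trimming whitespace. */
    public static List<String> splitRoles(String value) {
        List<String> roles = new ArrayList<String>();
        for (String role : value.split(",")) {
            roles.add(role);
        }
        return roles;
    }

    public static void main(String[] args) {
        // No spaces: every entry is a valid role name.
        System.out.println(splitRoles("admin,readwrite,connect"));
        // Spaces after the commas: the second and third entries start with a space,
        // so they no longer equal "readwrite" or "connect".
        List<String> withSpaces = splitRoles("admin, readwrite, connect");
        System.out.println(withSpaces.contains("connect"));
    }
}
```

Under that assumption, a user whose roles were written with spaces would effectively hold a role named " connect" rather than "connect", and would be denied access to the web applications.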
Run EAP with the sample configuration in standalone mode

At this point, you've installed ModeShape into your EAP installation, and it's ready to be used. Let's run the
server in standalone mode with the provided "standalone-modeshape.xml" configuration:
$ bin/standalone.sh -c standalone-modeshape.xml
The server process will output a few dozen lines of messages, including one that says something similar to:
...
13:56:42,126 INFO [org.jboss.as] (MSC service thread 1-16) JBAS015874: JBoss EAP 6.1.1.Final
started in 1957ms - Started 184 of 268 services (83 services are passive or on-demand)
At this point, the server is running and ready to accept requests at http://localhost:8080. If you want to see
the "sample" repository in action, use your browser to go to http://localhost:8080/modeshape-rest or
http://localhost:8080/modeshape-webdav to see the information about the running repositories.
The resulting page is a JSON file that is difficult to read without formatting it in an editor. At this
point, the content of the response is not important, but it simply shows that ModeShape is running
with a "sample" repository.
To stop the server, simply hit CTRL-C in the terminal where AS7/EAP is running, and it will shut down
immediately but gracefully.
Run EAP with the sample configuration in clustered standalone mode

If you want to check out how a ModeShape repository can be clustered, you can start a couple of AS7/EAP
nodes locally using the "standalone-modeshape-ha.xml" configuration file:
$ bin/standalone.sh -c standalone-modeshape-ha.xml -Djboss.node.name=node1
$ bin/standalone.sh -c standalone-modeshape-ha.xml -Djboss.node.name=node2 -Djboss.socket.binding.port-offset=100
As you can see, if you want to run several clustered nodes locally, there are 2 important properties you need
to provide:
- jboss.node.name - a symbolic name of the running AS7/EAP instance. This is required by JGroups
  to properly form the cluster members and will also be used by ModeShape when storing
  repository-related data
- jboss.socket.binding.port-offset - a numeric value which will be used as an offset value for
  the various socket ports each instance opens. For example, by default the HTTP socket listens on
  port 8080. When passing the value 100, that AS7/EAP node will listen for HTTP connections on port
  8180.
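The port arithmetic described above can be sketched in a few lines of Java. The PortOffset class is a hypothetical helper written for this example, and the range check is an added safeguard of the sketch rather than documented server behavior.

```java
// Hypothetical helper for illustration; not part of AS7/EAP or ModeShape.
public class PortOffset {
    /** Returns the port an instance listens on once the offset has been added to the default port. */
    public static int effectivePort(int defaultPort, int offset) {
        int port = defaultPort + offset;
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        return port;
    }

    public static void main(String[] args) {
        // HTTP port of the second node started with -Djboss.socket.binding.port-offset=100
        System.out.println(effectivePort(8080, 100)); // prints 8180
        // The same offset applies to the management port used by the CLI (9999 by default)
        System.out.println(effectivePort(9999, 100)); // prints 10099
    }
}
```

Because the offset shifts every socket binding, tools that connect to the second node (a browser, the CLI) need to use the shifted ports as well.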
Run EAP with the sample configuration in domain mode

You can also run several ModeShape nodes (clustered or non-clustered) on different machines using the
domain mode feature of AS7/EAP. Via the "domain-modeshape.xml" sample configuration file, you can
start a domain controller with ModeShape pre-configured for each of the default domain profiles: ha,
full-ha, full & default.
For more information about AS7/EAP domain mode, see:
https://docs.jboss.org/author/display/AS71/AS7+Cluster+Howto and
http://blog.akquinet.de/2012/06/29/managing-cluster-nodes-in-domain-mode-of-jboss-as-7-eap-6
Next steps
Feel free to play with the sample repository through the REST API. However, since this is really just a
sample configuration, we'll next see how to add ModeShape to other EAP configurations.
3.3.4 Configuring ModeShape in EAP

ModeShape is not included with JBoss EAP, so none of the standard configurations include any ModeShape
configurations. However, once ModeShape is installed into EAP, it's easy to add ModeShape to any AS7
configuration.
If you need to run ModeShape 3.0.x or 3.1, then these instructions apply to configuring ModeShape
in AS7.1.1.
Step 1: Start the server

Step 2: Start the server's CLI
Step 3: Add the ModeShape subsystem
Step 4: Add a ModeShape repository
Step 4a: Add an Infinispan cache
Step 4b: Add a security domain
Step 4c: Add the repository
Advanced configuration
Advanced repository configuration
Add and remove sequencers
Specify where indexes are stored
Specify where large binary values are stored
Configuring a composite binary store
Add and remove authentication and authorization providers
Add JDBC Data Source
Add and remove external sources
Batch mode
Clustering configuration
Step 1: Start the server

Start your server in standalone mode with your favorite configuration. For example, the following starts with
the "standalone.xml" configuration file:
$ bin/standalone.sh -c=standalone.xml
Use the appropriate command for your OS. See the EAP documentation (or AS7 documentation)
for details.
Step 2: Start the server's CLI

JBoss EAP has a very nice low-level command line interface (CLI) tool that you can use to directly
manipulate the configuration of the running server. If the server is running in domain mode, the CLI will
immediately propagate the changes to all the servers.
Start the CLI and connect to your server:
$ ./bin/jboss-cli.sh
You are disconnected at the moment. Type 'connect' to connect to the server or 'help' for the
list of supported commands.
[disconnected /] connect
[standalone@localhost:9999 /]
Step 3: Add the ModeShape subsystem

ModeShape is installed, but the current configuration doesn't know about the ModeShape subsystem. So the
next step is to add it:
[standalone@localhost:9999 /] /extension=org.modeshape:add()
{"outcome" => "success"}
[standalone@localhost:9999 /] ./subsystem=modeshape:add
The configuration's XML file (in this case "standalone.xml") is updated immediately. Watch the
configuration file as you use the CLI.
Step 4: Add a ModeShape repository

We want to add a repository, but before we do that we need to add or configure the EAP resources that the
repository will use.
Step 4a: Add an Infinispan cache

Each ModeShape repository stores its content in an Infinispan cache. But let's put that cache in a new cache
container named "modeshape", which we can use for other repositories:
[standalone@localhost:9999 /] /subsystem=infinispan/cache-container=modeshape:add
The CLI supports tab-completion, so as you type out these paths try hitting tab to have the CLI fill
out as much as possible or show the available matches.
Now that we have our container, we'll define a local cache named "sample" (which by convention we'll
name the same as our repository, although this is not required) that uses non-XA transactions and persists
all content immediately to the "modeshape/store/sample" directory under the "standalone/data"
directory:
/subsystem=infinispan/cache-container=modeshape/local-cache=sample:add
/subsystem=infinispan/cache-container=modeshape/local-cache=sample/transaction=TRANSACTION:add(mode=NON_XA)
{
"outcome" => "success",
"response-headers" => {
"operation-requires-reload" => true,
"process-state" => "reload-required"
}
}
/subsystem=infinispan/cache-container=modeshape/local-cache=sample/file-store=FILE_STORE:add(path="modeshape/s
"response-headers" => {"process-state" => "reload-required"}
}
Most of the commands we've issued so far resulted in a simple successful outcome. The last two, however,
were successful but apparently require a reload of the Infinispan service. That basically means that our
changes were saved to the configuration, but the last two changes won't take effect until the next restart or
until we explicitly perform the reload (which we'll do twice):
[standalone@localhost:9999 /] :reload
{
}
Step 4b: Add a security domain

ModeShape can use any security domain, as long as the domain defines the correct ModeShape roles for
each user. Since none of the out-of-the-box security domains includes these roles, let's create our own
security domain that uses the modeshape-users.properties and modeshape-roles.properties
files included in the "module/org/modeshape/main/conf" directory:
./subsystem=security/security-domain=modeshape-security:add(cache-type=default)
./subsystem=security/security-domain=modeshape-security/authentication=classic:add(login-modules=[{"code"=>"Us
"response-headers" => {
"operation-requires-reload" => true,
"process-state" => "reload-required"
}
}
Again, we need to reload the services for these changes to take effect:
[standalone@localhost:9999 /] :reload
{
}
JBoss EAP supports multiple kinds of security domains, including integration with LDAP and even
single sign-on using the local OS. Consult the JBoss AS documentation for details.
Step 4c: Add the repository

Now that we've finished defining the services that our repository will use, we can define our ModeShape
"sample" repository:
./subsystem=modeshape/repository=sample:add(security-domain="modeshape-security",cache-name="sample",cache-con
=> "success"}
This command configures the repository to use the "sample" Infinispan cache in the "modeshape" cache
container, and to use the "modeshape-security" security domain we created earlier.
We actually didn't need to define the security-domain="modeshape-security" attribute
because the repository would use a security domain with that name by default. Also, by default the
repository will try to use an Infinispan cache with the same name as the repository in the cache
container named "modeshape". Specifying them doesn't hurt, but any attributes that match the
default value will not be serialized to the XML configuration file.
Note that defining a repository doesn't require restart. In fact, quite a few of the ModeShape administrative
operations can take effect immediately, even when applications are actively using the repository.
Advanced configuration
With just a few commands, you can add a repository that will persist content locally. However, more
advanced configuration options are available.
Advanced repository configuration

At any point, we can see the complete definition of a repository:
/subsystem=modeshape/repository=sample:read-resource(recursive=true)
{
"result" => {
"allow-workspace-creation" => true,
"anonymous-roles" => undefined,
"anonymous-username" => "<anonymous>",
"binary-storage" => undefined,
"cache-container" => "modeshape",
"cache-name" => "sample",
"cluster-name" => undefined,
"cluster-stack" => undefined,
"default-workspace" => "default",
"enable-monitoring" => true,
"index-storage" => undefined,
"indexing-analyzer-classname" => "org.apache.lucene.analysis.standard.StandardAnalyzer",
"indexing-analyzer-module" => undefined,
"indexing-async-max-queue-size" => 0,
"indexing-async-thread-pool-size" => 1,
"indexing-batch-size" => -1,
"indexing-mode" => "SYNC",
"indexing-reader-strategy" => "SHARED",
"indexing-thread-pool" => "modeshape-workers",
"jndi-name" => undefined,
"minimum-binary-size" => 4096,
"predefined-workspace-names" => undefined,
"rebuild-indexes-upon-startup" => "IF_MISSING",
"security-domain" => "modeshape-security",
"sequencer" => undefined,
"use-anonymous-upon-failed-authentication" => false
}
}
This shows all of the attributes, including those that are not set or set to their default values. To see more
detail about each attribute and child, use the ":read-resource-description()" command:
/subsystem=modeshape/repository=sample:read-resource-description(recursive=true)
{
"result" => {
"description" => "ModeShape repository",
"attributes" => {
"cache-name" => {
"type" => STRING,
"description" => "The name of the Infinispan cache that is to be used for
storing this repository's content",
"expressions-allowed" => false,
"nillable" => true,
"min-length" => 1L,
"max-length" => 2147483647L,
"access-type" => "read-write",
"storage" => "configuration",
"restart-required" => "resource-services"
},
"cache-container" => {
"type" => STRING,
"description" => "The name of the Infinispan cache container that contains the
cache to be used for storing this repoitory's content",
"nillable" => true,
"min-length" => 1L,
"max-length" => 2147483647L,
},
"jndi-name" => {
"type" => STRING,
"description" => "The optional alias in JNDI where this repository is to be
registered, in addition to 'jcr/{repositoryName}",
"nillable" => true,
"min-length" => 1L,
"max-length" => 2147483647L,
},
...
We didn't show all of the output, since it's quite long. But each attribute is described and shows the criteria
for valid values, whether expressions (e.g., system variables) are allowed, and whether a restart will be
required before changes take effect.
Most of the attributes do have defaults, but some of these defaults are not listed in the descriptions because
the defaults are functions of other attributes. For example, every repository is registered in JNDI under
"jcr/repositoryName", and also under the JNDI name explicitly set with the "jndi-name" attribute.
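As a small illustration of a default that depends on other attributes, the Java sketch below computes the JNDI names under which a repository would be registered. The RepositoryJndiNames class is hypothetical, and the alias value used in main is an invented example, not a name defined by ModeShape.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the ModeShape API.
public class RepositoryJndiNames {
    /** Returns the fixed "jcr/" name first, followed by the optional "jndi-name" alias if one is set. */
    public static List<String> namesFor(String repositoryName, String jndiNameAttribute) {
        List<String> names = new ArrayList<String>();
        names.add("jcr/" + repositoryName);
        if (jndiNameAttribute != null && !jndiNameAttribute.isEmpty()) {
            names.add(jndiNameAttribute);
        }
        return names;
    }

    public static void main(String[] args) {
        // No alias configured: only the default name is used.
        System.out.println(namesFor("sample", null));
        // Alias configured (the alias value here is an invented example).
        System.out.println(namesFor("sample", "jcr/local/sample-alias"));
    }
}
```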
The following table contains the list of all the attributes for a repository:
Attribute Name
    Description

allow-workspace-creation
    Specifies whether authenticated and authorized JCR users c
    additional workspaces beyond the predefined, system, and d
    workspaces. The default value is 'true'. Set this to 'false' w
    workspaces are to be fixed.

anonymous-roles
    The list of string names of the roles for all anonymous users.
    Anonymous logins will be disabled if the roles consists of an
    string. By default, anonymous users are given all roles: 'conn
    readonly', 'readwrite', and 'admin'.

anonymous-username
    The username for all anonymous users. The username '<anonymous>'
    is used by default (that is, 'anonymous' surro
    angle brackets").

cache-container
    The name of the Infinispan cache container in which the cach
    found. If not provided, the "modeshape" cache container will

cache-name
    The name of the Infinispan cache where repository content w
    stored. If not provided, the repository name is used for the cac
    name.

cluster-name
    Defines the name of the communication channel used to sha
    amongst all repository instances in the cluster. By default the
    value, meaning the repository is not participating in a cluster.

cluster-stack
    Specifies the name of the JGroups stack used by the reposito
    create a channel for events when the repository is clustered.
    default there is no value, meaning the repository is not partici
    a cluster.

default-workspace
    The name of the workspace that should be used when Sessio
    created without specifying an explicit workspace name. By de
    "default" workspace name is used.

enable-monitoring
    Specifies whether the repository is to maintain the metrics tha
    used to monitor the performance and activities. The default v
    'true', meaning monitoring is enabled.

indexing-analyzer-classname
    The fully-qualified name of the Lucene analyzer implementati
    The default value is
    'org.apache.lucene.analysis.standard.StandardAnalyzer'.

indexing-analyzer-module
    The name of the module that contains the specified analyzer
    value is specified by default, which means the class is visible
    ModeShape engine (e.g., or any of its transitive dependencie
indexing-async-max-queue-size
    The maximum size of the queue used for asynchronous inde
    default the value is '0'. The value is ignored if synchronous in
    enabled.

indexing-async-thread-pool-size
    The size of the thread pool used for asynchronous indexing.
    default the value is '1'. The value is ignored if synchronous in
    enabled.

indexing-batch-size
    The size of the indexing batches. The default value is -1, whi
    the batch sizes are unlimited.

indexing-mode
    The concurrency mode for indexing. The value must be eithe
    or 'ASYNC'.

indexing-reader-strategy
    The strategy for sharing (or not sharing) index readers. The v
    must be either 'SHARED' or 'NOT_SHARED'.

indexing-thread-pool
    The name of the thread pool that the repository indexing syst
    should use. By default, the value is 'modeshape-workers'.

jndi-name
    The repository will always be bound in JNDI to the name
    'jcr/{repositoryName}', but this attribute can be used to s
    an additional location in JNDI where the repository is to be re

minimum-binary-size
    The size threshold that dictates whether String and binary va
    should be stored in the binary store. String and binary values
    than this value are stored with the node, whereas string and b
    values with a size equal to or greater than this limit will be sto
    separately from the node and in the binary store, keyed by th
    hash of the value. This is a space and performance optimizat
    stores each unique large value only once. The default value i
    bytes, or 4 kilobytes.

predefined-workspace-names
    The names of the workspaces that the repository will ensure
    create if necessary) when the repository starts up.

rebuild-indexes-upon-startup
    Specifies whether the indexes need to be rebuilt immediately
    each ModeShape process starts up. Must be one of 'IF_MIS
    ALWAYS' or 'NEVER. By default it is 'IF_MISSING'

rebuild-upon-startup-mode
    Specifies the way in which index rebuilding at startup should
    performed: synchronously and asynchronously. Must be one
    or 'ASYNC'. By default it is 'SYNC'

rebuild-upon-startup-include-system-content
    Specifies if, when rebuilding indexes at startup, the system co
    area (the nodes below /jcr:system) should be indexed or
    default it is 'FALSE'

security-domain
    The name of the security domain that should be used for JAA
    authentication. The default is 'modeshape-security'
use-anonymous-upon-failed-authentication
Indicates that failed authentication attempts will not result in a

javax.jcr.LoginException but will instead fall back to
anonymous access. If anonymous access is not enabled, the
login attempts will throw a LoginException. The default value

false'.
default-initial-content
The file which should be treated as the default initial content

into all workspace. See initial content for more information
workspaces-initial-content
A set of (workspaceName, initial content file) pairs, which def
custom initial content files for each workspace. See initial con
more information
node-types
A sequence of node-type elements, where the value of eac
element represents a path to a CND file which should be imp
repository startup. See registering custom node types for mo

information
external-sources
A sequence of source elements, where each element conta

definition of an external source
At this point, any deployed application can use the repository (see the relevant chapter for details).
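The minimum-binary-size attribute in the table above describes a simple threshold rule, which the following sketch illustrates. It is not ModeShape's storage code: the class name, the in-memory map standing in for the binary store, and the choice of SHA-1 as the hash are all assumptions made for illustration. Only the 4 kilobyte threshold and the "smaller stays with the node, equal or larger goes to the binary store keyed by hash" rule come from the table.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: mimics the minimum-binary-size decision, not ModeShape's real storage code. */
final class BinaryThresholdSketch {
    // 4 kilobytes, the default size described in the table.
    static final int MINIMUM_BINARY_SIZE = 4096;

    // Hypothetical stand-in for the binary store: one entry per unique hash.
    private final Map<String, byte[]> binaryStore = new HashMap<String, byte[]>();

    /** Returns the hash key if the value went to the binary store, or null if it stays with the node. */
    String store(byte[] value) {
        if (value.length < MINIMUM_BINARY_SIZE) {
            return null; // small value: kept inline with the node
        }
        String key = sha1Hex(value);
        if (!binaryStore.containsKey(key)) {
            binaryStore.put(key, value); // identical large values share one entry
        }
        return key;
    }

    int storedCount() {
        return binaryStore.size();
    }

    // SHA-1 is an assumption here; the table only says "hash of the value".
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        BinaryThresholdSketch sketch = new BinaryThresholdSketch();
        byte[] small = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] large = new byte[MINIMUM_BINARY_SIZE];
        System.out.println("small stored inline: " + (sketch.store(small) == null));
        System.out.println("large key: " + sketch.store(large));
        sketch.store(large.clone()); // same content, so no second entry is created
        System.out.println("entries in binary store: " + sketch.storedCount());
    }
}
```

Storing the same large value twice leaves a single entry in the stand-in store, which is the space optimization the table refers to.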
Add and remove sequencers

You can use the CLI to dynamically add and remove sequencers. Here's an example that adds to the "sample"
repository a sequencer that operates against comma-separated value (CSV) files uploaded under
the "/files" node:
/subsystem=modeshape/repository=sample/sequencer=delimited-text-sequencer:add(
classname="org.modeshape.sequencer.text.DelimitedTextSequencer",
module="org.modeshape.sequencer.text",
path-expressions=["/files(//*.csv[*])/jcr:content[@jcr:data] => /derived/text/delimited/$1"],
properties=[{ "splitPattern"=>"," }])
/subsystem=modeshape/repository=sample/sequencer=delimited-text-sequencer:read-resource()
{
"result" => {
"classname" => "org.modeshape.sequencer.text.DelimitedTextSequencer",
"module" => "org.modeshape.sequencer.text",
"path-expressions" => ["/files(//*.csv[*])/jcr:content[@jcr:data] =>
/derived/text/delimited/$1"],
"properties" => [{"splitPattern" => ","}]
}
}
Note how this particular sequencer has an additional "splitPattern" property that specifies the delimiter.
To remove a sequencer, simply invoke the "remove" operation on the appropriate item:
/subsystem=modeshape/repository=sample/sequencer=delimited-text-sequencer:remove()
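To make the role of the "splitPattern" property more concrete, here is a minimal sketch of splitting one line of delimited text into columns. It assumes the pattern is interpreted as a regular expression, which is not stated in the text above; the class and method names are hypothetical and this is not the sequencer's actual implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: shows how a delimiter pattern could split one line into columns. */
final class SplitPatternSketch {

    /** Splits a line using the given pattern (assumed to be a regular expression). */
    static List<String> splitLine(String line, String splitPattern) {
        // A negative limit keeps trailing empty columns instead of dropping them.
        return Arrays.asList(Pattern.compile(splitPattern).split(line, -1));
    }

    public static void main(String[] args) {
        System.out.println(splitLine("a,b,,d", ","));   // comma-separated values
        System.out.println(splitLine("x|y|z", "\\|"));  // a pipe must be escaped in a regular expression
    }
}
```

Under that assumption, switching the sequencer to another delimiter is only a matter of supplying a different "splitPattern" value in the "properties" list.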
Specify where indexes are stored

To specify index storage, you first need to add the index storage resource to your configuration:
/subsystem=modeshape/repository=sample/configuration=index-storage:add()
Once the index storage node is added, you can add the storage type with required/optional parameters:
/subsystem=modeshape/repository=sample/configuration=index-storage/storage-type=master-file-index-storage:add(
path=/somepath, source-path=/someotherpath)
Specify where large binary values are stored

In the same way you specify index storage above, you first need to add the binary storage resource to your
configuration:
/subsystem=modeshape/repository=sample/configuration=binary-storage:add()
Once the binary storage node is added, you can add the storage type with required/optional parameters:
/subsystem=modeshape/repository=sample/configuration=binary-storage/storage-type=file-binary-storage:add(path=
=> "success"}
Configuring a composite binary store

Composite binary stores are different from the rest of the standard binary stores, because they can
aggregate any number of standard binary stores. Therefore configuring them via CLI is a bit different.
First, you need to configure the composite binary store in similar fashion to any other binary store:
/subsystem=modeshape/repository=sample/configuration=binary-storage/storage-type=composite-binary-storage:add(
After this, you need to make sure that each nested store has a store-name property which is unique within
the composite store and that the appropriate resource-container is used when adding the store.
Corresponding to each of the standard binary stores, the following resource-containers are available:
nested-storage-type-file - for file system binary stores
nested-storage-type-cache - for cache binary stores
nested-storage-type-db - for database binary stores
nested-storage-type-custom - for custom (user defined) binary stores
For example, if you wanted to add a file system binary store to a composite store, you would run:
/subsystem=modeshape/repository=sample/configuration=binary-storage/storage-type=composite-binary-storage/nest
path="/somepath")
If you wanted to remove this store, you would run:
/subsystem=modeshape/repository=sample/configuration=binary-storage/storage-type=composite-binary-storage/nest
Add and remove authentication and authorization providers

You can use the CLI to dynamically add and remove custom authentication and authorization providers. For
example, if your org.modeshape.jcr.security.AuthorizationProvider implementation were
named "org.example.MyAuthProvider" and were added to a new "org.example.auth" module, then
the following command would add this provider to the "sample" repository:
/subsystem=modeshape/repository=sample/authenticator=custom:add(classname="org.example.MyAuthProvider",
module="org.example.auth")
/subsystem=modeshape/repository=sample/authenticator=jaas:read-resource()
{
"result" => {
"classname" => "org.modeshape.jcr.security.JaasProvider",
"module" => "org.modeshape",
"properties" => undefined
}
}
To remove an authentication provider, simply invoke the "remove" operation on the appropriate item:
/subsystem=modeshape/repository=sample/authenticator=custom:remove()
ModeShape can set instance-level fields on the provider instances. For example, you might want to set the
"auth-domain" field on the MyAuthProvider instance to the String value "global". To do this, simply supply
the fields via the "properties" parameter (which is a list of documents that each contain a single name-value
pair). The following example sets two placeholder fields, "foo" and "baz":
/subsystem=modeshape/repository=sample/authenticator=custom:add(classname="org.example.MyAuthProvider",
module="org.example.auth", properties=[ {"foo"=>"bar"}, {"baz"=>"bam"} ] )
/subsystem=modeshape/repository=sample/authenticator=custom:read-resource()
{
"result" => {
"classname" => "org.example.MyAuthProvider",
"module" => "org.example.auth",
"properties" => [
{"foo" => "bar"},
{"baz" => "bam"}
]
}
}
Add JDBC Data Source

First, you need to add the driver:
/subsystem=datasources/jdbc-driver=modeshape-driver:add(driver-name="modeshape-driver",
driver-module-name="org.modeshape.jdbc", driver-class-name="org.modeshape.jdbc.LocalJcrDriver")
Then, the actual datasource:
/subsystem=datasources/data-source="java:/datasources/ModeShapeDS":add(jndi-name="java:/datasources/ModeShapeD
=> "success"}
Add and remove external sources

To enable federation, one or more external sources can be added to an existing repository.
The following example shows how to use the CLI to link an external file system source (via the
FileSystemConnector) to the sample repository:
/subsystem=modeshape/repository=sample/source=fsSource:add(classname="org.modeshape.connector.filesystem.FileS
readonly="true",
projections=["default:/projection1 => /"], cacheTtlSeconds="1")
/subsystem=modeshape/repository=sample/source=fsSource:read-resource()
{
"result" => {
"cacheTtlSeconds" => "1",
"classname" => "org.modeshape.connector.filesystem.FileSystemConnector",
"module" => undefined,
"projections" => ["default:/projection1 => /"],
"properties" => [{"directoryPath" => "."}],
"queryable" => undefined,
"readonly" => "true"
}
}
Notice that there are several attributes that can be specified when adding an external source:
classname (mandatory) - the fully qualified name of the Connector class which allows content to
be retrieved and written to that external source
module (optional) - the name of the EAP module where the above class can be found
projections (optional) - a list of projection expressions representing predefined projection paths for
the source; projections can either be defined here or programmatically using the
FederationManager.createProjection(...) method.
queryable (optional) - a flag indicating if the content exposed from the external source should be
indexed by the repository or not. By default, all content is queryable.
readonly (optional) - a flag indicating if only reads or both reads and writes are possible on the
source
cacheTtlSeconds (optional) - the number of seconds any given node from the external source is to
be held in the cache of the corresponding workspace
properties (optional) - an array of key-value pairs which allow any custom attributes to be passed
down to the Connector implementation class.
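The projection expression used in the example above ("default:/projection1 => /") has three parts: a workspace name, a path inside that workspace, and a path in the external source. The sketch below pulls those parts out with a regular expression. It is an illustration based only on the form of that one example; the class name and the regular expression are assumptions, and paths containing whitespace are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: splits "workspace:/projectedPath => /externalPath" into its three parts. */
final class ProjectionExpressionSketch {
    // Assumed shape, inferred from the example "default:/projection1 => /".
    private static final Pattern FORMAT =
            Pattern.compile("^\\s*([^:]+):(/\\S*)\\s*=>\\s*(/\\S*)\\s*$");

    final String workspaceName;
    final String projectedPath;
    final String externalPath;

    private ProjectionExpressionSketch(String workspaceName, String projectedPath, String externalPath) {
        this.workspaceName = workspaceName;
        this.projectedPath = projectedPath;
        this.externalPath = externalPath;
    }

    static ProjectionExpressionSketch parse(String expression) {
        Matcher matcher = FORMAT.matcher(expression);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a projection expression: " + expression);
        }
        return new ProjectionExpressionSketch(matcher.group(1).trim(), matcher.group(2), matcher.group(3));
    }

    public static void main(String[] args) {
        ProjectionExpressionSketch p = parse("default:/projection1 => /");
        System.out.println("workspace = " + p.workspaceName);
        System.out.println("projected = " + p.projectedPath);
        System.out.println("external  = " + p.externalPath);
    }
}
```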
To remove an external source, just invoke remove on that source:
[standalone@localhost:9999 /] /subsystem=modeshape/repository=sample/source=fsSource:remove()
Batch mode
You can combine all these commands (except for the initial /extension=org.modeshape:add()
command) into a batch operation:
[standalone@localhost:9999 /] /extension=org.modeshape:add()
[standalone@localhost:9999 /] batch
[standalone@localhost:9999 / #] (paste the commands here)
[standalone@localhost:9999 / #] run-batch
The batch executed successfully.
Batches can be stashed or edited before they are run, and multiple commands can be easily pasted into a
batch.
Clustering configuration
Before configuring ModeShape to run in a cluster, make sure the JGroups subsystem is present in the EAP
configuration.
After that, the following parts need configuring:
1) Replicated Infinispan caches for the repository store and the binary store:
/subsystem=infinispan/cache-container=modeshape:add(module="org.modeshape")
/subsystem=infinispan/cache-container=modeshape/transport=TRANSPORT:add(lock-timeout="60000")
/subsystem=infinispan/cache-container=modeshape/replicated-cache=sample:add(mode="SYNC",
batching="true")
/subsystem=infinispan/cache-container=modeshape/replicated-cache=sample/transaction=TRANSACTION:add(mode=NON_X
batching="true")
/subsystem=infinispan/cache-container=modeshape-binary-store/replicated-cache=sample-binary-data/transaction=T
batching="true")
/subsystem=infinispan/cache-container=modeshape-binary-store/replicated-cache=sample-binary-metadata/transacti
2) Main repository
/subsystem=modeshape/repository=sample:add(cache-container="modeshape",cache-name="sample",cluster-name="modes
3) Indexing
/subsystem=modeshape/repository=sample/configuration=index-storage:add()
/subsystem=modeshape/repository=sample/configuration=index-storage/storage-type=local-file-index-storage:add(p
4) Binary Storage
/subsystem=modeshape/repository=sample/configuration=binary-storage:add()
/subsystem=modeshape/repository=sample/configuration=binary-storage/storage-type=cache-binary-storage:add(data
metadata-cache-name="sample-binary-metadata", cache-container="modeshape-binary-store")
3.3.5 Using Repositories with JCR API in EAP

The JCR API is a powerful and easy way to access or manipulate repository content from within a deployed
web application or service. And the ModeShape kit for EAP makes that trivially easy: simply get a
javax.jcr.Repository object that represents one of the repositories running within the ModeShape
subsystem, and start using the API.
Finding the JCR Repository

Use Java EE resource injection
Get a Repository instance from JNDI
Use JCR's RepositoryFactory
Use ModeShape 'Repositories' container
Deploying JCR web applications
Specifying dependencies with MANIFEST.MF
Override dependencies with 'jboss-deployment-structure.xml'
Building the application with Maven
Building the application with other tools
Finding the JCR Repository

There are four ways to find repository instances within EAP, and which you use is up to you:
Using Java EE resource injection
Look up a Repository in JNDI
Look up ModeShape's Repositories instance and use it to find the Repository by name
Use JCR's javax.jcr.RepositoryFactory and the Service Loader
All of these methods rely upon JNDI and the fact that ModeShape registers itself and each of the
Repository instances into JNDI. The ModeShape engine (which implements the
org.modeshape.jcr.api.Repositories interface) is registered at "jcr", while each repository is
registered at "jcr/{repositoryName}". Note that an additional JNDI location can be specified in the
repository configuration; this is optional but useful when you're deploying an application that already looks
up a Repository instance at a specific JNDI name that cannot (easily) be changed.
Consider a repository named "sample". ModeShape automatically registers it into JNDI (in the global
context) at "jcr/sample", although "java:jcr/sample" also works in EAP.
With this knowledge of how ModeShape uses JNDI, let's look at the different ways of getting a hold of
Repository instances.
Use Java EE resource injection

JBoss EAP is a Java EE compliant application server, which means that your code can use Java EE
resource injection to automatically get a reference to the Repository instance. Here's a snippet from a
ManagedBean example that has the "sample" repository injected automatically:
@ManagedBean
public class MyBean {
@Resource(mappedName="java:/jcr/sample")
private javax.jcr.Repository repository;
...
}
When your application is deployed, the server will automatically start the "sample" repository. And when your
application is undeployed, the server will automatically stop the "sample" repository (unless there are other
applications or subsystems using it). This works because the EAP deployer encounters the @Resource
annotation and automatically adds a dependency for the application on the JNDI binding service associated
with the specified JNDI name, which in the case of ModeShape is dependent upon the relevant Repository
instance.
Using resource injection is by far the easiest approach.
Get a Repository instance from JNDI

Another very simple approach is to look up the Repository instance in JNDI:

javax.naming.Context context = new javax.naming.InitialContext();
javax.jcr.Repository repository = (javax.jcr.Repository) context.lookup("jcr/sample");
Directly looking up the repository in JNDI is perhaps the second easiest way to get a ModeShape repository,
and this approach was the idiomatic way to do this in JCR 1.0. But JCR 2.0 added a different approach, based
on RepositoryFactory. You might want to consider using the JCR 2.0 approach if you deploy your application
to multiple containers (including some non-EE containers).
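Because code that relies on a direct JNDI lookup may end up running where no JNDI provider is configured, it can help to treat a failed lookup as "not available" rather than as a fatal error. The following is a minimal sketch of that idea using only the JDK's javax.naming API. The class and method names are hypothetical, and the result is returned as a plain Object so that the sketch compiles without the JCR API on the classpath; in a real application you would cast it to javax.jcr.Repository.

```java
import java.util.Optional;
import javax.naming.InitialContext;
import javax.naming.NamingException;

/** Illustrative only: a JNDI lookup that reports "not found" instead of throwing. */
final class JndiLookupSketch {

    /** Looks up the given JNDI name; returns an empty Optional if JNDI or the binding is unavailable. */
    static Optional<Object> lookup(String jndiName) {
        try {
            return Optional.ofNullable(new InitialContext().lookup(jndiName));
        } catch (NamingException e) {
            // Covers a missing binding as well as an environment with no JNDI provider at all.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Optional<Object> repository = lookup("jcr/sample");
        // Inside EAP with a "sample" repository this is expected to be present;
        // in a plain Java SE process with no JNDI provider it is expected to be empty.
        System.out.println("repository found: " + repository.isPresent());
    }
}
```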
Use JCR's RepositoryFactory

Not all deployment environments have JNDI support. If your components that use JCR are reused in other
applications that are deployed to non-EE environments (environments that don't have JNDI or Java EE
support), you may want to consider the idiomatic way to look up JCR repositories. The JCR 2.0
specification defines a pattern that uses the Java SE Service Loader facility to find
javax.jcr.RepositoryFactory instances and use them to get your Repository instance. This
mechanism also works with ModeShape in EAP:
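String configUrl = "jndi:jcr/sample";

Map<String, String> parameters = java.util.Collections.singletonMap("org.modeshape.jcr.URL",
configUrl);
javax.jcr.Repository repository = null;
for (RepositoryFactory factory : java.util.ServiceLoader.load(RepositoryFactory.class)) {
    // A factory returns null if it does not understand the supplied parameters
    repository = factory.getRepository(parameters);
    if (repository != null) break;
}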
As shown above, ModeShape's RepositoryFactory implementations look for a single
"org.modeshape.jcr.URL" parameter that should be a URL of the form "jndi:jndiName". Since our
"sample" repository is registered into JNDI at "jcr/sample", we just use "jndi:jcr/sample" for the URL.
Use ModeShape 'Repositories' container

If your application is just looking up a Repository instance, then the approaches covered above will
certainly work and are recommended because they're very simple, straightforward, and depend only on the
JCR API. However, sometimes your applications need to do more than just look up Repository instances.
For example, perhaps your application needs to know which repositories exist.
For these situations, ModeShape provides an implementation of the
'org.modeshape.jcr.api.Repositories' interface that defines several useful methods:
public interface Repositories {

    /**
     * Get the names of the available repositories.
     *
     * @return the immutable set of repository names provided by this server; never null
     */
    Set<String> getRepositoryNames();

    /**
     * Return the JCR Repository with the supplied name.
     *
     * @param repositoryName the name of the repository to return; may not be null
     * @return the repository with the given name; never null
     * @throws javax.jcr.RepositoryException if no repository exists with the given name or
     *         there is an error communicating with the repository
     */
    javax.jcr.Repository getRepository( String repositoryName ) throws javax.jcr.RepositoryException;
}
The getRepositoryNames() method returns an immutable set of names of all existing repositories, while
the getRepository(String) method obtains the JCR repository with the specified name.
ModeShape always registers the implementation of this interface in JNDI at the "jcr" (or "java:jcr")
name. Therefore, the following code shows how to directly look up a repository named "sample" using this
interface:

javax.naming.Context context = new javax.naming.InitialContext();
Repositories repositories = (Repositories) context.lookup("jcr");
javax.jcr.Repository repository = repositories.getRepository("sample");
The Repositories object can also be used with the RepositoryFactory-style mechanism. In this case,
the URL should contain "jndi:jcr?repositoryName=repositoryName". For example, the following
code finds the "sample" repository using this technique:

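Map<String, String> params = new HashMap<String, String>();
params.put(org.modeshape.jcr.api.RepositoryFactory.URL, "jndi:jcr?repositoryName=sample");
javax.jcr.Repository repository = null;
for (RepositoryFactory factory : java.util.ServiceLoader.load(RepositoryFactory.class)) {
    if ((repository = factory.getRepository(params)) != null) break;
}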
or, by separating the URL and repository name:

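Map<String, String> params = new HashMap<String, String>();
params.put(org.modeshape.jcr.api.RepositoryFactory.URL, "jndi:jcr");
params.put(org.modeshape.jcr.api.RepositoryFactory.REPOSITORY_NAME, "sample");
javax.jcr.Repository repository = null;
for (RepositoryFactory factory : java.util.ServiceLoader.load(RepositoryFactory.class)) {
    if ((repository = factory.getRepository(params)) != null) break;
}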
And finally, you can even use resource-injection:
@ManagedBean
public class MyBean {
@Resource(mappedName="java:/jcr")
private org.modeshape.jcr.api.Repositories repositories;
...
}
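The RepositoryFactory-style lookups described in this section differ only in the URL they pass: "jndi:jcr/{repositoryName}" addresses a repository directly, while "jndi:jcr?repositoryName={repositoryName}" goes through the Repositories container. A small helper can keep those two forms in one place. This is a sketch, not part of ModeShape: the class and method names are hypothetical, and it assumes that the "org.modeshape.jcr.URL" key shown earlier is the parameter name expected in both cases.

```java
import java.util.Collections;
import java.util.Map;

/** Illustrative only: builds the parameter maps passed to RepositoryFactory.getRepository(Map). */
final class RepositoryFactoryParamsSketch {
    // Parameter name taken from the RepositoryFactory example in this section.
    static final String URL_PARAM = "org.modeshape.jcr.URL";

    /** Parameters that address the repository's own JNDI binding, e.g. "jndi:jcr/sample". */
    static Map<String, String> direct(String repositoryName) {
        return Collections.singletonMap(URL_PARAM, "jndi:jcr/" + repositoryName);
    }

    /** Parameters that go through the Repositories container registered at "jcr". */
    static Map<String, String> viaContainer(String repositoryName) {
        return Collections.singletonMap(URL_PARAM, "jndi:jcr?repositoryName=" + repositoryName);
    }

    public static void main(String[] args) {
        System.out.println(direct("sample"));
        System.out.println(viaContainer("sample"));
    }
}
```

Either map can then be handed to each javax.jcr.RepositoryFactory found by the Service Loader.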
Deploying JCR web applications

The server's modular classloading system means that your application will only see those Java APIs that
your applications use. But the JCR API is not one of the standard JEE APIs, so your application needs to
explicitly say that it needs the JCR API (and optionally the ModeShape API).
We plan to provide and look for CDI annotations to automatically add in these dependencies for
your application. Until then, you have to do it manually.
There are two ways to manually specify the modules that your application uses:
1. Specify dependencies in your 'MANIFEST.MF' file
2. Override dependencies with the 'jboss-deployment-structure.xml' file
Specifying dependencies with MANIFEST.MF

Edit your application's 'META-INF/MANIFEST.MF' file and add the following line:
Dependencies: javax.jcr, org.modeshape.jcr.api export services, org.modeshape export services
This line does several things. First, it gives your application visibility to the standard JCR API and to the
ModeShape public API. Second, it ensures that the ModeShape service is running by the time your
application needs it. And finally, it exports any services (e.g., RepositoryFactory implementations) so
that the ServiceLoader can find them. For more detail, see the EAP classloading documentation (or the AS7
classloading documentation).
If you modify the MANIFEST.MF file, make sure to include a newline character at the end of the file.
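If you want to check (for example, in a build-time test) that a manifest really declares the modules listed above, the JDK's java.util.jar.Manifest class can read the "Dependencies" attribute. The sketch below parses manifest text and returns only the module names, dropping flags such as "export" and "services". The class and method names are hypothetical, and the sample manifest in main ends with a newline, as recommended above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Manifest;

/** Illustrative only: extracts module names from a manifest's "Dependencies" attribute. */
final class ManifestDependenciesSketch {

    static List<String> dependencyModules(String manifestText) {
        Manifest manifest;
        try {
            manifest = new Manifest(new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> modules = new ArrayList<String>();
        String value = manifest.getMainAttributes().getValue("Dependencies");
        if (value == null) {
            return modules; // no Dependencies line at all
        }
        for (String entry : value.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                modules.add(trimmed.split("\\s+")[0]); // first token is the module name
            }
        }
        return modules;
    }

    public static void main(String[] args) {
        String manifestText = "Manifest-Version: 1.0\n"
                + "Dependencies: javax.jcr, org.modeshape.jcr.api export services, org.modeshape export services\n";
        System.out.println(dependencyModules(manifestText));
    }
}
```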
Override dependencies with 'jboss-deployment-structure.xml'

This file is a JBoss-specific deployment descriptor that can be used to control class loading in a fine-grained
manner. Like the MANIFEST.MF, this file can be used to add dependencies. It can also prevent automatic
dependencies from being added, define additional modules, change an EAR deployment's isolated class
loading behaviour, and add additional resource roots to a module.
<jboss-deployment-structure>
...
<deployment>
...
<dependencies>
...

<module name="javax.jcr" />
<module name="org.modeshape.jcr.api" services="import" />
<module name="org.modeshape" services="import" />
...
</dependencies>
...
</deployment>
...
</jboss-deployment-structure>
For additional information, see JBoss Deployment Structure File (or for AS7).
Building the application with Maven

Because ModeShape and EAP are built with Maven, using Maven to build and test your application is
extremely easy and recommended. In your application's POM file, simply include ModeShape as a
"provided" dependency. You can do this for each of the artifacts you need, but it is far easier to simply use
ModeShape's BOM in your "<dependencyManagement>" section. Here's an example of what this looks
like, including how to specify the Java EE 6 APIs:
POM file should specify ModeShape's BOM in the dependencyManagement section

<project ...>

<dependencyManagement>
<dependencies>


<dependency>
<groupId>org.jboss.bom</groupId>
<artifactId>jboss-javaee-6.0-with-tools</artifactId>
<version>1.0.0.M11</version>
<type>pom</type>
<scope>import</scope>
</dependency>

<dependency>
<groupId>org.modeshape.bom</groupId>
<artifactId>modeshape-bom-jbosseap</artifactId>
<version>3.5.0.Final</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>

</project>
Specify the ModeShape version that you want to use in the <version> element of the
modeshape-bom-jbosseap dependency. Because modeshape-bom-jbosseap is a BOM, it includes default
"version" and "scope" values for all of ModeShape's artifacts and (transitive) dependencies.
Before ModeShape 3.2, the name of the BOM artifact was different, and the artifactId element of that
dependency should instead read <artifactId>modeshape-bom-jbossas</artifactId>.
Those BOMs essentially add default values for lots of Java EE 6 and ModeShape artifacts and
dependencies, respectively. But those are just defaults. To make them available in your application, you
must add dependencies for all the artifacts you'll directly use. In the case of ModeShape, this is probably just
the JCR API and ModeShape's public API:
POM file should specify ModeShape JARs as 'provided'
<dependencies>
...

<dependency>
<groupId>javax.jcr</groupId>
<artifactId>jcr</artifactId>
</dependency>

<dependency>
<groupId>org.modeshape</groupId>
<artifactId>modeshape-jcr-api</artifactId>
</dependency>
...
</dependencies>
Note how we don't have to specify any versions or scopes for these artifacts. That's because they're
specified in the BOMs used in the <dependencyManagement> section. (In the case of these two artifacts,
the default scope is "provided", which means that Maven makes them available for compilation but, since
they are "provided" by the EAP+ModeShape runtime environment, will not include them in any produced
artifacts like WAR files.)
Warning
Your deployable web applications and services should not contain any of the JARs from
ModeShape, JCR API, Infinispan, Hibernate Search, Lucene, Joda Time, Tika, or any of the other
libraries that ModeShape uses. Doing so will result in (convoluted) deployment errors regarding
class incompatibilities or class cast exceptions. If your code needs to directly use these libraries,
simply add them as a dependency in your MANIFEST.MF file or
jboss-deployment-structure.xml file.
Building the application with other tools

If you're using a non-Maven tool to build your application, be sure that the resulting deployments (e.g., WAR
and EAR files) do not contain any of the JARs from ModeShape, JCR API, Infinispan, Hibernate Search,
Lucene, Joda Time, or any of the other libraries that ModeShape uses. These are provided by the
ModeShape subsystem in EAP. If your code needs any of these, simply add them as a dependency in your
MANIFEST.MF file or jboss-deployment-structure.xml file.
3.3.6 Using Repositories with REST in EAP

ModeShape's RESTful API was intended to be used by HTTP clients, so it's quite easy to write a simple
client application to read and write repository content using the RESTful API. However, since our trusty web
browser is indeed a simple HTTP client, we can use it to directly interact with the RESTful API. It might not
be pretty, but it works beautifully.
The RESTful API is installed automatically when installing ModeShape into JBoss EAP or JBoss
AS7. Other than this, there is nothing special about the RESTful API running within those servers.
The RESTful API is nothing more than a simple JAX-RS web application that is packaged as a WAR file, and
the kit automatically deploys this WAR file for us. So let's use this to easily check the health and availability
of the repositories. To do so, simply point your browser to http://localhost:8080/modeshape-rest/, and you
should see a JSON response that is similar to the following:
{
"sample": {
"repository": {
"name": "sample",
"resources": {
"workspaces": "/modeshape-rest/sample"
},
"metadata": {
"option.retention.supported": "false",
...
}
}
}
}
The response document lists the named repositories that are available. In this case, there is only one
"sample" repository, and its nested document provides the name, resources and metadata for the repository.
The "resources" nested document contains the usable (relative) link that we can use to get more information
about the repository.
To do this, we need to issue a GET to the resource at http://localhost:8080/modeshape-rest/sample, which
we can do by pointing our browser to this URL. When we do this, the RESTful service will return a JSON
response document describing the "sample" repository:
{
"default": {
"workspace": {
"name": "default",
"resources": {
"query": "/modeshape-rest/sample/default/query",
"items": "/modeshape-rest/sample/default/items"
}
}
}
}
This document describes the repository and lists the named workspaces. In this case, there is a single
"default" workspace, and two resources available for use:
http://localhost:8080/modeshape-rest/sample/default/items exposes the repository's nodes via
RESTful methods
http://localhost:8080/modeshape-rest/sample/default/query allows RESTful clients to POST queries
and receive responses containing the results
Although we can't issue a POST with our web browser (without some HTML/JavaScript content on the page), we
can continue to navigate the content of the "default" workspace in the "sample" repository. For example, if we
point our browser to http://localhost:8080/modeshape-rest/sample/default/items, we'll get a response that
describes the root node of that workspace:
{
"properties": {
"jcr:primaryType": "mode:root",
"jcr:uuid": "81513257505d64/"
},
"children": ["jcr:system"]
}
Here we see that the root node has two properties, "jcr:primaryType" and "jcr:uuid" (since the node
is also 'mix:referenceable'), and one child node, "jcr:system". We can append the child name to our
URL (e.g., http://localhost:8080/modeshape-rest/sample/default/items/jcr:system) to get the information
about the "/jcr:system" node:
{
"properties": {
"jcr:primaryType": "mode:system"
},
"children": ["jcr:nodeTypes", "jcr:versionStorage", "mode:namespaces", "mode:locks"]
}
Here we see the "/jcr:system" node has only one property but has four children. Let's look at the
"mode:namespaces" child node by pointing our browser to
http://localhost:8080/modeshape-rest/sample/default/items/jcr:system/mode:namespaces to get its JSON
representation:
{
"properties": {
"jcr:primaryType": "mode:namespaces"
},
"children": ["jcr", "nt", "mix", "sv", "mode", "xml", "xmlns", "xs", "xsi"]
}
Again, we see only one property, while there are 9 children: 1 for each registered namespace, where the
node name is the namespace prefix. We can get the JSON representation of the "jcr" namespace by
pointing our browser to
http://localhost:8080/modeshape-rest/sample/default/items/jcr:system/mode:namespaces/jcr :
{
"properties": {
"jcr:primaryType": "mode:namespace",
"mode:generated": "false",
"mode:uri": "http:\/\/www.jcp.org\/jcr\/1.0"
}
}
And here we see that the "/jcr:system/mode:namespaces/jcr" node has three properties and no
children.
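The browser navigation above can also be scripted. The following sketch builds the "items" URL for a node path in the same way the walkthrough appends child names, and issues a plain GET using only the JDK. The class and method names are hypothetical, the "Accept" header is an assumption, and authentication is not handled, so a server that requires credentials would reject the request as written.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Illustrative only: builds "items" URLs and fetches their JSON with a plain HTTP GET. */
final class RestNavigationSketch {

    /** Builds e.g. http://localhost:8080/modeshape-rest/sample/default/items/jcr:system */
    static String itemUrl(String baseUrl, String repository, String workspace, String nodePath) {
        String path = nodePath.startsWith("/") ? nodePath : "/" + nodePath;
        if (path.equals("/")) {
            path = ""; // the root node is addressed by the bare "items" resource
        }
        return baseUrl + "/" + repository + "/" + workspace + "/items" + path;
    }

    /** Issues a GET and returns the response body as text; requires a running server. */
    static String get(String url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod("GET");
        connection.setRequestProperty("Accept", "application/json"); // assumed header
        try (InputStream in = connection.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String url = itemUrl("http://localhost:8080/modeshape-rest", "sample", "default",
                "/jcr:system/mode:namespaces");
        System.out.println(url);
        if (args.length > 0 && args[0].equals("--fetch")) {
            System.out.println(get(url)); // only attempted when explicitly requested
        }
    }
}
```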
3.3.7 Using Repositories with WebDAV in EAP

The ModeShape kit for EAP includes a WebDAV interface that clients can use to access, create, update,
and delete nt:file and nt:folder nodes in the repositories, treating these nodes as if they were
simply files and folders on a network file system. Many applications and operating systems are WebDAV
clients and can be used with ModeShape. For example, you can mount a repository (or parts of it) as a
network drive on most operating systems, and then upload or download files and folders using standard OS
operations and graphical tools. All ModeShape repositories can be accessed, and authentication is done
using the ModeShape 'connect' role (although this can be customized).
The WebDAV service is packaged as a WAR file and is automatically deployed when the ModeShape kit is
installed. Simply undeploy it if it is not needed.
The rest of this page describes connecting to the WebDAV service and advanced configuration.
Connecting to the repository with WebDAV
Configuring the ModeShape WebDAV Server
Connecting to the repository with WebDAV

The WebDAV service is available on your EAP instance at the URL:
http://localhost:8080/modeshape-webdav/repositoryName/workspaceName/pathInWorkspace
where
repositoryName is the (required) name of the repository you want to connect to
workspaceName is the (required) name of the workspace to be accessed
pathInWorkspace is the JCR path to the top-level nt:folder (or nt:file) node to be accessed
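When the WebDAV URL is assembled in code rather than typed by hand, file names containing spaces or other special characters need to be percent-encoded. The sketch below composes the URL from the three parts described above and lets java.net.URI do the encoding. The class and method names are hypothetical, and the "/modeshape-webdav" context path is taken from the URL shown above.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Illustrative only: composes a WebDAV URL from repository, workspace and node path. */
final class WebDavUrlSketch {

    static URI webDavUri(String host, int port, String repositoryName, String workspaceName,
                         String pathInWorkspace) {
        String path = "/modeshape-webdav/" + repositoryName + "/" + workspaceName + pathInWorkspace;
        try {
            // The multi-argument URI constructor percent-encodes characters that are illegal in a path.
            return new URI("http", null, host, port, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(webDavUri("localhost", 8080, "artifacts", "default", "/files/My Report.txt"));
    }
}
```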
For example, the following image shows how to use the Finder on OS X to connect to the "default"
workspace in the "artifacts" sample repository included by default in the ModeShape installation.
Connecting to a repository over WebDAV

Once connected, your operating system or WebDAV client can simply access the content of the repository
as if it were just a regular file system.
Using the file system to navigate repository content
Configuring the ModeShape WebDAV Server

The ModeShape WebDAV server is deployed as a WAR and configured mostly through its web configuration
files located within the deployment (at standalone/deployments/modeshape-webdav.war).
The WEB-INF/web.xml defines several parameters:
Parameter Name
Description
org.modeshape.web.jcr.REPOSITORY_PROVIDER
The fully-qualified name of the class tha
org.modeshape.web.jcr.spi.Rep
interface. Unless you are using the Mod
server to connect to a different JCR imp

should never change.
org.modeshape.jcr.URL
This parameter, specific to the
FactoryRepositoryProvider imple
specifies the JNDI URL of the ModeSha

implementation.
org.modeshape.web.jcr.webdav.CONTENT_MAPPER_CLASS_NAME
The fully-qualified name of the class tha
org.modeshape.web.jcr.webdav.
interface, which is responsible for mapp
to WebDAV responses. The DefaultC
implementation maps nodes with type n
nt:file to WebDAV folders and files,
other parameters). You can provide you
implementation to map WebDAV conte

content or structures.
org.modeshape.web.jcr.webdav.NEW_FOLDER_PRIMARY_TYPE_NAME
Each folder created through the WebDA
created as a node with this primary nod
org.modeshape.web.jcr.webdav.NEW_RESOURCE_PRIMARY_TYPE_NAME
Each resource (e.g., file) created throug
servlet will be created as a node with th

type.
org.modeshape.web.jcr.webdav.NEW_CONTENT_PRIMARY_TYPE_NAME
Content created through the WebDAV s
created as a node with the primary nod

org.modeshape.web.jcr.webdav.RESOURCE_PRIMARY_TYPE_NAMES
Nodes with any of the primary node typ
comma-delimited list will be exposed to

file nodes.
org.modeshape.web.jcr.webdav.CONTENT_PRIMARY_TYPE_NAMES
Nodes with any of the primary node typ
comma-delimited list will be exposed to
content nodes (that is, nodes that have

files).
There are also sections that dictate how authentication is to be performed:
<!--
  The ModeShape WebDAV implementation leverages the HTTP credentials for authentication
  and authorization within the JCR repository. Unless the repository provides for anonymous
  access, it makes no sense to try to log into the JCR repository without credentials, so
  this constraint helps lock down the repository.
  This should generally not be modified.
-->
<security-constraint>
  <display-name>ModeShape WebDAV</display-name>
  <web-resource-collection>
    <web-resource-name>WebDAV</web-resource-name>
    <url-pattern>/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <!--
      A user must be assigned this role to connect to any JCR repository, in addition to
      needing the READONLY or READWRITE roles to actually read or modify the data.
    -->
    <role-name>connect</role-name>
  </auth-constraint>
</security-constraint>
<!--
  Any auth-method will work for ModeShape. BASIC is used in this example for simplicity.
-->
<login-config>
  <auth-method>BASIC</auth-method>
</login-config>
<!--
  This must match the role-name in the auth-constraint above.
-->
<security-role>
  <role-name>connect</role-name>
</security-role>
3.3.8 Using Repositories with JDBC in EAP

ModeShape provides a JDBC-compliant API which allows clients to connect & query a repository via JDBC.
The ModeShape kit for EAP comes prepackaged with a module named org.modeshape.jdbc which
contains a java.sql.Driver implementation
that allows JDBC clients to connect to existing repositories.
Configuration
The easiest way to access a ModeShape repository via JDBC is to configure a ModeShape datasource and
a driver inside of EAP.
The following example shows a configuration snippet from an EAP standalone.xml file, which exposes the
workspace "extra" from a repository named "artifacts" via JDBC:
<datasource jndi-name="java:/datasources/ModeShapeDS" enabled="true" use-java-context="true"
            pool-name="ModeShapeDS">
    <connection-url>jdbc:jcr:jndi:jcr?repositoryName=artifacts</connection-url>
    <driver>modeshape</driver>
    <connection-property name="workspace">extra</connection-property>
    <security>
        <user-name>admin</user-name>
        <password>admin</password>
    </security>
</datasource>
<drivers>
    <driver name="modeshape" module="org.modeshape.jdbc">
        <driver-class>org.modeshape.jdbc.LocalJcrDriver</driver-class>
    </driver>
</drivers>
JDBC Driver
As you can see from the above snippet, configuring the ModeShape JDBC driver requires the following
attributes:
name
a symbolic name for the JDBC driver, which will be used by the datasource
module
the EAP module name which contains ModeShape's JDBC driver implementation
driver-class
the fully qualified class name of the java.sql.Driver implementation
DataSource
For each repository you want to access, you will need to configure a DataSource in the EAP configuration
file. In the example above, the following attributes are defined:
jndi-name
The name under which the datasource should be registered in JNDI by EAP.
Please note that, at the moment, EAP only allows datasources to be registered
under a name beginning with either java:/ or java:jboss/
connection-url
A JNDI URL, which points ModeShape to an existing repository. The format of
this URL is: jdbc:jcr:jndi:jcr?repositoryName=
driver
The name of the JDBC driver, as described in the section above
security
The username and password which will be passed to the connection when
attempting to access a repository. Inside of EAP, those are normally taken
from the modeshape-security domain.
connection-property
Any additional properties which can be passed to the connection. For example,
to access a specific workspace of a repository, the workspace property can
be defined.
Usage
Once a datasource has been configured and the application server has started up, the datasource can be
accessed from JNDI and queries executed against the configured repository.
For example:
@Resource( mappedName = "datasources/ModeShapeDS" )
private DataSource modeshapeDS;
....
// Obtain a connection from the injected datasource and run a JCR-SQL2 query
Connection connection = modeshapeDS.getConnection();
Statement statement = connection.createStatement();
ResultSet resultSet = statement.executeQuery(
    "SELECT [jcr:primaryType], [jcr:mixinTypes], [jcr:path], [jcr:name] "
    + "FROM [nt:unstructured] ORDER BY [jcr:path]");
Queries
The query language used should be JCR-SQL2.
However, since JCR Nodes cannot be exposed directly via JDBC, the only way to return the path-related
and score information is through additional columns in the result. While such columns could "magically"
appear in the result set, doing so is not compatible with JDBC applications that dynamically build queries
based upon database metadata. Such applications require the columns to be properly described in database
metadata, and the columns need to be used within queries.
ModeShape attempts to solve these issues by directly supporting a number of "pseudo-columns" within
JCR-SQL2 queries, wherever columns can be used. These "pseudo-columns" include:
jcr:score is a column of type DOUBLE that represents the full-text search score of the node, which
is a measure of the node's relevance to the full-text search expression. ModeShape computes
scores for all queries, though the score for rows in queries that do not include a full-text search
criterion may not be reliable.
jcr:path is a column of type PATH that represents the normalized path of a node, including
same-name siblings. This is the same as what would be returned by the getPath() method of Node.
Examples of paths include "/jcr:system" and "/foo/bar[3]".
jcr:name is a column of type NAME that represents the node name in its namespace-qualified form
using namespace prefixes and excluding same-name-sibling indexes. Examples of node names
include "jcr:system", "jcr:content", "ex:UserData", and "bar".
mode:localName is a column of type STRING that represents the local name of the node, which
excludes the namespace prefix and same-name-sibling index. As an example, the local name of the
"jcr:system" node is "system", while the local name of the "ex:UserData[3]" node is "UserData".
mode:depth is a column of type LONG that represents the depth of a node, which corresponds
exactly to the number of path segments within the path. For example, the depth of the root node is 0,
whereas the depth of the "/jcr:system/jcr:nodeTypes" node is 2.
All of these are exposed in the database metadata, allowing potential clients to detect & use them.
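The relationship between jcr:path, jcr:name, mode:localName and mode:depth can be pictured with a few string operations. The following is only a minimal sketch of the definitions above, not ModeShape's implementation: the class and method names are hypothetical, and it assumes paths written with namespace prefixes (such as "/jcr:system") rather than expanded namespace URIs.

```java
/** Hypothetical helpers that mirror the pseudo-column definitions; not part of ModeShape. */
final class PseudoColumnSketch {

    /** jcr:name: the last path segment without its same-name-sibling index. */
    static String name(String path) {
        if (path.equals("/")) {
            return ""; // the root node has an empty name
        }
        String segment = path.substring(path.lastIndexOf('/') + 1);
        int bracket = segment.indexOf('[');
        return bracket < 0 ? segment : segment.substring(0, bracket);
    }

    /** mode:localName: the name without its namespace prefix. */
    static String localName(String path) {
        String name = name(path);
        int colon = name.indexOf(':');
        return colon < 0 ? name : name.substring(colon + 1);
    }

    /** mode:depth: the number of path segments; the root node has depth 0. */
    static long depth(String path) {
        long count = 0;
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String path = "/jcr:system/jcr:nodeTypes";
        System.out.println(name(path) + " " + localName(path) + " " + depth(path));
    }
}
```

In a real query these values come from the result set columns themselves; the sketch is only meant to make the definitions concrete.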
3.3.9 Administering Repositories in JBoss EAP

Once the ModeShape standalone server for EAP 6.1 is running, you can use the application server's
command line management tool (CLI) to view the ModeShape subsystem. The CLI allows you to view the
attributes and metrics of ModeShape repositories, as well as the installed web applications.
More information about the EAP Command Line Interface (CLI) is available here.
Getting Started
Navigation
Managed Resource Commands
Example Output
RHQ Plugin
Getting Started
To start the CLI, run this command:
<eap-install-dir>/bin/jboss-cli.sh
When the CLI first starts, you will see something like this:
You are disconnected at the moment. Type 'connect' to connect to the server or 'help' for the
list of supported commands.
[disconnected /]
Enter "connect" to establish a connection to your EAP managed instance:
[disconnected /] connect
Press the tab key at the command line prompt to trigger tab-completion!
At any time, enter "ls" to view child managed node names and current values for properties. Also, enter
"pwd" to see your current location in the management model.
Navigation
The CLI allows you to navigate the management model just as if you were navigating a file system. Here are
some helpful commands that can be run at the root of the management model. Each of these commands can
also be modified to use relative paths.
To navigate to the ModeShape subsystem:
cd /subsystem=modeshape
To navigate to the ModeShape repositories:
cd /subsystem=modeshape/repository
To navigate to a specific repository:
cd /subsystem=modeshape/repository=<repository-name>
To navigate to the ModeShape web applications:
cd /subsystem=modeshape/webapp
To navigate to a specific ModeShape web application:
cd /subsystem=modeshape/webapp=<web-app-war>
To navigate to a specific ModeShape repository's CND sequencer:
cd /subsystem=modeshape/repository=<repository-name>/sequencer=cnd-sequencer
Managed Resource Commands

To view attribute values, metric values, and child managed nodes of a resource, such as a repository, just
enter "ls" for a listing. To see the same information in a more readable form, enter the
following:
:read-resource
A metric is a special runtime attribute.
Other helpful commands:

To obtain descriptions, possible attribute types, and possible child types:
:read-resource-description
To obtain current metric values:
:read-resource(include-runtime=true)
To obtain current attribute values recursively for child nodes so that you do not have to navigate to
each child node:
:read-resource(recursive-depth=10)
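The navigation paths and operations above can also be combined into a single line of the form address:operation, which avoids having to "cd" to the resource first. The following sketch simply composes such a line as a string; the class and method names are hypothetical, and the repository name "artifacts" is the sample repository used elsewhere in this guide.

```java
/** Hypothetical helper that composes CLI command lines; it does not talk to the server. */
final class CliCommandSketch {

    /** Builds the management address of a ModeShape repository resource. */
    static String repositoryAddress(String repositoryName) {
        return "/subsystem=modeshape/repository=" + repositoryName;
    }

    /** Appends a read-resource operation, optionally asking for runtime metrics. */
    static String readResource(String address, boolean includeRuntime) {
        String operation = includeRuntime
                ? ":read-resource(include-runtime=true)"
                : ":read-resource";
        return address + operation;
    }

    public static void main(String[] args) {
        // Prints a line that can be pasted at the CLI prompt.
        System.out.println(readResource(repositoryAddress("artifacts"), true));
    }
}
```

A line built this way could be pasted at the CLI prompt from any location in the management model, since the address is absolute.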
Example Output
[standalone@localhost:9999 repository=artifacts] :read-resource(include-runtime=true)
{
"result" => {
"allow-workspace-creation" => false,
"anonymous-roles" => undefined,
"anonymous-username" => "<anonymous>",
"authenticator" => undefined,
"cache-container" => undefined,
"cache-name" => undefined,
"cluster-name" => undefined,
"cluster-stack" => undefined,
"configuration" => undefined,
"default-initial-content" => undefined,
"default-workspace" => "default",
"document-optimization-child-count-target" => undefined,
"document-optimization-child-count-tolerance" => undefined,
"document-optimization-initial-time" => "00:00",
"document-optimization-interval" => 24,
"document-optimization-thread-pool" => "modeshape-opt",
"enable-monitoring" => true,
"enable-queries" => true,
"event-count-previous-24-hours" => undefined,
"event-count-previous-52-weeks" => undefined,
"event-count-previous-60-minutes" => [
0L,
0L,
0L
],
"event-count-previous-60-seconds" => [
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L
],
"event-count-previous-7-days" => undefined,
"event-queue-size-previous-24-hours" => undefined,
"event-queue-size-previous-52-weeks" => undefined,
"event-queue-size-previous-60-minutes" => [
0L,
0L,
0L
],
"event-queue-size-previous-60-seconds" => [
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L
],
"event-queue-size-previous-7-days" => undefined,
"garbage-collection-initial-time" => "00:00",
"garbage-collection-interval" => 24,
"garbage-collection-thread-pool" => "modeshape-gc",
"indexing-analyzer-classname" => "org.apache.lucene.analysis.standard.StandardAnalyzer",
"indexing-analyzer-module" => undefined,
"indexing-async-max-queue-size" => 1,
"indexing-async-thread-pool-size" => 1,
"indexing-batch-size" => -1,
"indexing-mode" => "SYNC",
"indexing-reader-strategy" => "SHARED",
"indexing-thread-pool" => "modeshape-indexing-workers",
"jndi-name" => undefined,
"listener-count-previous-24-hours" => undefined,
"listener-count-previous-52-weeks" => undefined,
"listener-count-previous-60-minutes" => [
0L,
0L,
0L
],
"listener-count-previous-60-seconds" => [
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L
],
"listener-count-previous-7-days" => undefined,
"minimum-binary-size" => undefined,
"minimum-string-size" => undefined,
"node-changes-previous-24-hours" => undefined,
"node-changes-previous-52-weeks" => undefined,
"node-changes-previous-60-minutes" => [
1699L,
213L,
205L
],
"node-changes-previous-60-seconds" => [
0L,
0L,
168L,
0L,
0L,
0L,
205L,
22L,
0L,
0L,
105L,
0L
],
"node-changes-previous-7-days" => undefined,
"node-types" => undefined,
"open-scoped-lock-count-previous-24-hours" => undefined,
"open-scoped-lock-count-previous-52-weeks" => undefined,
"open-scoped-lock-count-previous-60-minutes" => [
0L,
0L,
0L
],
"open-scoped-lock-count-previous-60-seconds" => [
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L
],
"open-scoped-lock-count-previous-7-days" => undefined,
"predefined-workspace-names" => [
"default",
"other",
"extra"
],
"query-count-previous-24-hours" => undefined,
"query-count-previous-52-weeks" => undefined,
"query-count-previous-60-minutes" => [
0L,
0L,
0L
],
"query-count-previous-60-seconds" => [
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L
],
"query-count-previous-7-days" => undefined,
"query-execution-time-previous-24-hours" => undefined,
"query-execution-time-previous-52-weeks" => undefined,
"query-execution-time-previous-60-minutes" => [
0L,
125L,
9L
],
"query-execution-time-previous-60-seconds" => [
0L,
9L,
0L,
0L,
0L,
0L,
3L,
0L,
0L,
4L,
0L,
0L
],
"query-execution-time-previous-7-days" => undefined,
"rebuild-upon-startup" => "NEVER",
"rebuild-upon-startup-include-system-content" => false,
"rebuild-upon-startup-mode" => "ASYNC",
"security-domain" => "modeshape-security",
"sequenced-count-previous-24-hours" => undefined,
"sequenced-count-previous-52-weeks" => undefined,
"sequenced-count-previous-60-minutes" => [
0L,
0L,
0L
],
"sequenced-count-previous-60-seconds" => [
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L
],
"sequenced-count-previous-7-days" => undefined,
"sequenced-queue-size-previous-24-hours" => undefined,
"sequenced-queue-size-previous-52-weeks" => undefined,
"sequenced-queue-size-previous-60-minutes" => [
0L,
0L,
0L
],
"sequenced-queue-size-previous-60-seconds" => [
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L
],
"sequenced-queue-size-previous-7-days" => undefined,
"sequencer-execution-time-previous-24-hours" => undefined,
"sequencer-execution-time-previous-52-weeks" => undefined,
"sequencer-execution-time-previous-60-minutes" => [
0L,
0L,
0L
],
"sequencer-execution-time-previous-60-seconds" => [
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L
],
"sequencer-execution-time-previous-7-days" => undefined,
"session-count-previous-24-hours" => undefined,
"session-count-previous-52-weeks" => undefined,
"session-count-previous-60-minutes" => [
12L,
264L,
1148L
],
"session-count-previous-60-seconds" => [
267L,
268L,
528L,
528L,
530L,
531L,
746L,
876L,
878L,
880L,
1148L,
1182L
],
"session-count-previous-7-days" => undefined,
"session-lifetime-previous-24-hours" => undefined,
"session-lifetime-previous-52-weeks" => undefined,
"session-lifetime-previous-60-minutes" => [
4221L,
0L,
0L
],
"session-lifetime-previous-60-seconds" => [
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L
],
"session-lifetime-previous-7-days" => undefined,
"session-saves-previous-24-hours" => undefined,
"session-saves-previous-52-weeks" => undefined,
"session-saves-previous-60-minutes" => [
0L,
23L,
42L
],
"session-saves-previous-60-seconds" => [
0L,
0L,
42L,
0L,
0L,
0L,
19L,
2L,
0L,
0L,
21L,
0L
],
"session-saves-previous-7-days" => undefined,
"session-scoped-lock-count-previous-24-hours" => undefined,
"session-scoped-lock-count-previous-52-weeks" => undefined,
"session-scoped-lock-count-previous-60-minutes" => [
0L,
0L,
0L
],
"session-scoped-lock-count-previous-60-seconds" => [
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L,
0L
],
"session-scoped-lock-count-previous-7-days" => undefined,
"source" => undefined,
"system-content-indexing-mode" => "DISABLED",
"use-anonymous-upon-failed-authentication" => false,
"workspace-count-previous-24-hours" => undefined,
"workspace-count-previous-52-weeks" => undefined,
"workspace-count-previous-60-minutes" => [
1L,
1L,
1L
],
"workspace-count-previous-60-seconds" => [
1L,
1L,
1L,
1L,
1L,
1L,
1L,
1L,
1L,
1L,
1L,
1L
],
"workspace-count-previous-7-days" => undefined,
"workspaces-cache-container" => undefined,
"workspaces-initial-content" => [("default" => "initial-content-default.xml")],
"sequencer" => {
"delimited-text-sequencer" => undefined,
"fixed-width-text-sequencer" => undefined,
"ddl-sequencer" => undefined,
"java-source-sequencer" => undefined,
"java-class-sequencer" => undefined,
"cnd-sequencer" => undefined,
"teiid-model-sequencer" => undefined,
"teiid-vdb-sequencer" => undefined,
"xsd-sequencer" => undefined,
"wsdl-sequencer" => undefined,
"xml-sequencer" => undefined,
"zip-sequencer" => undefined,
"image-sequencer" => undefined,
"mp3-sequencer" => undefined
},
"text-extractor" => {"tika-extractor" => undefined}
}
}
[standalone@localhost:9999 repository=artifacts]
RHQ Plugin
Another way to view ModeShape managed AS 7 components is to use RHQ. The image below shows the
RHQ console.
3.4 ModeShape in web applications

This section is incomplete, and will eventually show the specifics (and hopefully examples) of
configuring and using ModeShape in non-AS7 servlet containers and application servers. If you
use a particular pattern or wish to contribute an example, please let us know!
For the most part, the best way to use ModeShape within a web application deployed to Tomcat, Glassfish
or other containers or application servers is to simply embed it into your web application. At that point, it
should be very similar to ModeShape in Java applications.
If you have several web apps that share the same ModeShape repositories, embedding ModeShape into
each and using the same configuration files should work (as long as the apps share the same classloader).
Alternatively, you could create a single web app to manage the ModeShape repositories (including registering
them in JNDI, either via configuration or via programmatic use of JNDI) and have the other web applications
simply look up the repository as described in Using Repositories with JCR API in EAP.
3.4.1 ModeShape's JCA Adapter

Introduction
Java Connector Architecture (JCA), or J2EE JCA, is a generic architecture based on Java technology that is
used for connecting and integrating legacy systems, including databases, EIS (Enterprise Information
Systems) and JCA-compliant application servers, as part of EAI (Enterprise Application Integration)
solutions.
As an analogy, just as Java Database Connectivity (JDBC) is a database connectivity standard that is
specifically used to connect Java EE applications to databases (generally relational ones), JCA is a
generic architecture that provides a standardized, system-independent way for J2EE applications to
communicate with different backend systems, including any type of database or application to be integrated
into the software project.
ModeShape's JCA adapter implements the Java Connector Architecture and mediates communication between
the Java EE server and a ModeShape repository. It provides seamless access to the repository through the
same JCR API.
Connection factory interface

javax.jcr.Repository
Connection interface
javax.jcr.Session
Resource adapter configuration properties

ModeShape's JCA resource adapter defines the following configuration property:
Property name: repositoryURL
Property type: java.lang.String
Description: Defines the path to the configuration of the ModeShape repository. Can be
a URL or an absolute path
Installing Resource adapter

To use this connector, it must be configured and deployed on the application server. The connector is
packaged as a RAR file (resource adapter archive) and contains an ra.xml file which describes its deployment
characteristics. However, the actual name of the resource is specified when it is deployed by the application
server. The deployment process is specific to each application server, so please refer to the documentation
for your application server for more details about installing JCA connectors. Basically, it is
enough to deploy the RAR archive alongside a datasource descriptor, although an additional class
loading tuning step might be required. Below is an example of how to deploy the JCA adapter in JBoss 5.
Datasource descriptor:
<connection-factories>
<tx-connection-factory>
<jndi-name>/jcr/datastore</jndi-name>
<xa-transaction/>
<rar-name>modeshape-jca-3.3-SNAPSHOT.rar</rar-name>
<connection-definition>javax.jcr.Repository</connection-definition>
<config-property name="repositoryURL"
type="java.lang.String">file:/d:/Temp/modeshapecfg.json</config-property>
<max-pool-size>9</max-pool-size>
</tx-connection-factory>
</connection-factories>
And the class loading tuning:
<classloading xmlns="urn:jboss:classloading:1.0"
name="modeshape-datastore.ear"
parent-domain="NoHibernateNoJBLoggingDomain"
domain="modeshape-datastore.ear"
import-all="true"
export-all="NON_EMPTY"
parent-first="false"
excluded="javax.transaction,javax.transaction.xa">
</classloading>
Access example
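To access it within the application, simply fetch the factory by its JNDI name.
// Look up the repository using its JNDI name (the JCA connection factory);
// 'location' is the JNDI name under which the connection factory was bound
InitialContext ic = new InitialContext();
Repository repository = (Repository) ic.lookup(location);
// Get the session (the JCA connection) simply by logging in to the repository
Session session = repository.login();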
3.5 ModeShape's REST Service

ModeShape's RESTful API was intended to be used by HTTP clients, so it's quite easy to write a simple
client application to read and write repository content using the RESTful API. However, since our trusty web
browser is indeed a simple HTTP client, we can use it to directly interact with the RESTful API. It might not
be pretty, but it works beautifully.
The RESTful API is nothing more than a simple JAX-RS web application that is packaged as a WAR file and
that comes in 2 flavors:
modeshape-web-jcr-rest-war - is a war file artifact available via Maven that can be deployed into any
servlet container. To access it, include the following dependency in your project's POM:
<dependency>
<groupId>org.modeshape</groupId>
<artifactId>modeshape-web-jcr-rest-war</artifactId>
<version>${modeshape.version}</version>
<type>war</type>
</dependency>
ModeShape's AS7 kit - once installed into AS7, it provides the RESTful API out-of-the-box, via a web
application
Both web applications use Basic HTTP Authentication and require a role named connect to be present in the
authenticated user's set of roles.
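Because both web applications rely on Basic HTTP Authentication, a client has to send an Authorization header with each request. The following is a minimal sketch of how such a header value is built; the class and method names are hypothetical, and the "admin"/"admin" credentials are only the sample values from the datasource example earlier in this guide. Whether that user actually holds the connect role depends on your security configuration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper that builds the value of an HTTP Basic Authorization header. */
final class BasicAuthSketch {

    /** Encodes "user:password" in Base64 and prefixes it with the "Basic " scheme name. */
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // The printed value would be sent as the "Authorization" request header.
        System.out.println(basicAuthHeader("admin", "admin"));
    }
}
```

Basic authentication only encodes the credentials and does not encrypt them, so it is normally combined with HTTPS outside of local testing.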
ModeShape 3 provides two different versions of the RESTful API:
1. REST Service 2.x - the version which was included in ModeShape 2 and which has been deprecated.
However, for backwards compatibility it is still accessible using the v1 URL prefix:
http://<host>:<port>/<context>/v1/
2. REST Service 3.x - a newer version which is an extension of the old one, plus a number of additional
improvements.
3.5.1 REST Service 2.x

Represents the version of the RESTful API distributed with ModeShape 2. It has been deprecated in
ModeShape 3, but is still available using the v1 URL prefix. It provides the following methods:
1. Retrieve a list of available repositories

2. Retrieve a list of workspaces for a repository
3. Retrieve a node or a property
4. Create a node
5. Update a node or a property
6. Delete a node or a property
7. Execute a JCR query

1. Retrieve a list of available repositories
URL: http://<host>:<port>/<context>/v1/
HTTP Method: GET
Produces: application/json; text/html; text/plain;
Default Output: text/plain
Response Code (if successful): OK
Response Format:
{
"repo":
{
"repository":
{
"name": "repo",
"resources":
{
"workspaces": "/resources/v1/repo"
},
"metadata":
{
"option.retention.supported": "false",
"query.xpath.doc.order": "false",
...
}
}
}
}
2. Retrieve a list of workspaces for a repository
URL: http://<host>:<port>/<context>/v1/<repository_name>
HTTP Method: GET
Response Format:
{
"default":
{
"workspace":
{
"name": "default",
"resources":
{
"query": "/resources/v1/repo/default/query",
"items": "/resources/v1/repo/default/items"
}
}
}
}
3. Retrieve a node or a property
Retrieves an item at a given path.
URL: http://<host>:<port>/<context>/v1/<repository_name>/<workspace_name>/items/<item_path>
HTTP Method: GET
Optional Query Parameters:
depth - a numeric value indicating how many levels of children should be retrieved under the node
located at path. A negative value indicates all children
mode:depth - same as the above
Response Format:
{
"properties":
{
"jcr:primaryType": "mode:system"
},
"children":
[
"jcr:nodeTypes",
"jcr:versionStorage",
"mode:namespaces",
"mode:locks"
]
}
4. Create a node
Creates a node at the given path, using the body of request as JSON content
URL: http://<host>:<port>/<context>/v1/<repository_name>/<workspace_name>/items/<node_path>
HTTP Method: POST
Request Content-Type: accepts any, but for this to work it has to be a valid JSON object
Response Code (if successful): CREATED
Optional Query Parameters:
mode:includeNode - indicates whether the entire node should be returned in the response or just the path to
the new node.
Request Format:
{ "properties":{
"jcr:primaryType":"nt:unstructured",
"testProperty":"testValue",
"multiValuedProperty":["value1", "value2"]
},
"children":{
"childNode":{
"properties":{
"nestedProperty":"nestedValue"
}
}
}
}
Response Format:
{"properties":{
"multiValuedProperty":["value1", "value2"],
"testProperty":"testValue"
}, "children":{
"childNode":{
"properties":{
}
}
}}
5. Update a node or a property
Updates a node or a property at the given path, using the body of request as JSON content
HTTP Method: PUT
Request Content-Type: accepts any, but for this to work it has to be a valid JSON object
Request Format:
Node: same as the one used when creating
Property:
{"testProperty":"some_new_value"}
Response Format:
Node: same as one used when creating
Property:

6. Delete a node or a property
Deletes the node or the property at the given path.
HTTP Method: DELETE
Produces: none

7. Execute a JCR query
Executes a JCR query in XPath, SQL, or SQL2 format and returns a JSON object in response.
URL: http://<host>:<port>/<context>/v1/<repository_name>/<workspace_name>/query
HTTP Method: POST
Request Content-Type: application/jcr+sql; application/jcr+xpath; application/jcr+sql2; application/jcr+search
Optional Query Parameters:
offset - the index in the result set where to start the retrieval of data
limit - the maximum number of rows to return
Response Format:
{
"types":
{
"nt:base.jcr:primaryType": "STRING",
"nt:base.jcr:mixinTypes": "STRING",
"nt:base.jcr:path": "STRING",
"nt:base.jcr:name": "STRING",
"nt:base.jcr:score": "DOUBLE",
"nt:base.mode:localName": "STRING",
"nt:base.mode:depth": "LONG"
},
"rows":
[
{
"nt:base.jcr:primaryType": "mode:root",
"nt:base.jcr:path": "/",
"nt:base.jcr:name": "",
"nt:base.jcr:score": "0.3535533845424652",
"nt:base.mode:localName": "",
"nt:base.mode:depth": "0"
},
{
"nt:base.jcr:primaryType": "mode:locks",
"nt:base.jcr:path": "/jcr:system/mode:locks",
"nt:base.jcr:name": "mode:locks",
"nt:base.jcr:score": "0.3535533845424652",
"nt:base.mode:localName": "locks",
"nt:base.mode:depth": "2"
}
]
}
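As an illustration of the query method described above, the following sketch builds (but does not send) a POST request carrying a JCR-SQL2 query. It uses the java.net.http API from Java 11 or later. The class and method names are hypothetical, the base URL "http://localhost:8090/resources" is taken from the sample responses in this section, and passing offset and limit as URL query parameters is an assumption about how those parameters are supplied.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that prepares a query request for the v1 REST API. */
final class RestQuerySketch {

    /** Builds a POST request whose body is the JCR-SQL2 query text. */
    static HttpRequest sql2Query(String baseUrl, String repository, String workspace,
                                 String query, int offset, int limit) {
        // Assumption: offset and limit are passed as URL query parameters.
        URI uri = URI.create(baseUrl + "/v1/" + repository + "/" + workspace
                + "/query?offset=" + offset + "&limit=" + limit);
        return HttpRequest.newBuilder(uri)
                // The content type tells the service which query language is used.
                .header("Content-Type", "application/jcr+sql2")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = sql2Query("http://localhost:8090/resources", "repo", "default",
                "SELECT * FROM [nt:base]", 0, 10);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To execute the request, it would be passed to an HttpClient together with the Basic authentication header required by the service.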
3.5.2 REST Service 3.x

Represents the default version of the RESTful API distributed with ModeShape 3. It provides the following
methods:
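1. Retrieve a list of available repositories
2. Retrieve a list of workspaces for a repository
3. Retrieve a node or a property
4. Create a node
5. Update a node or a property
6. Delete a node or a property
7. Retrieve a node by its identifier
8. Update a node by its identifier
9. Delete a node by its identifier
10. Execute a JCR query
11. Create multiple nodes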
12. Update multiple items
13. Delete multiple items
14. Retrieve a node type
15. Import a CND file (via request content)
16. Import a CND file (via "multipart/form-data")
17. Retrieve a binary property
18. Create a binary property (via request content)
19. Update a binary property (via request content)
20. Create/Update a binary property (via "multipart/form-data")
21. Obtain a query plan for a JCR query
22. Reordering nodes
23. Moving nodes
1. Retrieve a list of available repositories
URL: http://<host>:<port>/<context>
HTTP Method: GET
Default Output: text/html
Response Format:
{
  "repositories":[
    {
      "name":"repo",
      "workspaces":"http://localhost:8090/resources/repo",
      "metadata":{
        "custom.rep.name":"repo",
        "custom.rep.workspace.names":"default",
        .....
      }
    }
  ]
}
2. Retrieve a list of workspaces for a repository
URL: http://<host>:<port>/<context>/<repository_name>
HTTP Method: GET
Response Format:
{
"workspaces":[
{
"name":"default",
"repository":"http://localhost:8090/resources/repo",
"items":"http://localhost:8090/resources/repo/default/items",
"query":"http://localhost:8090/resources/repo/default/query",
"binary":"http://localhost:8090/resources/repo/default/binary",
"nodeTypes":"http://localhost:8090/resources/repo/default/nodetypes"
}
]
}
3. Retrieve a node or a property
Retrieves an item at a given path.
URL: http://<host>:<port>/<context>/<repository_name>/<workspace_name>/items/<item_path>
HTTP Method: GET
Response Format:
{
"self":"http://localhost:8090/resources/repo/default/items/someNode",
"up":"http://localhost:8090/resources/repo/default/items/",
"id":"319a0554-3504-4984-b54b-3a9367caac92",
"jcr:primaryType":"{http://www.modeshape.org/1.0}root",
"jcr:uuid":"319a0554-3504-4984-b54b-3a9367caac92",
"children":{
"jcr:system":{
"self":"http://localhost:8090/resources/repo/default/items/jcr:system",
"id": "0a851519-e87d-4e02-b399-0503aa70ab3f"
}
}
}
4. Create a node
Creates a node at the given path, using the body of request as JSON content
URL: http://<host>:<port>/<context>/<repository_name>/<workspace_name>/items/<node_path>
HTTP Method: POST
Default Output: application/json
Request Content-Type: application/json
Response Code (if successful): CREATED
Request Format:
{
"children":{
"childNode":{
}
}
}
Response Format:
{
"self":"http://localhost:8090/resources/repo/default/items/testNode",
"id":"bf171df0-daa2-481d-a48a-b3965cd69d9c",
"jcr:primaryType":"{http://www.jcp.org/jcr/nt/1.0}unstructured",
"multiValuedProperty":[
"value1",
"value2"
],
"children":{
"childNode":{
"self":"http://localhost:8090/resources/repo/default/items/testNode/childNode",
"up":"http://localhost:8090/resources/repo/default/items/testNode",
"id":"113e6eea-cbd2-4837-8344-5b28bbfd695c",
}
}
}
Multiple properties with the same name

The different JSON specs that are out there have conflicting views on whether multiple keys with
the same name should be allowed or not. See this for more information.
ModeShape's REST service follows the ECMA JSON specification, which allows multiple keys with
the same name, the effective behavior being that the last key always wins.
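The "last key wins" behavior can be pictured as folding the key/value pairs of a JSON object into a map in document order, so that a later entry overwrites an earlier one with the same key. The following sketch illustrates only that semantics; it is not ModeShape's JSON parser, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of last-key-wins semantics for repeated keys. */
final class LastKeyWinsSketch {

    /** Folds alternating key/value arguments into a map; a repeated key overwrites the earlier value. */
    static Map<String, String> fold(String... keysAndValues) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keysAndValues.length; i += 2) {
            result.put(keysAndValues[i], keysAndValues[i + 1]);
        }
        return result;
    }

    public static void main(String[] args) {
        // "testProperty" appears twice; only the second value survives.
        System.out.println(fold("testProperty", "first", "testProperty", "second"));
    }
}
```

In practice this means a request body that repeats a property name is not rejected; only the value that appears last is applied.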
5. Update a node or a property
Updates a node or a property at the given path, using the body of request as JSON content
HTTP Method: PUT
Request Format:
Node: same as the one used when creating
Property:
Response Format:
Node: same as one used when creating
Property:

6. Delete a node or a property
Deletes the node or the property at the given path. If a node is being deleted, this will also delete all of its
descendants.
HTTP Method: DELETE
Produces: none
Response Code (if successful): NO_CONTENT
7. Retrieve a node by its identifier

Retrieves a node with a specified identifier. This is equivalent to the
Session.getNodeByIdentifier(String) method, where the identifier is obtained from the "id" field
(or the "jcr:uuid" field if the node is mix:referenceable) in a previous response. Remember that node
identifiers are generated by the repository, are opaque (and are not always UUIDs), and always remain the
same for a given node (even when moved or renamed) until the node is destroyed.
URL: http://<host>:<port>/<context>/<repository_name>/<workspace_name>/nodes/<node_id>
HTTP Method: GET
Response Format:
{
"self":"http://localhost:8090/resources/repo/default/items/someNode",
"id":"319a0554-3504-4984-b54b-3a9367caac92",
"jcr:primaryType":"{http://www.modeshape.org/1.0}root",
"jcr:uuid":"319a0554-3504-4984-b54b-3a9367caac92",
"children":{
"jcr:system":{
"self":"http://localhost:8090/resources/repo/default/items/jcr:system",
"id": "0a851519-e87d-4e02-b399-0503aa70ab3f"
}
}
}
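The URL template above can be turned into a request with the JDK's own HTTP client API. The following is a minimal sketch rather than part of ModeShape: the class name NodeByIdRequest and its build method are hypothetical, the base URL and identifier are taken from the sample response, and the identifier is assumed to contain only URL-safe characters.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class NodeByIdRequest {
    // Builds (but does not send) a GET request for
    // <base>/<repository_name>/<workspace_name>/nodes/<node_id>.
    // ASSUMPTION: nodeId needs no URL encoding.
    static HttpRequest build(String baseUrl, String repository, String workspace, String nodeId) {
        URI uri = URI.create(baseUrl + "/" + repository + "/" + workspace + "/nodes/" + nodeId);
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:8090/resources", "repo", "default",
                "319a0554-3504-4984-b54b-3a9367caac92");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send the request, pass it to a java.net.http.HttpClient together with a body handler such as HttpResponse.BodyHandlers.ofString(). The same request shape with PUT or DELETE corresponds to the update and delete operations on the same URL.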
8. Update a node by its identifier

Updates a node with the given identifier, using the body of the request as JSON content. The identifier must be
obtained from the "id" field in a previous response.
HTTP Method: PUT
Request Format:
Node: same as the one used when creating a node
Property:
Response Format:
Node: same as one used when creating a node
Property:
9. Delete a node by its identifier

Deletes the node with the given identifier, and all of its descendants. The identifier must be obtained from the
"id" field in a previous response.
HTTP Method: DELETE
Produces: none
Response Code (if successful): NO_CONTENT
10. Execute a JCR query

Executes a JCR query in XPath, SQL, or SQL2 format, returning a JSON object in the response.
URL: http://<host>:<port>/<context>/<repository_name>/<workspace_name>/query
HTTP Method: POST
Response Format:
{
"columns":{
"jcr:path":"STRING",
"jcr:score":"DOUBLE",
"foo":"STRING"
},
"rows":[
{
"jcr:path":"/{}testNode/{}child[2]",
"jcr:score":"0.8575897812843323",
"foo":"value",
"mode:uri":"http://localhost:8090/resources/repo/default/items/testNode/child[2]"
},
{
"jcr:path":"/{}testNode/{}child[3]",
"jcr:score":"0.8575897812843323",
"foo":"value",
"mode:uri":"http://localhost:8090/resources/repo/default/items/testNode/child[3]"
}
]
}
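A query request carries the query statement as the body of the POST. The sketch below builds such a request with the JDK HTTP client API; it is illustrative only. The class name QueryRequest is hypothetical, and the Content-Type value used to identify the query language is an assumption that is not stated in this section, so confirm it against your ModeShape version before relying on it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class QueryRequest {
    // ASSUMPTION: media type used to mark the body as a JCR-SQL2 statement.
    static final String SQL2_CONTENT_TYPE = "application/jcr+sql2";

    // Builds (but does not send) a POST to <base>/<repository_name>/<workspace_name>/query
    // whose body is the query statement itself.
    static HttpRequest build(String baseUrl, String repository, String workspace, String statement) {
        URI uri = URI.create(baseUrl + "/" + repository + "/" + workspace + "/query");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", SQL2_CONTENT_TYPE)
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(statement, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:8090/resources", "repo", "default",
                "SELECT * FROM [nt:unstructured] WHERE ISCHILDNODE('/testNode')");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The JSON response can then be read as shown above: "columns" maps each column name to its type, and each entry in "rows" holds one result row plus a "mode:uri" link to the node.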
11. Create multiple nodes

Creates multiple nodes (bulk operation) in the repository, using a single session. If any of the nodes cannot
be created, the entire operation fails.
URL: http://<host>:<port>/<context>/<repository_name>/<workspace_name>/items
HTTP Method: POST
Request Format:
{
"testNode/child/subChild" : {
},
"testNode/child" : {
},
"testNode/otherChild" : {
"children":{
"otherSubChild":{
}
}
}
}
Response Format:
[
{
"self":"http://localhost:8090/resources/repo/default/items/testNode/child",
"id":"0ef2edc9-c873-4a2f-805e-2950b98225c6",
"multiValuedProperty":[
"value1",
"value2"
]
},
{
"self":"http://localhost:8090/resources/repo/default/items/testNode/child/subChild",
"up":"http://localhost:8090/resources/repo/default/items/testNode/child",
"id":"fb6f4d82-33e1-4bc1-8048-d1f9a685779b",
"multiValuedProperty":[
"value1",
"value2"
]
},
{
"self":"http://localhost:8090/resources/repo/default/items/testNode/otherChild",
"id":"da12f5f9-4ab9-48d7-a159-07144e378d54",
"multiValuedProperty":[
"value1",
"value2"
],
"children":{
"otherSubChild":{
"self":"http://localhost:8090/resources/repo/default/items/testNode/otherChild/otherSubChild",
"up":"http://localhost:8090/resources/repo/default/items/testNode/otherChild",
"id":"21ea01f5-e41c-4aea-9087-e241e02a4b2d"
}
}
}
]
12. Update multiple items

Updates multiple nodes and/or properties (bulk operation) in the repository, using a single session. If any of
the items cannot be updated, the entire operation fails.
HTTP Method: PUT
Request Format: same as the one used when creating multiple nodes.
Response Format: same as the one used when creating multiple nodes.
13. Delete multiple items

Deletes multiple items (bulk operation) in the repository, using a single session. If any of the items cannot be
removed, the entire operation fails.
HTTP Method: DELETE
Produces: none
Request Format:
["testNode/otherChild", "testNode/child",
"testNode/child/subChild"]
14. Retrieve a node type

Retrieves the information about a registered node type in the repository.
URL: http://<host>:<port>/<context>/<repository_name>/<workspace_name>/nodetypes/<node_type_name>
HTTP Method: GET
Response Format:
{
"nt:base":{
"mixin":false,
"abstract":true,
"queryable":true,
"hasOrderableChildNodes":false,
"propertyDefinitions":[
{
"jcr:primaryType":{
"requiredType":"Name",
"declaringNodeTypeName":"nt:base",
"mandatory":true,
"multiple":false,
"autocreated":true,
"protected":true,
"fullTextSearchable":true,
"onParentVersion":"COMPUTE"
}
},
{
"jcr:mixinTypes":{
"mandatory":false,
"multiple":true,
"autocreated":false,
"protected":true
}
}
],
"subTypes":[
"http://localhost:8090/resources/repo/default/nodetypes/mode:lock",
"http://localhost:8090/resources/repo/default/nodetypes/mode:locks",
....
]
}
}
15. Import a CND file (via request content)

Imports a CND file into the Repository, using the entire request body stream as the content of the CND. If
you were using curl, this would be the equivalent of curl -d
URL: http://<host>:<port>/<context>/<repository_name>/<workspace_name>/nodetypes
HTTP Method: POST
Response Format:
[
{
"nt:base":{
"mixin":false,
"abstract":true,
"queryable":true,
"propertyDefinitions":[
{
"jcr:primaryType":{
"mandatory":true,
"multiple":false,
"autocreated":true,
"protected":true
}
},
{
"jcr:mixinTypes":{
"mandatory":false,
"multiple":true,
"protected":true
}
}
],
"subTypes":[
"http://localhost:8090/resources/repo/default/nodetypes/mode:lock",
...
]
}
},
{
"nt:unstructured":{
"mixin":false,
"abstract":false,
"queryable":true,
"hasOrderableChildNodes":true,
"propertyDefinitions":[
{
"*":{
"requiredType":"undefined",
"declaringNodeTypeName":"nt:unstructured",
"mandatory":false,
"multiple":true,
"protected":false,
"onParentVersion":"COPY"
}
},
{
"*":{
"requiredType":"undefined",
"declaringNodeTypeName":"nt:unstructured",
"mandatory":false,
"multiple":false,
"protected":false
}
}
],
"superTypes":[
"http://localhost:8090/resources/repo/default/nodetypes/nt:base"
]
}
},
{
"mix:created":{
"mixin":true,
"abstract":false,
"queryable":true,
"propertyDefinitions":[
{
"jcr:created":{
"requiredType":"Date",
"declaringNodeTypeName":"mix:created",
"mandatory":false,
"multiple":false,
"protected":true
}
},
{
"jcr:createdBy":{
"requiredType":"String",
"declaringNodeTypeName":"mix:created",
"mandatory":false,
"multiple":false,
"protected":true
}
}
],
"subTypes":[
"http://localhost:8090/resources/repo/default/nodetypes/nt:hierarchyNode"
]
}
}
]
16. Import a CND file (via "multipart/form-data")

Imports a CND file into the Repository when the CND file came from a form submission, where the name of
the HTML element is file. If you were using curl, this would be the equivalent of curl -F
URL: http://<host>:<port>/<context>/<repository_name>/<workspace_name>/nodetypes
HTTP Method: POST
Request Content-Type: multipart/form-data
Response Format: the same as when importing a CND via the request body.
17. Retrieve a binary property

Retrieves the content of a binary property from the repository, at a given path, by streaming it to the
response.
URL: http://<host>:<port>/<context>/<repository_name>/<workspace_name>/binary/<binary_property_path>
HTTP Method: GET
Produces: the mime-type of the binary, or a default mime-type
Optional parameters:
mimeType - a string that the client can provide when it already knows the expected
MIME type of the binary stream. Otherwise, ModeShape tries to detect the MIME type using its own
detector mechanism.
contentDisposition - a string that is returned as the Content-Disposition response header. If
none is provided, the default is: attachment;filename=property_parent_name
18. Create a binary property (via request content)

Creates a new binary property in the repository, at the given path, using the entire request body stream as
the content of the binary. If you were using curl, this would be the equivalent of curl -d
HTTP Method: POST
Response Format:
{
"testProperty":"http://localhost:8090/resources/repo/default/binary/testNode/testProperty",
"self":"http://localhost:8090/resources/repo/default/items/testNode/testProperty",
"up":"http://localhost:8090/resources/repo/default/items/testNode"
}
19. Update a binary property (via request content)

Updates the content of a binary property in the repository, at the given path, using the entire request body
stream as the content of the binary. If you were using curl, this would be the equivalent of curl -d
HTTP Method: POST, PUT
Response Format: the same as in the case when creating a new binary property
20. Create/Update a binary property (via "multipart/form-data")

Creates or updates the content of a binary property in the repository, at the given path, when the content
came from a form submission, where the name of the HTML element is file.
If you were using curl, this would be the equivalent of curl -F
HTTP Method: POST
Request Content-Type: multipart/form-data
Response Format: the same as in the case when creating a new binary property
21. Obtain a query plan for a JCR query

Obtains the query plan for an XPath, SQL, or SQL2 query, returning the string representation of the query
plan.
URL: http://<host>:<port>/<context>/<repository_name>/<workspace_name>/queryPlan
HTTP Method: POST
Response Format (as "application/json"):
{
"statement":"SELECT * FROM [nt:unstructured] WHERE ISCHILDNODE('\/testNode')",
"language":"JCR-SQL2",
"abstractQueryModel":"SELECT * FROM [nt:unstructured] WHERE
ISCHILDNODE([nt:unstructured],'\/testNode')",
"queryPlan": [
"Access [nt:unstructured]",
" Project [nt:unstructured] <PROJECT_COLUMNS=[[nt:unstructured].[jcr:primaryType] AS
[nt:unstructured.jcr:primaryType], [nt:unstructured].[jcr:mixinTypes] AS
[nt:unstructured.jcr:mixinTypes], [nt:unstructured].[jcr:path] AS [nt:unstructured.jcr:path],
[nt:unstructured].[jcr:name] AS [nt:unstructured.jcr:name], [nt:unstructured].[jcr:score] AS
[nt:unstructured.jcr:score], [nt:unstructured].[mode:localName] AS
[nt:unstructured.mode:localName], [nt:unstructured].[mode:depth] AS
[nt:unstructured.mode:depth]], PROJECT_COLUMN_TYPES=[STRING, STRING, STRING, STRING, DOUBLE,
STRING, LONG]>",
" Select [nt:unstructured] <SELECT_CRITERIA=ISCHILDNODE([nt:unstructured],'\/testNode')>",
" Select [nt:unstructured] <SELECT_CRITERIA=[nt:unstructured].[jcr:primaryType] =
'nt:unstructured'>",
" Source [nt:unstructured] <SOURCE_NAME=__ALLNODES__,
SOURCE_COLUMNS=[jcr:frozenUuid(STRING), mode:sharedUuid(REFERENCE), mode:sessionScope(BOOLEAN),
jcr:defaultValues(STRING), mode:projectedNodeKey(STRING), jcr:mixinTypes(STRING),
jcr:frozenPrimaryType(STRING), jcr:defaultPrimaryType(STRING), jcr:statement(STRING),
jcr:lastModifiedBy(STRING), jcr:mimeType(STRING), jcr:hasOrderableChildNodes(BOOLEAN),
jcr:etag(STRING), jcr:encoding(STRING), jcr:root(REFERENCE), jcr:supertypes(STRING),
jcr:successors(REFERENCE), jcr:primaryItemName(STRING), jcr:hold(STRING), jcr:workspace(STRING),
jcr:description(STRING), jcr:primaryType(STRING), mode:externalNodeKey(STRING),
mode:derivedFrom(STRING), mode:isHeldBySession(BOOLEAN), jcr:baseVersion(REFERENCE),
jcr:lastModified(DATE), jcr:mergeFailed(REFERENCE), mode:derivedAt(DATE),
jcr:requiredPrimaryTypes(STRING), jcr:multiple(BOOLEAN), mode:generated(BOOLEAN),
jcr:activityTitle(STRING), jcr:lifecyclePolicy(REFERENCE), jcr:isMixin(BOOLEAN),
jcr:availableQueryOperators(STRING), jcr:childVersionHistory(REFERENCE), jcr:content(REFERENCE),
jcr:autoCreated(BOOLEAN), mode:alias(STRING), jcr:createdBy(STRING),
jcr:isFullTextSearchable(BOOLEAN), jcr:uuid(STRING), jcr:onParentVersion(STRING),
mode:expirationDate(DATE), jcr:lockIsDeep(BOOLEAN), jcr:copiedFrom(REFERENCE),
jcr:isDeep(BOOLEAN), jcr:title(STRING), jcr:versionableUuid(STRING),
jcr:versionHistory(REFERENCE), jcr:isAbstract(BOOLEAN), jcr:predecessors(REFERENCE),
jcr:lockOwner(STRING), mode:sha1(STRING), jcr:repository(STRING), jcr:created(DATE),
jcr:frozenMixinTypes(STRING), mode:lockedKey(STRING), jcr:text(STRING), jcr:host(STRING),
jcr:configuration(REFERENCE), jcr:port(STRING), mode:workspace(STRING),
jcr:nodeTypeName(STRING), jcr:data(BINARY), jcr:isQueryable(BOOLEAN), jcr:language(STRING),
jcr:isQueryOrderable(BOOLEAN), jcr:mandatory(BOOLEAN), jcr:isCheckedOut(BOOLEAN),
jcr:protected(BOOLEAN), jcr:sameNameSiblings(BOOLEAN), jcr:requiredType(STRING),
jcr:protocol(STRING), mode:lockingSession(STRING), jcr:messageId(STRING), jcr:id(REFERENCE),
mode:uri(STRING), jcr:valueConstraints(STRING), jcr:retentionPolicy(REFERENCE),
jcr:activity(REFERENCE), jcr:currentLifecycleState(STRING), jcr:path(STRING), jcr:name(STRING),
jcr:score(DOUBLE), mode:localName(STRING), mode:depth(LONG)], SOURCE_ALIAS=nt:unstructured>"
]
}
Note that the JSON response contains several fields, including the original query statement, the language,
the abstract query model (or AQM, which is always equivalent to the JCR-SQL2 form of the query), and the
query plan (as an array of strings).
Response Format (as "text/plain"):
Access [nt:unstructured]
Project [nt:unstructured] <PROJECT_COLUMNS=[[nt:unstructured].[jcr:primaryType] AS
[nt:unstructured.jcr:primaryType], [nt:unstructured].[jcr:mixinTypes] AS
[nt:unstructured.jcr:mixinTypes], [nt:unstructured].[jcr:path] AS [nt:unstructured.jcr:path],
[nt:unstructured].[jcr:name] AS [nt:unstructured.jcr:name], [nt:unstructured].[jcr:score] AS
[nt:unstructured.jcr:score], [nt:unstructured].[mode:localName] AS
[nt:unstructured.mode:localName], [nt:unstructured].[mode:depth] AS
[nt:unstructured.mode:depth]], PROJECT_COLUMN_TYPES=[STRING, STRING, STRING, STRING, DOUBLE,
STRING, LONG]>
Select [nt:unstructured] <SELECT_CRITERIA=ISCHILDNODE([nt:unstructured],'/testNode')>
Select [nt:unstructured] <SELECT_CRITERIA=[nt:unstructured].[jcr:primaryType] =
'nt:unstructured'>
Source [nt:unstructured] <SOURCE_ALIAS=nt:unstructured, SOURCE_NAME=__ALLNODES__,
SOURCE_COLUMNS=[jcr:frozenUuid(STRING), mode:sharedUuid(REFERENCE), mode:sessionScope(BOOLEAN),
jcr:defaultValues(STRING), mode:projectedNodeKey(STRING), jcr:mixinTypes(STRING),
jcr:frozenPrimaryType(STRING), jcr:defaultPrimaryType(STRING), jcr:statement(STRING),
jcr:lastModifiedBy(STRING), jcr:mimeType(STRING), jcr:hasOrderableChildNodes(BOOLEAN),
jcr:etag(STRING), jcr:encoding(STRING), jcr:root(REFERENCE), jcr:supertypes(STRING),
jcr:successors(REFERENCE), jcr:primaryItemName(STRING), jcr:hold(STRING), jcr:workspace(STRING),
jcr:description(STRING), jcr:primaryType(STRING), mode:externalNodeKey(STRING),
mode:derivedFrom(STRING), mode:isHeldBySession(BOOLEAN), jcr:baseVersion(REFERENCE),
jcr:lastModified(DATE), jcr:mergeFailed(REFERENCE), mode:derivedAt(DATE),
jcr:requiredPrimaryTypes(STRING), jcr:multiple(BOOLEAN), mode:generated(BOOLEAN),
jcr:activityTitle(STRING), jcr:lifecyclePolicy(REFERENCE), jcr:isMixin(BOOLEAN),
jcr:availableQueryOperators(STRING), jcr:childVersionHistory(REFERENCE), jcr:content(REFERENCE),
jcr:autoCreated(BOOLEAN), mode:alias(STRING), jcr:createdBy(STRING),
jcr:isFullTextSearchable(BOOLEAN), jcr:uuid(STRING), jcr:onParentVersion(STRING),
mode:expirationDate(DATE), jcr:lockIsDeep(BOOLEAN), jcr:copiedFrom(REFERENCE),
jcr:isDeep(BOOLEAN), jcr:title(STRING), jcr:versionableUuid(STRING),
jcr:versionHistory(REFERENCE), jcr:isAbstract(BOOLEAN), jcr:predecessors(REFERENCE),
jcr:lockOwner(STRING), mode:sha1(STRING), jcr:repository(STRING), jcr:created(DATE),
jcr:frozenMixinTypes(STRING), mode:lockedKey(STRING), jcr:text(STRING), jcr:host(STRING),
jcr:configuration(REFERENCE), jcr:port(STRING), mode:workspace(STRING),
jcr:nodeTypeName(STRING), jcr:data(BINARY), jcr:isQueryable(BOOLEAN), jcr:language(STRING),
jcr:isQueryOrderable(BOOLEAN), jcr:mandatory(BOOLEAN), jcr:isCheckedOut(BOOLEAN),
jcr:protected(BOOLEAN), jcr:sameNameSiblings(BOOLEAN), jcr:requiredType(STRING),
jcr:protocol(STRING), mode:lockingSession(STRING), jcr:messageId(STRING), jcr:id(REFERENCE),
mode:uri(STRING), jcr:valueConstraints(STRING), jcr:retentionPolicy(REFERENCE),
jcr:activity(REFERENCE), jcr:currentLifecycleState(STRING), jcr:path(STRING), jcr:name(STRING),
jcr:score(DOUBLE), mode:localName(STRING), mode:depth(LONG)]>
The text response only contains the string representation of the query plan.
22. Reordering nodes

Assuming you create a parent node by POSTing the following request:
{
"children":{
"child1":{
"prop":"child1"
},
"child2":{
"prop":"child2"
},
"child3":{
"prop":"child3"
}
}
}
Then you can reorder its children by issuing a PUT request with the following format:
{
"children":{
"child3":{
},
"child2":{
},
"child1":{
}
}
}
23. Moving nodes

To move a node using the REST service, two steps are required:
1. Retrieve the node that should be moved and store its ID (the id member of the JSON response).
2. Edit the new parent node via a PUT request that contains the ID of the node to be moved:
{
"children":{
"child1":{
},
"child2":{
},
"child3":{
},
"41e666ff-0997-4ee0-9eb8-b41319f9f403": {
}
}
}
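The request body for a move has the same shape as the one used for reordering: a "children" object whose keys are child names, plus one key holding the identifier of the node being moved. The sketch below assembles that body as a string. It is only an illustration that mirrors the example above: the class ChildrenBody and its methods are hypothetical helpers, not part of ModeShape, and the string escaping covers only quotes and backslashes.

```java
import java.util.ArrayList;
import java.util.List;

class ChildrenBody {
    // Minimal JSON string quoting (quotes and backslashes only).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds {"children":{"<key1>":{},"<key2>":{},...}} preserving the given order.
    static String childrenBody(List<String> keys) {
        StringBuilder json = new StringBuilder("{\"children\":{");
        for (int i = 0; i < keys.size(); i++) {
            if (i > 0) {
                json.append(',');
            }
            json.append(quote(keys.get(i))).append(":{}");
        }
        return json.append("}}").toString();
    }

    // Body for step 2: the new parent's existing children followed by the moved node's ID.
    static String moveBody(List<String> existingChildren, String movedNodeId) {
        List<String> keys = new ArrayList<>(existingChildren);
        keys.add(movedNodeId);
        return childrenBody(keys);
    }

    public static void main(String[] args) {
        System.out.println(moveBody(List.of("child1", "child2", "child3"),
                "41e666ff-0997-4ee0-9eb8-b41319f9f403"));
    }
}
```

Calling childrenBody with only child names, in the desired order, produces the reordering body from the previous section.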
4 Query language grammars

ModeShape supports multiple query languages, including all four languages defined in the JCR 1.0
specification and JCR 2.0 specification. This section of the documentation covers each of ModeShape's
supported languages.
4.1 SQL2
The JCR-SQL2 query language is defined by the JCR 2.0 specification as a way to express queries using
strings that are similar to SQL. This query language is an improvement over the earlier JCR-SQL language,
providing among other things far richer specifications of joins and criteria.
4.1.1 Extensions to JCR-SQL2

ModeShape includes full support for the complete JCR-SQL2 query language defined by the specification.
However, ModeShape adds several extensions to make it even more powerful:
Support for the "FULL OUTER JOIN" and "CROSS JOIN" join types, in addition to the "LEFT OUTER
JOIN", "RIGHT OUTER JOIN" and "INNER JOIN" types defined by JCR-SQL2. Note that "JOIN" is a
shorthand for "INNER JOIN". For detail, see the grammar for joins.
Support for the UNION, INTERSECT, and EXCEPT set operations on multiple result sets to form a
single result set. As with standard SQL, the result sets being combined must have the same columns.
The UNION operator combines the rows from two result sets, the INTERSECT operator returns the
rows that are common to two result sets, and the EXCEPT operator returns the rows of the first
result set that do not appear in the second. Duplicate rows are removed unless the operator is
followed by the ALL keyword. For detail, see the grammar for set queries.
Removal of duplicate rows in the results, using "SELECT DISTINCT ..." expression. For detail, see
the grammar for queries.
Limiting the number of rows in the result set with the "LIMIT count" clause, where count is the
maximum number of rows that should be returned. This clause may optionally be followed by the
"OFFSET number" clause to specify the number of initial rows that should be skipped. For detail, see
the grammar for limits and offsets.
Additional dynamic operands "DEPTH(selectorName)" and "PATH(selectorName)" that enable
placing constraints on the node depth and path, respectively. These dynamic operands can be used
in a manner similar to "NAME(selectorName)" and "LOCALNAME(selectorName)" that are
defined by JCR-SQL2. Note in each of these cases, the selectorName is optional if there is only
one selector in the query. For detail, see the grammar for dynamic operands.
Additional dynamic operands "REFERENCE(selectorName.propertyName)" and
"REFERENCE(selectorName)" that enable placing constraints on one or any of the reference properties,
respectively, and which can be used in a manner similar to the standard dynamic operand
"PropertyValue(selectorName.propertyName)". Note in each of these cases, the
selectorName is optional if there is only one selector in the query, and that the propertyName can
be excluded if the constraint should apply to all reference properties. For detail, see the grammar for
dynamic operands.
Support for the "IN" and "NOT IN" clauses to more easily and concisely supply multiple discrete
static operands. For example, "WHERE ... [my:type].[prop1] IN (3,5,7,10,11,50) ...".
For detail, see the grammar for set constraints.
Support for the "BETWEEN" clause to more easily and concisely supply a range of discrete operands.
For example, "WHERE ... [my:type].[prop1] BETWEEN 3 EXCLUSIVE AND 10 ...". For
detail, see the grammar for between constraints.
Support for simple arithmetic in numeric-based criteria and order-by clauses. For example, "...
WHERE SCORE(type1) + SCORE(type2) > 1.0" or "... ORDER BY (SCORE(type1) *
SCORE(type2)) ASC, LENGTH(type2.property1) DESC". For detail, see the grammar for
order-by clauses.
Support for (non-correlated) subqueries in the WHERE clause, wherever a static operand can be used.
Subqueries can even be used within another subquery. All subqueries must return a single column,
and each row's single value will be treated as a literal value. If the subquery is used in a clause that
expects a single value (e.g., in a comparison), only the subquery's first row will be used. If the
subquery is used in a clause that allows multiple values (e.g., "IN (...)"), then all of the subquery's
rows will be used. For example, this expression "WHERE ... [my:type].[prop1] IN ( SELECT
[my:prop2] FROM [my:type2] WHERE [my:prop3] < '1000' ) AND ..." will use the
results of the subquery as the literal values in the IN clause. See the subqueries section for more
information.
Support for several pseudo-columns ("jcr:path", "jcr:score", "jcr:name", "mode:localName",
and "mode:depth") that can be used in the SELECT, equijoin, and WHERE clauses. These
pseudo-columns make it possible to return location-related and score information within the
QueryResult's rows. They also make queries look more like SQL, and thus may be more friendly
and easier to use in existing SQL-aware client applications. See the pseudo-columns section for more
information.
Support for NOT LIKE as an operator in comparison criteria, which is equivalent to wrapping a
LIKE comparison criteria in a NOT(...) clause.
Support for the RELIKE(...) operator, which tests patterns stored in property values against a supplied
string parameter.
Support for custom Locale instances.
4.1.2 Extended JCR-SQL2 Grammar

The rest of this section documents the full grammar for ModeShape's extended JCR-SQL2 support. Again,
this grammar is a strict superset of that defined by the JCR 2.0 specification. In other words, ModeShape will
support any JCR-SQL2 query that uses the standard grammar, but it will also support queries that make use
of the ModeShape extensions.
Queries
The top-level rule for ModeShape's extended JCR-SQL2 grammar is QueryCommand, which is either a
Query or a SetQuery:
QueryCommand ::= Query | SetQuery

SetQuery ::= Query ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query
             { ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query }

Query ::= 'SELECT' ['DISTINCT'] columns
          'FROM' Source
          ['WHERE' Constraint]
          ['ORDER BY' orderings]
          [Limit]
ModeShape adds the concept of a set query, which is a query that performs a union, intersection, or
complement of the results of two other queries. Set queries are common in SQL (which is essentially a set
manipulation language) and are a very useful tool that would otherwise require significant processing of the
results of multiple queries by the application. By supporting set queries, the application merely needs to
declare that set operation be performed, and ModeShape will perform all the work before returning the
results.
ModeShape also adds the ability to use "SELECT DISTINCT", which eliminates duplicate rows in a manner
similar to SQL.
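The effect of the set operators can be illustrated outside the query engine. The following is a minimal sketch, not ModeShape code: it treats each result row as a plain string and uses java.util collections to mimic UNION, UNION ALL, INTERSECT, and EXCEPT with duplicate removal. The class and method names are hypothetical, and the ALL variants of INTERSECT and EXCEPT are not covered.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetQuerySketch {
    // UNION: rows from both inputs, duplicates removed.
    static List<String> union(List<String> left, List<String> right) {
        Set<String> rows = new LinkedHashSet<>(left);
        rows.addAll(right);
        return new ArrayList<>(rows);
    }

    // UNION ALL: rows from both inputs, duplicates kept.
    static List<String> unionAll(List<String> left, List<String> right) {
        List<String> rows = new ArrayList<>(left);
        rows.addAll(right);
        return rows;
    }

    // INTERSECT: distinct rows that appear in both inputs.
    static List<String> intersect(List<String> left, List<String> right) {
        Set<String> rows = new LinkedHashSet<>(left);
        rows.retainAll(right);
        return new ArrayList<>(rows);
    }

    // EXCEPT: distinct rows of the left input that do not appear in the right input.
    static List<String> except(List<String> left, List<String> right) {
        Set<String> rows = new LinkedHashSet<>(left);
        rows.removeAll(right);
        return new ArrayList<>(rows);
    }

    public static void main(String[] args) {
        List<String> left = List.of("/a", "/b", "/b");
        List<String> right = List.of("/b", "/c");
        System.out.println("UNION:     " + union(left, right));
        System.out.println("UNION ALL: " + unionAll(left, right));
        System.out.println("INTERSECT: " + intersect(left, right));
        System.out.println("EXCEPT:    " + except(left, right));
    }
}
```

With set queries, ModeShape performs this kind of combination itself, so the application does not need code like this; the sketch only shows which rows each operator keeps.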
Source
A source is a named set of tuples, which in ModeShape corresponds to the nodes of a particular named
node type. In other words, a source is equivalent to a table in a relational database. The available columns
of a source are the named properties declared on the node type.
In the JCR-SQL2 grammar, a source is either a selector (a named node type) or a join specification:
Source ::= Selector | Join
Selector ::= nodeTypeName ['AS' selectorName]
nodeTypeName ::= Name
selectorName ::= /* A string that contains only SQL-legal characters,
                    and which can be used elsewhere in the query to
                    refer to the selector. */
See the Name rule below.
Joins
The JCR 2.0 specification includes joins in the standard JCR-SQL2 grammar, but defines only inner,
left outer, and right outer joins. Because SQL also defines the useful full outer
and cross join types, ModeShape adds support for these.
Join ::= left [JoinType] 'JOIN' right 'ON' JoinCondition
         /* If JoinType is omitted INNER is assumed. */

left ::= Source
right ::= Source

JoinType ::= Inner | LeftOuter | RightOuter | FullOuter | Cross
Inner ::= 'INNER' ['JOIN']
LeftOuter ::= 'LEFT JOIN' | 'OUTER JOIN' | 'LEFT OUTER JOIN'
RightOuter ::= 'RIGHT OUTER' ['JOIN']
FullOuter ::= 'FULL OUTER' ['JOIN']
Cross ::= 'CROSS' ['JOIN']

JoinCondition ::= EquiJoinCondition |
                  SameNodeJoinCondition |
                  ChildNodeJoinCondition |
                  DescendantNodeJoinCondition
Each of the four kinds of join conditions is described below.
Equi-join condition
An equijoin is a join that uses only equality comparisons in the join predicate (or join condition). Using any
other operators (e.g., '<' or '!=') in the join condition disqualifies a query from being an equi-join.
Therefore, the rules for the equi-join condition are as follows:
EquiJoinCondition ::= selector1Name'.'property1Name
                      '=' selector2Name'.'property2Name
selector1Name ::= selectorName
selector2Name ::= selectorName
property1Name ::= propertyName
property2Name ::= propertyName
propertyName ::= Name
where the node type referenced by the selector identified in the query with the selector1Name must
contain the property given by the property1Name literal, and similarly the node type referenced by the
selector identified in the query with the selector2Name must contain the property given by the
property2Name literal.
See also the name rule below.
Same-node join condition

An identity join is a special case of an equijoin, where the compared properties are node identifiers. Thus the
join condition of an identity join constrains the node on one side of the join to be the same node as the one on
the other side of the join. The standard JCR-SQL2 grammar defines a special function that makes this a little
easier to use:
SameNodeJoinCondition ::= 'ISSAMENODE(' selector1Name
                          ',' selector2Name
                          [',' selector2Path] ')'
selector1Name ::= selectorName
selector2Name ::= selectorName
selector2Path ::= Path
See also the path rule below.
Child-node join condition

A child-node join is one where the join condition constrains the node on the left side of the join to be a child
of the node on the right side of the join. The standard JCR-SQL2 grammar defines a special function that
makes it easier to specify such join conditions:
ChildNodeJoinCondition ::= 'ISCHILDNODE('
                           childSelectorName ','
                           parentSelectorName ')'
childSelectorName ::= selectorName
parentSelectorName ::= selectorName
Descendant-node join condition

A descendant-node join is one where the join condition constrains the node on the left side of the join to be a
descendant of the node on the right side of the join. The standard JCR-SQL2 grammar defines a special
function that makes it easier to specify such join conditions:
DescendantNodeJoinCondition ::= 'ISDESCENDANTNODE('
                                descendantSelectorName ','
                                ancestorSelectorName ')'
descendantSelectorName ::= selectorName
ancestorSelectorName ::= selectorName
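The relationships tested by the child-node and descendant-node join conditions can be pictured in terms of absolute node paths. The sketch below is an illustration only: ModeShape evaluates these conditions on nodes during query processing, whereas this hypothetical PathRelations class simply compares path strings such as "/testNode/child".

```java
class PathRelations {
    // Prefix that every descendant path of the given node must start with.
    private static String childPrefix(String path) {
        return path.equals("/") ? "/" : path + "/";
    }

    // True if 'path' lies anywhere below 'ancestorPath' (a node is not its own descendant).
    static boolean isDescendantNode(String path, String ancestorPath) {
        return !path.equals(ancestorPath) && path.startsWith(childPrefix(ancestorPath));
    }

    // True if 'path' is exactly one level below 'parentPath'.
    static boolean isChildNode(String path, String parentPath) {
        if (!isDescendantNode(path, parentPath)) {
            return false;
        }
        // A child has no further '/' after the parent's prefix.
        return path.indexOf('/', childPrefix(parentPath).length()) < 0;
    }

    public static void main(String[] args) {
        System.out.println(isChildNode("/testNode/child", "/testNode"));
        System.out.println(isDescendantNode("/testNode/child/subChild", "/testNode"));
        System.out.println(isChildNode("/testNode/child/subChild", "/testNode"));
    }
}
```

Every child is also a descendant, so a descendant-node join matches a superset of the rows matched by the corresponding child-node join.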
Constraints
The Query rule defined above includes a "WHERE" clause that can define multiple constraints on the nodes
included in the results. The standard JCR-SQL2 grammar defines several such constraints: and, or,
not, comparison, property existence, full-text search, same-node, child-node, and descendant-node
constraints. ModeShape supports all of these and adds others: the between and set constraints, as well
as the RELIKE (reverse like) constraint.
Constraint ::= ConstraintItem | '(' ConstraintItem ')'
ConstraintItem ::= And | Or | Not |
                   Comparison | Between |
                   PropertyExistence |
                   SetConstraint |
                   FullTextSearch |
                   SameNode |
                   ChildNode |
                   DescendantNode |
                   Relike
Each of these types of constraints is described below.
And constraint
An and constraint stipulates that a node (or record or tuple) is included only if two other constraints are both
true.
And ::= constraint1 'AND' constraint2
constraint1 ::= Constraint
constraint2 ::= Constraint
Or constraint
An or constraint stipulates that a node (or record or tuple) is included if either of two other constraints is
true.
Or ::= constraint1 'OR' constraint2
Not constraint
The not qualifier will negate another constraint, requiring that a node (or record or tuple) is included if the
other constraint is not true.
Not ::= 'NOT' constraint
constraint ::= Constraint
Comparison constraint
A comparison constraint requires that the value for a node described by the dynamic operand on the left side
of the operator is to be compared to a static literal value. The term "dynamic operand" is used in the
JCR-SQL2 grammar because its value can only be determined during query evaluation.
Comparison ::= DynamicOperand Operator StaticOperand
Operator ::= '=' | '!=' | '<' | '<=' | '>' | '>=' | 'LIKE' | 'NOT LIKE'
The DynamicOperand and StaticOperand rules are defined below.

The behavior of the operators is dictated by the JCR 2.0 specification and matches how Value objects are
compared:
If the DynamicOperand evaluates to null, the constraint is not satisfied.
If the '=" operator is used, the value that the DynamicOperand evaluates to must equal the
StaticOperand value for the constraint to be satisfied.
If the '!=" operator is used, the value that the DynamicOperand evaluates to must not equal the
If the '<" operator is used, the value that the DynamicOperand evaluates to must be less than the
If the '<=" operator is used, the value that the DynamicOperand evaluates to must be less than or
equal to the StaticOperand value for the constraint to be satisfied.
If the '>" operator is used, the value that the DynamicOperand evaluates to must be greater than the
If the '>=" operator is used, the value that the DynamicOperand evaluates to must be greater than or
equal to the StaticOperand value for the constraint to be satisfied.
If the "LIKE" operator is used, the constraint is only satisfied if the value that the DynamicOperand
evaluates to matches the pattern specified by the string literal StaticOperand, where in the pattern:
the character "%" matches zero or more characters, and
the character "_" (underscore) matches exactly one character, and
the string "\x" matches the character "x", and
all other characters match themselves
If the "NOT LIKE" operator is used, the constraint is only satisfied if the value that the
DynamicOperand evaluates to does not match the pattern specified by the string literal StaticOperand,
where in the pattern:
the character "%" matches zero or more characters, and
the character "_" (underscore) matches exactly one character, and
the string "\x" matches the character "x", and
all other characters match themselves
Also, note that, unlike SQL, the standard JCR-SQL2 grammar does not allow the left-hand and
right-hand sides of a comparison constraint to be swapped.
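The LIKE pattern rules listed above can be illustrated with a small, self-contained Java sketch that translates a LIKE pattern into a regular expression. This is not how ModeShape evaluates LIKE internally; the class and method names (LikePattern, toRegex, like) are hypothetical and exist only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: mimics the LIKE pattern rules
// described above using java.util.regex.
final class LikePattern {

    // Translates a LIKE pattern into an equivalent regular expression.
    static String toRegex(String like) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < like.length(); i++) {
            char c = like.charAt(i);
            if (c == '\\' && i + 1 < like.length()) {
                // "\x" matches the literal character x
                regex.append(Pattern.quote(String.valueOf(like.charAt(++i))));
            } else if (c == '%') {
                regex.append(".*"); // zero or more characters
            } else if (c == '_') {
                regex.append('.'); // exactly one character
            } else {
                // all other characters match themselves
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    // Returns true if the value matches the LIKE pattern; a null value never matches.
    static boolean like(String value, String pattern) {
        return value != null
            && Pattern.compile(toRegex(pattern), Pattern.DOTALL).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(like("report.txt", "%.txt"));
        System.out.println(like("abc", "a\\_c"));
    }
}
```

A "NOT LIKE" check under the same assumptions would be the negation of like(...) for a non-null value.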
Between constraint
The between constraint is one of the extensions defined by ModeShape, and allows a query to more easily
represent a range of static values than using only the constraints available in the standard JCR-SQL2
grammar. The between constraint is based on the similar expression in SQL.
Between ::= DynamicOperand ['NOT'] 'BETWEEN'
lowerBound ['EXCLUSIVE'] 'AND'
upperBound ['EXCLUSIVE']
lowerBound ::= StaticOperand
upperBound ::= StaticOperand
Again, the DynamicOperand and StaticOperand rules are defined below.
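To see what the between constraint saves, the following Java sketch expands a BETWEEN expression into the pair of standard comparison constraints it replaces. It assumes that bounds are inclusive unless marked EXCLUSIVE, which is the usual SQL-style reading; the class and method names (BetweenExpansion, expand) and the property name in the example are hypothetical.

```java
// Hypothetical helper, for illustration only: rewrites a BETWEEN constraint
// as two standard JCR-SQL2 comparison constraints joined with AND.
// Assumption: a bound is inclusive unless it is marked EXCLUSIVE.
final class BetweenExpansion {

    static String expand(String operand, String lower, boolean lowerExclusive,
                         String upper, boolean upperExclusive) {
        String lowerOp = lowerExclusive ? ">" : ">=";
        String upperOp = upperExclusive ? "<" : "<=";
        return operand + " " + lowerOp + " " + lower
            + " AND " + operand + " " + upperOp + " " + upper;
    }

    public static void main(String[] args) {
        // Corresponds to: p.[acme:price] BETWEEN 10 AND 20 EXCLUSIVE
        System.out.println(expand("p.[acme:price]", "10", false, "20", true));
    }
}
```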
Property existence constraint

A property existence constraint stipulates that a property exists on a node that is of the node
type specified by the named selector. ModeShape also allows the "NOT" qualifier to be omitted, which turns
the constraint into a stipulation that the property does not exist on the node.
PropertyExistence ::= [selectorName'.']propertyName 'IS' ['NOT'] 'NULL'

/* If only one selector exists in this query,
explicit specification of the selectorName
preceding the propertyName is optional */
Set constraint
Like the between constraint, the set constraint is a ModeShape extension to the standard JCR-SQL2
grammar that allows what would normally be a complicated combination of standard JCR-SQL2 constraints
to be more easily represented with a single, simple expression. Again, this constraint is patterned after the
similar expression in SQL.
SetConstraint ::= [selectorName '.']propertyName ['NOT'] 'IN'

'(' firstStaticOperand
{',' additionalStaticOperand }
')'
firstStaticOperand ::= StaticOperand
additionalStaticOperand ::= StaticOperand
Note that multiple static operands can be included in the comma-separated list. The StaticOperand rule is
defined below.
Although this rule seems complicated, it's actually very straightforward. The following query selects all the
properties defined on the "acme:taggable" node type, returning only those "taggable" nodes with an
"acme:tagName" value of "tag1", "tag2", "tag3", or "tag4":
SELECT * FROM [acme:taggable] as tagged

WHERE tagged.[acme:tagName] IN ('tag1','tag2','tag3','tag4')
Even this trivial query is quite a bit simpler and easier to understand than if the query had used only the
constraints defined by the standard JCR-SQL2 grammar:
SELECT * FROM [acme:taggable] AS tagged
WHERE tagged.[acme:tagName] = 'tag1'
OR tagged.[acme:tagName] = 'tag2'
OR tagged.[acme:tagName] = 'tag3'
OR tagged.[acme:tagName] = 'tag4'
Imagine how complicated a query might be with multiple joins, multiple criteria, and many values to be
compared for one or several different properties.
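The equivalence between the set constraint and a chain of standard comparisons can be sketched in Java as follows. The helper below is hypothetical (SetConstraintExpansion and expandIn are illustrative names, not ModeShape API) and assumes the values contain no quote characters.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only: expands "operand IN (v1, v2, ...)"
// into the equivalent chain of '=' comparisons joined with OR.
// Assumption: the values contain no quote characters, so no escaping is done.
final class SetConstraintExpansion {

    static String expandIn(String operand, List<String> values) {
        return values.stream()
            .map(v -> operand + " = '" + v + "'")
            .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        System.out.println(
            expandIn("tagged.[acme:tagName]", Arrays.asList("tag1", "tag2", "tag3", "tag4")));
    }
}
```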
Full-text search constraint

FullTextSearch ::= 'CONTAINS('
([selectorName'.']propertyName | selectorName'.*')
',' ''' fullTextSearchExpression''' ')'
fullTextSearchExpression ::= FulltextSearch
The full-text search expression is a string literal that adheres to the full-text search grammar described
below.
An example query selects all the properties defined on the "acme:taggable" node type, returning only
those "taggable" nodes with an "acme:tagName" value that contains the term "foo":
SELECT * FROM [acme:taggable] AS tagged
WHERE CONTAINS(tagged.[acme:tagName],'foo')
Same-node constraint
The same-node constraint stipulates that the node appearing in the selector with the given name has a path
that matches the literal path provided.
SameNode ::= 'ISSAMENODE(' [selectorName ','] Path ')'

where the Path rule is defined below.

Because this standard constraint clause does not resemble traditional SQL, ModeShape defines a 'jcr:path'
pseudo-column that can be used in comparison constraints and that allows for using other comparison
operators, including "LIKE".
Child-node constraint
The child-node constraint stipulates that the node appearing in the selector with the given name is a child of
a node with a path that matches the literal path provided.
ChildNode ::= 'ISCHILDNODE(' [selectorName ','] Path ')'

See also ModeShape's 'jcr:path' pseudo-column that can be used in comparison constraints and that
allows for using other comparison operators, including "LIKE". And because the right hand side (i.e., static
operand) of a "LIKE" expression can involve wildcards, it may be easier and more understandable to use
the pseudo-column.
Descendant-node constraint
The descendant-node constraint stipulates that the node appearing in the selector with the given name is a
descendant of a node with a path that matches the literal path provided.
DescendantNode ::= 'ISDESCENDANTNODE(' [selectorName ','] Path ')'

See also ModeShape's 'jcr:path' pseudo-column that can be used in comparison constraints and that
allows for using other comparison operators, including "LIKE". And because the right hand side (i.e., static
operand) of a "LIKE" expression can involve wildcards, it may be easier and more understandable to use
the pseudo-column.
Reverse Like constraint
The standard JCR-SQL2 LIKE operator takes as input a fixed pattern and then attempts to find nodes whose
value for the given property matches the pattern. However, sometimes the property values themselves
store the patterns, and you want to find all properties whose patterns match a given fixed string parameter. The
RELIKE (or "reverse like") function makes this possible. To distinguish this function from the standard LIKE,
the operands are reversed so that the pattern is the second parameter of RELIKE.
For example, if the "my:namePattern" property on a node of type "my:country" contains a
pattern that matches country names (e.g., "%Monaco%"), then we can find all "my:country" nodes that have a
name pattern that matches "Principality of Monaco":
SELECT *
FROM [my:country] AS country
WHERE RELIKE("Principality of Monaco", country.[my:namePattern]);
Path and name

Many of the rules above have used paths and names, and the rules for these are defined as follows:
Name ::= '[' quotedName ']' | '[' simpleName ']' | simpleName

quotedName ::= /* A JCR Name (see the JCR specification) */
simpleName ::= /* A JCR Name that contains only SQL-legal
characters (namely letters, digits, and underscore) */
Path ::= '[' quotedPath ']' | '[' simplePath ']' | simplePath
quotedPath ::= /* A JCR Path that contains non-SQL-legal characters */
simplePath ::= /* A JCR Path (rather Name) that contains only SQL-legal
characters (namely letters, digits, and underscore) */
Note that JCR-SQL2 surrounds identifiers with square brackets (i.e., '[' and ']'), allowing names to contain
the ':' character needed for namespaced names. If the names or paths contain only valid SQL characters,
then they do not need to be quoted.
Static operand
In the standard JCR-SQL2 grammar, a static operand appears on the right-hand side of an operator, and
represents an expression whose value can be determined by static analysis of the query (e.g., when the
query is parsed). In particular, a static operand in the standard JCR-SQL2 grammar is either a
literal value or a variable.
In SQL, however, the expression that appears on the right-hand side of an operator is not always able to be
determined at query parse time. An example is a subquery, which appears on the right hand side but
obviously can only be evaluated into values during query execution time. Since standard JCR-SQL2 does
not include any such features, the term "static operand" is technically valid.
In addition to literal values and variables, ModeShape also supports subqueries appearing on the right-hand
side of an operator. So this grammar continues to use the "static operand" term for easy comparison with the
standard JCR-SQL2 grammar, but the term has a different (and expanded) semantic than in the standard
grammar.
Therefore, the rules for what ModeShape allows on the right-hand side of an operator in a constraint are as
follows:
StaticOperand ::= Literal | BindVariableValue | Subquery

Literal ::= CastLiteral | UncastLiteral
CastLiteral ::= 'CAST(' UncastLiteral ' AS ' PropertyType ')'
PropertyType ::= 'STRING' | 'BINARY' | 'DATE' |
'LONG' | 'DOUBLE' | 'DECIMAL' |
'BOOLEAN' | 'NAME' | 'PATH' |
'REFERENCE' | 'WEAKREFERENCE' |
'URI'
UncastLiteral ::= UnquotedLiteral |
''' UnquotedLiteral ''' |
'"' UnquotedLiteral '"'
UnquotedLiteral ::= /* String form of a JCR Value,
as defined in the JCR specification */
where the grammar rules for BindVariableValue and Subquery are defined below.
Bind variable
The standard JCR-SQL2 grammar supports using variable names within a query, where the values for those
variables are bound to the Query object before execution. In the query, the variable names are prefixed with
a '$' character and are otherwise normal JCR names:
BindVariableValue ::= '$'bindVariableName

bindVariableName ::= /* A string that conforms to the JCR Name syntax,
though the prefix does not need to be a
registered namespace prefix. */
So, consider this simple query that selects all the properties defined on the "acme:taggable" node type,
and that returns only those "taggable" nodes with an "acme:tagName" value that matches the value of the
"tagValue" variable:
SELECT * FROM [acme:taggable] AS tagged
WHERE tagged.[acme:tagName] = $tagValue
This query could be evaluated using the JCR API as follows:
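javax.jcr.Session session = // ...
javax.jcr.query.QueryManager mgr = session.getWorkspace().getQueryManager();
String sql = "SELECT * FROM [acme:taggable] AS tagged WHERE tagged.[acme:tagName] = $tagValue";
javax.jcr.query.Query query = mgr.createQuery(sql, javax.jcr.query.Query.JCR_SQL2);
// Bind a value to the variable ...
javax.jcr.Value tag = session.getValueFactory().createValue("foo");
query.bindValue("tagValue", tag);
// Execute only after every variable has a bound value ...
javax.jcr.query.QueryResult result = query.execute();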
Obviously multiple variables can be used in a query expression, but a value must be bound to every variable
before the Query object can be executed.
Subquery
The standard JCR-SQL2 grammar does not support subqueries. But subqueries are such a useful feature
that ModeShape supports using multiple subqueries within a single query. In fact, a subquery is nothing
more than a QueryCommand, which, if you'll remember, is the top-level rule in ModeShape's grammar. That
means that subqueries can be any query, and you can even include subqueries within a subquery!
Subquery ::= '(' QueryCommand ')' |
QueryCommand
Strictly speaking, ModeShape only supports non-correlated subqueries, which means that they can actually
be evaluated independently (outside the context of the containing query).
Additionally, because subqueries appear on the right-hand side of an operator, all subqueries must return a
single column, and each row's single value will be treated as a literal value. If the subquery is used in a
clause that expects a single value (e.g., in a comparison), only the subquery's first row will be used. If the
subquery is used in a clause that allows multiple values (e.g., "IN (...)"), then all of the subquery's rows
will be used.
For example, in the following query fragment, the first value in each row of the subquery's results will be
used within the IN clause of the outer query:
WHERE ... [my:type].[prop1] IN

( SELECT [my:prop2] FROM [my:type2]
WHERE [my:prop3] < '1000' )
AND ...
However, changing the IN clause to a comparison results in only the first value in the first row of the
subquery's results being used in the comparison criteria:
WHERE ... [my:type].[prop1] =

( SELECT [my:prop2] FROM [my:type2]
WHERE [my:prop3] < '1000' )
AND ...
Dynamic operand
In various constraints described above, the dynamic operand appears on the left-hand side of an operator,
and signifies that the values can only be determined when the query is evaluated.
The standard JCR-SQL2 grammar defines seven kinds of dynamic operands: property value, length, node
name, node local name, full-text search score, lowercase, and uppercase.
ModeShape supports all these types, but adds support for four more: reference value, node path, node
depth, and simple arithmetic clauses. ModeShape also allows the dynamic operand to be surrounded by
parentheses, which is sometimes convenient for complex queries.
The DynamicOperand rule in ModeShape's extended grammar is:
DynamicOperand ::= PropertyValue | ReferenceValue |

Length | NodeName | NodeLocalName |
NodePath | NodeDepth |
FullTextSearchScore |
LowerCase | UpperCase |
Arithmetic |
'(' DynamicOperand ')'
Each of these types of dynamic operands is described in the following subsections.
Property value operand

The property value operand always evaluates to the value(s) of the specified property on the selector.
PropertyValue ::= [selectorName'.'] propertyName
/* If only one selector exists in this query,
explicit specification of the selectorName
preceding the propertyName is optional. */
Note that if the property is multi-valued, the constraint will be satisfied if any of the property values
satisfies the constraint. For example, if the 'acme:tagNames' property is a multi-valued property declared on the
"acme:taggable" node type, then the following query will find all "acme:taggable" nodes that have "foo"
for at least one of the values of the 'acme:tagNames' property:
SELECT * FROM [acme:taggable] AS tagged
WHERE tagged.[acme:tagNames] = 'foo'
Reference value operand

One of ModeShape's extensions is to support a "REFERENCE(...)" dynamic operand, which enables
placing constraints on one or any of the reference properties.
ReferenceValue ::= 'REFERENCE(' selectorName '.' propertyName ')' |
'REFERENCE(' selectorName ')' |
'REFERENCE()'
/* If only one selector exists in this query,
explicit specification of the selectorName
preceding the propertyName is optional.
Also, the property name may be excluded
if the constraint should apply to any
reference property. */
The "REFERENCE" operand always evaluates to the identifiers of the referenced nodes in one or all of the
REFERENCE properties. Thus, all of the REFERENCE operands should be used with a StaticOperand
that also evaluates to identifiers.
The "REFERENCE()" operand (with no selector name and no property name) evaluates to the identifiers of
the nodes referenced by all of the reference properties on the node in the only selector. The
"REFERENCE(selectorName)" operand works the same way, but must be used if there is more than one selector in the query.
Finally, the "REFERENCE(selectorName.propertyName)" operand evaluates to the identifiers of nodes
referenced by the "propertyName" reference property on the nodes in the named selector.
For example, here is a query that finds all nodes that reference a set of nodes for which we already know the
identifiers, "id1", "id2", and "id3".
SELECT * FROM [nt:base]

WHERE REFERENCE() IN ('id1','id2','id3')
This operand works really well with subqueries or variables for the right-hand side. For example, here is a
query that finds all nodes that reference any of the nodes in the subgraph below the "/foo/bar/baz" node,
where a subquery is used to find all nodes in the subgraph:
SELECT * FROM [nt:base]
WHERE REFERENCE() IN (
SELECT [jcr:uuid] FROM [nt:base] AS refd
WHERE ISDESCENDANTNODE(refd,'/foo/bar/baz')
)
This kind of query is impossible to do using standard JCR-SQL2 features, and shows some of the power of
ModeShape's extensions to JCR-SQL2.
Length operand
The length operand evaluates to the length (or lengths, if multi-valued) of a property. The length is defined to
be:
for a BINARY value, the number of bytes in the value, or
for all other value types, the number of characters of the string resulting from a conversion of the
value to a string.
The rule for the length operand is:
Length ::= 'LENGTH(' PropertyValue ')'
where the PropertyValue rule is defined above.
Node name operand

The node name operand always evaluates to the prefixed name of the node given by the supplied selector:
NodeName ::= 'NAME(' [selectorName] ')'
/* If only one selector exists in this query,
explicit specification of the selectorName
is optional */
See also the 'jcr:name' pseudo-column, which enables accessing the JCR name of any node as if the
name were a regular property on any node.
Node local name operand

The node local name operand always evaluates to the local name of the node given by the supplied selector:
NodeLocalName ::= 'LOCALNAME(' [selectorName] ')'
/* If only one selector exists in this query,
explicit specification of the selectorName
is optional */
See also the 'mode:localName' pseudo-column, which enables accessing the local name of any node as if
the local name were a regular property.
Node depth operand

The node depth operand is a ModeShape-specific extension to the standard set of dynamic operands, and
evaluates to the integer depth of the node given by the supplied selector. The depth of a node is defined to
be the number of segments in the node's path. For example, the depth of the root node is 0, whereas the
depth of the node at "/foo/bar/baz" is 3.
NodeDepth ::= 'DEPTH(' [selectorName] ')'
/* If only one selector exists in this query,
explicit specification of the selectorName
is optional */
See also the 'mode:depth' pseudo-column, which enables accessing the depth of any node as if the depth
were a regular property.
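The depth definition above (the number of segments in the node's path) can be sketched in plain Java. This is only an illustration of the definition using string paths; the class and method names (NodeDepth, depth) are hypothetical and not part of ModeShape.

```java
// Hypothetical helper, for illustration only: computes the depth of a node
// from its absolute path by counting the path segments.
final class NodeDepth {

    static int depth(String absolutePath) {
        int depth = 0;
        for (String segment : absolutePath.split("/")) {
            if (!segment.isEmpty()) {
                depth++; // each non-empty segment adds one level
            }
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(depth("/"));
        System.out.println(depth("/foo/bar/baz"));
    }
}
```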
Node path operand

The node path operand is a ModeShape-specific extension to the standard set of dynamic operands, and
evaluates to the path of the node given by the supplied selector.
NodePath ::= 'PATH(' [selectorName] ')'
/* If only one selector exists in this query,
explicit specification of the selectorName
is optional */
See also the 'jcr:path' pseudo-column, which enables accessing the path of any node as if the path were
a regular property.
Full text search score operand

The full-text search score operand evaluates to a DOUBLE value equal to the full-text search score of a node.
The full-text search score ranks a selector's nodes by their relevance to the 'fullTextSearchExpression'
specified in a full-text search constraint. The magnitude of the scores is
implementation-specific, but most implementations will produce higher scores for more relevant matches
and lower scores for less relevant matches.
FullTextSearchScore ::= 'SCORE(' [selectorName] ')'
/* If only one selector exists in this query,
explicit specification of the selectorName
is optional */
See also the 'jcr:score' pseudo-column, which enables accessing the score of any node as if the score
were a regular property.
Lowercase operand
The lowercase operand evaluates to the lower-case string value (or values, if multi-valued) of the operand. If
the operand does not evaluate to a string value, its value is first converted to a string.
LowerCase ::= 'LOWER(' DynamicOperand ')'
Uppercase operand
The uppercase operand evaluates to the upper-case string value (or values, if multi-valued) of the operand.
If the operand does not evaluate to a string value, its value is first converted to a string.
UpperCase ::= 'UPPER(' DynamicOperand ')'
Arithmetic operand
The arithmetic operand is a ModeShape-specific extension to the standard JCR-SQL2 grammar. It allows
two other dynamic operands that evaluate to numeric values to be numerically combined using addition,
subtraction, multiplication, or division.
Arithmetic ::= DynamicOperand ('+'|'-'|'*'|'/') DynamicOperand
For example, the following query restricts the results such that the sum of the score of nodes originating from
separate selectors is greater than 1.0:
SELECT * FROM [acme:type1] AS type1

JOIN [acme:type2] as type2 ON type1.prop1 < type2.prop2
WHERE SCORE(type1) + SCORE(type2) > 1.0
Although an arithmetic operand can be used in the WHERE clause, it is more likely to be used in ORDER BY clauses. For
example, the following query orders the results based upon the difference in the scores of nodes in the two
selectors:
SELECT * FROM [acme:type1] AS type1
JOIN [acme:type2] AS type2 ON type1.prop1 < type2.prop2
ORDER BY ( SCORE(type1) - SCORE(type2) ) ASC,
LENGTH(type2.prop3) DESC
Ordering
The "ORDER BY" clause defined by the standard JCR-SQL2 grammar allows the order of the results to be
dictated by the values evaluated at execution time based upon one or more dynamic operands. The rule for
the expression is as follows:
orderings ::= Ordering {',' Ordering}

Ordering ::= DynamicOperand [Order]
Order ::= 'ASC' | 'DESC'
As with SQL, the "ASC" qualifier specifies that the ordering should be in ascending order, and is the default;
likewise, the "DESC" qualifier specifies that the ordering should be in descending order.
Columns
The standard JCR-SQL2 grammar allows a query to include in the "SELECT" clause which property values
should be returned and included in the results:
columns ::= (Column {',' Column}) | '*'
Column ::= ([selectorName'.']propertyName ['AS' columnName]) |
(selectorName'.*')
/* If only one selector exists in this query,
explicit specification of the selectorName
is optional */
selectorName ::= Name
propertyName ::= Name
columnName ::= Name
When "*" is used for the list of selected columns, the result set is expected to minimally include, for each
selector, a column for each single-valued non-residual property of the selector's node type, including those
explicitly declared on the node type and those inherited from the node's supertypes.
For example, the result set for the query "SELECT * FROM [nt:base]" would contain at least the '[jcr:primaryType]'
column, since it is the only single-valued, non-residual property defined on the '[nt:base]' node type. The
'[jcr:mixinTypes]' property is also non-residual, but the results need not include it since it's multi-valued.
If there are multiple selectors, then "SELECT *" will include all of the selectable columns from each
selector's node type. However, it's possible to request all of the selectable columns from only some of the
selectors, using the "selectorName.*" form. For example:
SELECT type1.*
FROM [acme:type1] AS type1
Note, however, that although only single-valued, non-residual properties are included when "*" is used in the
SELECT clause, it is possible to explicitly include residual properties. For example, the following query finds
all nodes that have at least one "foo" value for the 'acme:tagNames' property:
SELECT [acme:tagNames] AS tagName

FROM [nt:base] WHERE tagName = 'foo'
Limit and offset

Neither the standard JCR-SQL2 grammar nor the JCR API itself provides support for limiting the rows that are
returned in the results. This is a common need, especially for applications that paginate the results, where
each page shows a subset of the results.
Because this is such an essential feature that can't be accomplished any other way, ModeShape adds
support for specifying the maximum number of rows to return, and optionally specifying the number of initial
rows that should be skipped. The ModeShape extension follows the SQL syntax:
Limit ::= 'LIMIT' count [ 'OFFSET' offset ]

count ::= /* Positive integer value */
offset ::= /* Non-negative integer value */
The LIMIT clause is entirely optional, and if absent does not limit the result set rows in any way. However, if
the "LIMIT count" clause is used, then the result set will contain no more than count rows. This LIMIT
clause may optionally be followed by the "OFFSET number" clause, where number is the number of initial
rows that should be skipped before the rows are included in the results.
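For the pagination use case mentioned above, the LIMIT and OFFSET values for a given page can be derived as in the following Java sketch. The helper is hypothetical (Pagination and page are illustrative names) and simply appends the clause to an existing query string; it assumes zero-based page indexes.

```java
// Hypothetical helper, for illustration only: appends a LIMIT/OFFSET clause
// that selects one page of results. Page indexes are assumed to be zero-based.
final class Pagination {

    static String page(String query, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        // Number of initial rows to skip before the requested page starts
        long offset = (long) pageIndex * pageSize;
        return query + " LIMIT " + pageSize + " OFFSET " + offset;
    }

    public static void main(String[] args) {
        System.out.println(page("SELECT * FROM [nt:file]", 2, 25));
    }
}
```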
Pseudo-columns
The design of the JCR-SQL2 query language makes fairly heavy use of functions, including SCORE(),
NAME(), LOCALNAME(), and various constraints. ModeShape adds several more useful functions, such as
PATH() and DEPTH(), that follow the same patterns.
However, these functions have several disadvantages. First, they make the JCR-SQL2 language less
"SQL-like", since SQL-92 and -99 don't define similar kinds of functions. (There are aggregate functions, like
COUNT, SUM, etc., but they operate on a particular column in all tuples and are therefore more dissimilar than
similar.) This means that applications that use SQL and SQL-like query languages are less likely to be able
to build and issue JCR-SQL2 queries.
A second disadvantage of these functions is that JCR-SQL2 does not allow them to be used within the
SELECT clause. As a result, the location-related and score information cannot be included as columns of
values in the QueryResult rows. Instead, a client can only access this information by obtaining the Node
object(s) for each row. Relying upon both the result set and additional Java objects makes it difficult to use
the JCR query system. It also makes certain kinds of applications impossible.
For example, ModeShape's JDBC driver is designed to enable JDBC-aware applications to query repository
content using JCR-SQL2 queries. The standard JDBC API cannot expose the Node objects, so the only way
to return the path-related and score information is through additional columns in the result. While such
columns could always "magically" appear in the result set, doing this is not compatible with JDBC
applications that dynamically build the SELECT clauses of queries based upon database metadata. Such
applications require the columns to be properly described in database metadata, and the columns need to
be used within queries.
ModeShape attempts to solve these issues by directly supporting a number of "pseudo-columns" within
JCR-SQL2 queries, wherever columns can be used. These "pseudo-columns" include:
jcr:score is a column of type DOUBLE that represents the full-text search score of the node, which
is a measure of the node's relevance to the full-text search expression. ModeShape does compute
the scores for all queries, though the score for rows in queries that do not include a full-text search
criteria may not be reliable.
jcr:path is a column of type PATH that represents the normalized path of a node, including
same-name siblings. This is the same as what would be returned by the getPath() method of Node.
Examples of paths include "/jcr:system" and "/foo/bar[3]".
jcr:name is a column of type NAME that represents the node name in its namespace-qualified form
using namespace prefixes and excluding same-name-sibling indexes. Examples of node names
include "jcr:system", "jcr:content", "ex:UserData", and "bar".
mode:localName is a column of type STRING that represents the local name of the node, which
excludes the namespace prefix and same-name-sibling index. As an example, the local name of the
"jcr:system" node is "system", while the local name of the "ex:UserData[3]" node is "UserData".
mode:depth is a column of type LONG that represents the depth of a node, which corresponds
exactly to the number of path segments within the path. For example, the depth of the root node is 0,
whereas the depth of the "/jcr:system/jcr:nodeTypes" node is 2.
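The relationship between the jcr:path, jcr:name, and mode:localName values described above can be sketched with plain string handling in Java. This is only an illustration for paths in prefixed form; real code would ask the Node for its name, and the class and method names here (NodeNames, name, localName) are hypothetical.

```java
// Hypothetical helper, for illustration only: derives the jcr:name and
// mode:localName values from a path in prefixed form.
final class NodeNames {

    // The node name: the last path segment without its same-name-sibling index.
    static String name(String path) {
        String segment = path.substring(path.lastIndexOf('/') + 1);
        int bracket = segment.indexOf('[');
        return bracket < 0 ? segment : segment.substring(0, bracket);
    }

    // The local name: the node name without its namespace prefix.
    static String localName(String path) {
        String name = name(path);
        return name.substring(name.indexOf(':') + 1);
    }

    public static void main(String[] args) {
        System.out.println(name("/foo/ex:UserData[3]"));
        System.out.println(localName("/foo/ex:UserData[3]"));
    }
}
```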
All of these pseudo-columns can be used in the SELECT clause of any JCR-SQL2 query, and their use
defines whether such columns appear in the result set. In fact, all of these pseudo-columns will be included
when "SELECT *" clauses in JCR-SQL2 queries are expanded by the query engine. This means that every
node type (even mixin node types that have no properties and are essentially markers) is represented by a
queryable table with at least one column. However, unlike the older JCR-SQL query language, these
pseudo-columns are never included in the result unless explicitly included or implicitly included with the
"SELECT *" clause.
Why did ModeShape use the "jcr" namespace prefix for some of the pseudo-columns, and "mode"
for the others? The older JCR-SQL language defined the "jcr:score", "jcr:path", and
"jcr:name" pseudo-columns, so we just use the same names. The other columns are unique to
ModeShape and are therefore defined with the "mode" namespace prefix.
Like any other column, all of these pseudo-columns can also be used in the WHERE clause of any
JCR-SQL2 query, even if they are not included in the SELECT clause. They can be used anywhere that a
regular column can be used, including within constraints and dynamic operands. ModeShape will
automatically rewrite queries that use pseudo-columns in the dynamic operands of constraints to use the
corresponding function, such as SCORE(), PATH(), NAME(), LOCALNAME(), and DEPTH(). Additionally,
any property existence constraint using these pseudo-columns will always evaluate to 'true' (and thus
ModeShape's query optimizer will always remove such constraints from the query plan).
The "jcr:path" pseudo-column may also be used on both sides of an equijoin constraint clause. For
example, equijoin expressions similar to:
... selector1.[jcr:path] = selector2.[jcr:path] ...
will be automatically rewritten by ModeShape's optimizer to the following form:
... ISSAMENODE(selector1,selector2) ...
As with regular columns, the pseudo-columns must be qualified with the selector name if the query contains
more than one selector.
Custom Locales
This is only available starting with ModeShape 4.5.0.Final.
When comparing string properties (either as a criterion or as part of an ORDER BY clause), it's sometimes
helpful to use a custom locale (other than the platform default) when performing the string comparison. You
can do this using the ModeShape Query API:
Query query = session.getWorkspace().getQueryManager().createQuery(sql, Query.JCR_SQL2,
Locale.FRENCH);
4.1.3 Full-text search grammar

The grammar for the full-text search expressions used in JCR-SQL2's full-text search constraint is as
follows:
FulltextSearch ::= Disjunct {Space 'OR' Space Disjunct}

Disjunct ::= Term {Space Term}
Term ::= ['-'] SimpleTerm
SimpleTerm ::= Word | '"' Word {Space Word} '"'
Word ::= NonSpaceChar {NonSpaceChar}
Space ::= SpaceChar {SpaceChar}
NonSpaceChar ::= Char - SpaceChar /* Any Char except SpaceChar */
SpaceChar ::= ' '
Char ::= /* Any character */
This grammar supports expressions similar to what you might provide to an internet search engine.
Essentially, it simply lists the terms or phrases that should appear (or not appear) in the applicable property
value(s). Simple terms consist of a single word (with only non-space characters), while phrases can simply
be surrounded with double quotes.
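As a rough illustration of how an expression in this grammar breaks down, the following Java sketch splits a full-text search expression into its disjuncts and terms. It is a lenient sketch rather than a validating parser, and it is not ModeShape's implementation; the class and method names (FullTextParser, parse) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: splits a full-text search
// expression into disjuncts (separated by OR), each a list of terms.
// An excluded term keeps its leading '-'; phrases are returned without quotes.
final class FullTextParser {

    static List<List<String>> parse(String expression) {
        List<List<String>> disjuncts = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int i = 0;
        int n = expression.length();
        while (i < n) {
            if (expression.charAt(i) == ' ') { // skip Space
                i++;
                continue;
            }
            StringBuilder term = new StringBuilder();
            boolean quoted = false;
            if (expression.charAt(i) == '-') { // optional exclusion marker
                term.append('-');
                i++;
            }
            if (i < n && expression.charAt(i) == '"') { // quoted phrase
                int close = expression.indexOf('"', i + 1);
                if (close < 0) {
                    throw new IllegalArgumentException("Unterminated phrase");
                }
                term.append(expression, i + 1, close);
                quoted = true;
                i = close + 1;
            } else { // single Word: run of non-space characters
                int start = i;
                while (i < n && expression.charAt(i) != ' ') {
                    i++;
                }
                term.append(expression, start, i);
            }
            if (!quoted && term.toString().equals("OR")) {
                disjuncts.add(current); // 'OR' closes the current disjunct
                current = new ArrayList<>();
            } else {
                current.add(term.toString());
            }
        }
        disjuncts.add(current);
        return disjuncts;
    }

    public static void main(String[] args) {
        System.out.println(parse("jboss -\"red hat\" OR modeshape"));
    }
}
```

Each inner list holds the terms of one disjunct, so the example expression above yields two disjuncts: one with the term "jboss" and the excluded phrase "red hat", and one with the term "modeshape".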
4.1.4 Example JCR-SQL2 queries

This section walks through several JCR-SQL2 example queries, describing what each one does and, in
some cases, providing alternative queries.
The basics
One of the simplest JCR-SQL2 queries finds all nodes in the current workspace of the repository:
SELECT * FROM [nt:base]
This query will return a result set containing the "jcr:primaryType" column, since the nt:base node type
defines only one single-valued, non-residual property called "jcr:primaryType".
ModeShape does not currently support returning multi-valued properties in result sets. This is
permitted by the JCR 2.0 specification. ModeShape does, however, support using multi-valued
properties in constraints and ORDER BY clauses.
Since our query used "SELECT *", ModeShape also includes the five non-standard pseudo-columns
mentioned above: "jcr:path", "jcr:score", "jcr:name", "mode:localName", and "mode:depth".
These columns are very convenient to have in the results, but also make certain criteria much easier than
with the corresponding standard or ModeShape-specific functions.
Queries can explicitly specify the columns that are to be returned in the results. The following query is very
similar to the previous query and will return the same rows, but the result set will have only a single column
and will not include any of the pseudo-columns:
SELECT [jcr:primaryType] FROM [nt:base]
The following query will return the same rows as in the previous two queries, but the SELECT clause
explicitly includes only two of the pseudo-columns for the path and depth (which are computed from the
nodes' locations):
SELECT [jcr:primaryType], [jcr:path], [mode:depth] FROM [nt:base]
In JCR-SQL2, a table representing a particular node type will have a column for each of the node type's
property definitions, including those inherited from supertypes. For example, the nt:file node type, its
nt:hierarchyNode supertype, and the mix:created mixin type are defined using the CND notation as
follows:
[mix:created] mixin
Therefore, the table representing the nt:file node type will have 3 columns: the "jcr:created" and
"jcr:createdBy" columns inherited from the mix:created mixin node type (via the nt:hierarchyNode
node type), and the "jcr:primaryType" column inherited from the nt:base node type, which is the
implicit supertype of the nt:hierarchyNode (and all node types).
ModeShape adheres to this behavior with the exception that a "SELECT *" will result in the additional
pseudo-columns. Thus, this next query:
SELECT * FROM [nt:file]
is equivalent to this query:
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy],
       [jcr:path], [jcr:name], [jcr:score], [mode:localName], [mode:depth]
FROM [nt:file]
Using columns in constraints

Next, let's look at a query that selects some of the available columns from the nt:file table and uses a
constraint to ensure the resulting file nodes have names that end in '.txt':
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy], [jcr:path] FROM [nt:file]
WHERE LOCALNAME() LIKE '%.txt'
ModeShape also supports placing criteria against the mode:localName pseudo-column instead of using
the LOCALNAME() function. Such a query is equivalent to the previous query and will produce the exact
same results:
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy], [jcr:path]
FROM [nt:file]
WHERE [mode:localName] LIKE '%.txt'
ModeShape's pseudo-columns are often far easier to use than the corresponding function-like
constraints.
Although this query looks much more like SQL, the use of the '[' and ']' characters to quote the identifiers is
not typical of a SQL dialect. ModeShape actually supports using double-quote characters and square
brackets interchangeably around identifiers (although they must match around any single identifier). Again,
this next query, which looks remarkably like any SQL-92 or -99 dialect, is functionally identical to the
previous two queries:
SELECT "jcr:primaryType", "jcr:created", "jcr:createdBy", "jcr:path" FROM "nt:file"
WHERE "mode:localName" LIKE '%.txt'
Inner joins
In JCR-SQL2, a node will appear as a row in each table that corresponds to the node types defined by that
node's primary type or mixin types, or any supertypes of these node types. In other words, a node will
appear in the table corresponding to each node type for which Node.isNodeType(...) returns true.
For example, consider a node that has a primary type of nt:file but has an explicit mixin of
mix:referenceable. This node will appear as a row in all of these tables:
nt:file
mix:referenceable
nt:hierarchyNode
mix:created
nt:base
However, the columns in each of these tables will differ. The nt:file node type has the
nt:hierarchyNode, mix:created, and nt:base for supertypes, and therefore the table for nt:file
contains columns for the property definitions on all of these types. But because mix:referenceable is not
a supertype of nt:file, the table for nt:file will not contain a jcr:uuid column. To obtain a single
result set that contains columns for all the properties of our node, we need to perform an identity join.
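The rule above (a node appears in the table of every node type for which Node.isNodeType(...) returns true) can be sketched without a repository. The following is only an illustration: the class NodeTypeTables and its hard-coded supertype map are hypothetical and are not part of the JCR API or ModeShape. The map covers only the node types discussed in this section.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not part of JCR or ModeShape) listing the tables a node appears in. */
final class NodeTypeTables {

    // Direct supertypes of the node types discussed in the text, hard-coded for illustration.
    static final Map<String, List<String>> SUPERTYPES = Map.of(
            "nt:file", List.of("nt:hierarchyNode"),
            "nt:hierarchyNode", List.of("mix:created", "nt:base"),
            "mix:created", List.of(),
            "mix:referenceable", List.of(),
            "nt:base", List.of());

    /** Returns the primary type, the mixin types, and all of their supertypes. */
    static Set<String> tablesFor(String primaryType, List<String> mixins) {
        Set<String> tables = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(primaryType);
        pending.addAll(mixins);
        while (!pending.isEmpty()) {
            String type = pending.poll();
            // Visit each type once, then queue its direct supertypes.
            if (tables.add(type)) {
                pending.addAll(SUPERTYPES.getOrDefault(type, List.of()));
            }
        }
        return tables;
    }

    public static void main(String[] args) {
        // A node with primary type nt:file and the explicit mixin mix:referenceable.
        System.out.println(tablesFor("nt:file", List.of("mix:referenceable")));
    }
}
```

For the example node, the result contains the same five types listed above. Because jcr:uuid is defined only on mix:referenceable, it is reachable only through that table, which is why the identity join is needed.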
The next query shows how to return all properties for nt:file nodes that are also mix:referenceable:
SELECT file.*, ref.*
FROM [nt:file] AS file
JOIN [mix:referenceable] AS ref
  ON ISSAMENODE(file,ref)
Since wildcards were used in the SELECT clause, ModeShape expands the SELECT clause to include the
columns for all (explicit and inherited) property definitions of each type plus pseudo-columns for each type,
which is equivalent to:
SELECT file.[jcr:primaryType],
       file.[jcr:created],
       file.[jcr:createdBy],
       file.[jcr:path],
       file.[jcr:name],
       file.[jcr:score],
       file.[mode:localName],
       file.[mode:depth],
       ref.[jcr:path],
       ref.[jcr:name],
       ref.[jcr:score],
       ref.[mode:localName],
       ref.[mode:depth],
       ref.[jcr:uuid]
FROM [nt:file] AS file
JOIN [mix:referenceable] AS ref
  ON ISSAMENODE(file,ref)
Note that because we are using an identity join, the "file.[jcr:path]" column will contain the same value as
the "ref.[jcr:path]" column.
Fully-expand the SELECT clause to specify exactly the columns that you want, excluding the
columns that return the same values or return values not needed by your application. This can also
make the query a bit more efficient, since less data needs to be found and returned.
By the way, this is also what many well-written applications do when querying SQL databases.
Here is a query that does this by eliminating columns with duplicate values and using aliases that are simpler
than the namespace-qualified names:
SELECT file.[jcr:primaryType] AS primaryType,
       file.[jcr:created] AS created,
       file.[jcr:createdBy] AS createdBy,
       ref.[jcr:uuid] AS uuid,
       file.[jcr:path] AS path,
       file.[jcr:name] AS name,
       file.[jcr:score] AS score,
       file.[mode:localName] AS localName,
       file.[mode:depth] AS depth
FROM [nt:file] AS file
JOIN [mix:referenceable] AS ref
  ON ISSAMENODE(file,ref)
Although this query looks much more like SQL, the use of the '[' and ']' characters in JCR-SQL2 to quote the
identifiers is not typical of a SQL dialect. Again, ModeShape supports using double-quote characters and
square brackets interchangeably around identifiers (although they must match around any single identifier).
This makes it easier for existing SQL-oriented tools and applications to work with ModeShape,
including applications that use ModeShape's JDBC driver to query a ModeShape JCR repository.
This next query, which looks remarkably like any SQL-92 or -99 dialect, is functionally identical to the
previous query. However, it uses double quotes and a pseudo-column identity constraint on "jcr:path"
(which is identical in semantics and performance to the "ISSAMENODE(...)" constraint):
SELECT file."jcr:primaryType" AS primaryType,
       file."jcr:created" AS created,
       file."jcr:createdBy" AS createdBy,
       ref."jcr:uuid" AS uuid,
       file."jcr:path" AS path,
       file."jcr:name" AS name,
       file."jcr:score" AS score,
       file."mode:localName" AS localName,
       file."mode:depth" AS depth
FROM "nt:file" AS file
JOIN "mix:referenceable" AS ref
  ON file."jcr:path" = ref."jcr:path"
When using joins and selecting multiple columns, use aliases on the columns to make it easier to
reference those columns in constraints and ordering clauses.
Other joins
These are examples of two-way inner joins, but ModeShape supports joining multiple tables together in a
single query. ModeShape also supports a variety of joins, including:
INNER JOIN (or just JOIN)
LEFT OUTER JOIN
RIGHT OUTER JOIN
FULL OUTER JOIN
CROSS JOIN
Set operations: unions, intersections, and complements

ModeShape also supports several other query features beyond JCR-SQL2. One of these is support for set
queries that use:
UNION and UNION ALL
INTERSECT and INTERSECT ALL
EXCEPT and EXCEPT ALL.
Here is an example of a union:
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy], [jcr:path] FROM [nt:file]
UNION
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy], [jcr:path] FROM [nt:folder]
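To make the difference between the set operators concrete, here is a minimal sketch of their usual SQL semantics on in-memory rows. It is not ModeShape code: the class SetQuerySketch is hypothetical, each row is simplified to a single string (for example a path), and the INTERSECT ALL and EXCEPT ALL variants are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical illustration of set-query semantics; each row is simplified to one string. */
final class SetQuerySketch {

    /** UNION ALL: every row from both sides, duplicates kept. */
    static List<String> unionAll(List<String> left, List<String> right) {
        List<String> rows = new ArrayList<>(left);
        rows.addAll(right);
        return rows;
    }

    /** UNION: rows from either side, duplicates removed. */
    static List<String> union(List<String> left, List<String> right) {
        return new ArrayList<>(new LinkedHashSet<>(unionAll(left, right)));
    }

    /** INTERSECT: rows that appear on both sides, duplicates removed. */
    static List<String> intersect(List<String> left, List<String> right) {
        LinkedHashSet<String> rows = new LinkedHashSet<>(left);
        rows.retainAll(right);
        return new ArrayList<>(rows);
    }

    /** EXCEPT: rows on the left side that do not appear on the right, duplicates removed. */
    static List<String> except(List<String> left, List<String> right) {
        LinkedHashSet<String> rows = new LinkedHashSet<>(left);
        rows.removeAll(right);
        return new ArrayList<>(rows);
    }

    public static void main(String[] args) {
        List<String> first = List.of("/a", "/b", "/b");
        List<String> second = List.of("/b", "/c");
        System.out.println("UNION ALL: " + unionAll(first, second));
        System.out.println("UNION:     " + union(first, second));
        System.out.println("INTERSECT: " + intersect(first, second));
        System.out.println("EXCEPT:    " + except(first, second));
    }
}
```

In a real set query both sides must return the same columns, and a row is the whole tuple of column values rather than a single string.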
Subqueries
ModeShape also supports using (non-correlated) subqueries within the WHERE clause and wherever a static
operand can be used. Subqueries can even be used within another subquery. All subqueries, though,
should return a single column (all other columns will be ignored), and each row's single value will be
treated as a literal value. If the subquery is used in a clause that expects a single row (e.g., in a comparison),
only the subquery's first row will be used.
Subqueries in ModeShape are a powerful and easy way to use more complex criteria that are a function of the
content in the repository, without having to resort to multiple queries and complex application logic, such as
taking the results of one query and dynamically generating the criteria of another query.
Here's an example of a query that finds all nt:file nodes in the repository whose paths are referenced in
the value of the vdb:originalFile property of the vdb:virtualDatabase nodes. (This query also
uses the "$maxVersion" variable in the subquery.)
SELECT [jcr:primaryType], [jcr:created], [jcr:createdBy], [jcr:path]
FROM [nt:file]
WHERE PATH() IN (
    SELECT [vdb:originalFile] FROM [vdb:virtualDatabase]
    WHERE [vdb:version] <= $maxVersion
    AND CONTAINS([vdb:description],'xml OR xml maybe')
)
Without subqueries, this query would need to be broken into two separate queries: the first would find all of
the paths referenced by the vdb:virtualDatabase nodes matching the version and description criteria,
followed by one (or more) subsequent queries to find the nt:file nodes with the paths expressed as literal
values (or variables).
Using a subquery is not only easier to implement and understand, it is actually more efficient.
4.2 SQL
The JCR-SQL query language is defined by the JCR 1.0 specification as a way to express queries using
strings that are similar to SQL. Support for the language is optional, and in fact this language was
deprecated in the JCR 2.0 specification in favor of the improved and more powerful (and more SQL-like)
JCR-SQL2 language.
As an aside, ModeShape's parser for JCR-SQL queries is actually just a simplified and more
limited version of the parser for JCR-SQL2 queries. All other processing, however, is done in
exactly the same way.
The JCR 2.0 specification defines how nodes in a repository are mapped onto relational tables queryable
through a SQL-like language, including JCR-SQL and JCR-SQL2. Basically, each node type is mapped as a
relational view with a single column for each of the node type's (residual and non-residual) property
definitions. Conceptually, each node in the repository then appears as a record inside the view
corresponding to the node type for which "Node.isNodeType(nodeTypeName)" would return true.
Since each node likely returns true from this method for multiple node types (e.g., the primary node type, the
mixin types, and all supertypes of the primary and mixin node types), all nodes will likely appear as records
in multiple views. And since each view only exposes those properties defined by (or inherited by) the
corresponding node type, a full picture of a node will likely require joining the views for multiple node types.
This special kind of join, where the nodes have the same identity on each side of the join, is referred to as an
identity join, and is handled very efficiently by ModeShape.
ModeShape includes support for the JCR-SQL language, and adds several extensions to make it even more
powerful and useful:
Support for the UNION, INTERSECT, and EXCEPT set operations on multiple result sets to form a
single result set. As with standard SQL, the result sets being combined must have the same columns.
The UNION operator combines the rows from two result sets, the INTERSECT operator returns the
rows that are common to two result sets, and the EXCEPT operator returns the rows of the first result
set that do not appear in the second. Duplicate rows are removed unless the operator is followed by
the ALL keyword. For details, see the grammar for set queries.
Removal of duplicate rows in the results, using "SELECT DISTINCT ...".
Limiting the number of rows in the result set with the "LIMIT count" clause, where count is the
maximum number of rows that should be returned. This clause may optionally be followed by the
"OFFSET number" clause to specify the number of initial rows that should be skipped.
Support for the IN and NOT IN clauses to more easily and concisely supply multiple discrete static
operands. For example, "WHERE ... prop1 IN (3,5,7,10,11,50) ...".
Support for the BETWEEN clause to more easily and concisely supply a range of discrete operands.
For example, "WHERE ... prop1 BETWEEN 3 EXCLUSIVE AND 10 ...".
Support for (non-correlated) subqueries in the WHERE clause, wherever a static operand can be used.
Subqueries can even be used within another subquery. All subqueries must return a single column,
and each row's single value will be treated as a literal value. If the subquery is used in a clause that
expects a single value (e.g., in a comparison), only the subquery's first row will be used. If the
subquery is used in a clause that allows multiple values (e.g., IN (...)), then all of the subquery's
rows will be used. For example, this query "WHERE ... prop1 IN ( SELECT my:prop2 FROM
my:type2 WHERE my:prop3 < '1000' ) AND ..." will use the results of the subquery as the
literal values in the IN clause.
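The BETWEEN and LIMIT extensions listed above have simple semantics that can be sketched on plain Java values. This is an illustration only: the class RangeAndLimitSketch and its method names are hypothetical, the bounds are simplified to long values, and ModeShape evaluates these clauses inside its query engine rather than with code like this.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of the BETWEEN and LIMIT/OFFSET semantics described above. */
final class RangeAndLimitSketch {

    /** Mirrors "prop BETWEEN lower [EXCLUSIVE] AND upper [EXCLUSIVE]" for long values. */
    static boolean between(long value, long lower, boolean lowerExclusive,
                           long upper, boolean upperExclusive) {
        boolean aboveLower = lowerExclusive ? value > lower : value >= lower;
        boolean belowUpper = upperExclusive ? value < upper : value <= upper;
        return aboveLower && belowUpper;
    }

    /** Mirrors "LIMIT count OFFSET offset": skip the initial rows, then keep at most count rows. */
    static <T> List<T> limit(List<T> rows, int count, int offset) {
        return rows.stream().skip(offset).limit(count).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "prop1 BETWEEN 3 EXCLUSIVE AND 10": 3 is excluded, 10 is included.
        System.out.println(between(3, 3, true, 10, false));
        System.out.println(between(10, 3, true, 10, false));
        // "LIMIT 2 OFFSET 1" applied to five rows.
        System.out.println(limit(List.of("r1", "r2", "r3", "r4", "r5"), 2, 1));
    }
}
```

Without the EXCLUSIVE keyword a bound is inclusive, so "BETWEEN 3 EXCLUSIVE AND 10" accepts values greater than 3 and less than or equal to 10.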
The grammar for the JCR-SQL query language is actually a superset of that defined by the JCR 1.0
specification, and as such the complete grammar is included here.
The grammar is presented using the same EBNF nomenclature as used in the JCR 1.0
specification. Terms surrounded by '[' and ']' denote optional terms that appear zero or one
times. Terms surrounded by '{' and '}' denote terms that appear zero or more times. Parentheses
are used to identify groups, and are often used to surround possible values. Literals (or keywords)
are denoted by single-quotes.
4.2.1 Grammar
QueryCommand ::= Query | SetQuery
SetQuery ::= Query ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query
{ ('UNION'|'INTERSECT'|'EXCEPT') ['ALL'] Query }
Query ::= Select From [Where] [OrderBy] [Limit]
Select ::= 'SELECT' ('*' | Proplist )
From ::= 'FROM' NtList
Where ::= 'WHERE' WhereExp
OrderBy ::= 'ORDER BY' propname [Order] {',' propname [Order]}
Order ::= 'DESC' | 'ASC'
Proplist ::= propname {',' propname}
NtList ::= ntname {',' ntname}
WhereExp ::= propname Op value |
propname 'IS' ['NOT'] 'NULL' |
like |
contains |
whereexp ('AND'|'OR') whereexp |
'NOT' whereexp |
'(' whereexp ')' |
joinpropname '=' joinpropname |
between |
propname ['NOT'] 'IN' '(' value {',' value } ')'
Op ::= '='|'>'|'<'|'>='|'<='|'<>'
joinpropname ::= quotedjoinpropname | unquotedjoinpropname
quotedjoinpropname ::= ''' unquotedjoinpropname '''
unquotedjoinpropname ::= ntname '.jcr:path'
propname ::= quotedpropname | unquotedpropname
quotedpropname ::= ''' unquotedpropname '''
unquotedpropname ::= /* A property name, possible a pseudo-property: jcr:score or jcr:path */
ntname ::= quotedntname | unquotedntname
quotedntname ::= ''' unquotedntname '''
unquotedntname ::= /* A node type name */
value ::= literal | subquery
literal ::= ''' literalvalue ''' | literalvalue
literalvalue ::= /* A property value (in standard string form) */
subquery ::= '(' QueryCommand ')' | QueryCommand
like ::= propname 'LIKE' likepattern [ escape ]
likepattern ::= ''' likechar { likepattern } '''
likechar ::= char | '%' | '_'
escape ::= 'ESCAPE' ''' likechar '''
char ::= /* Any character valid within the string representation of a value
except for the characters % and _ themselves. These must be escaped */
contains ::= 'CONTAINS(' scope ',' searchexp ')'
scope ::= unquotedpropname | '.'
searchexp ::= ''' exp '''
exp ::= ['-']term {whitespace ['OR'] whitespace ['-']term}
term ::= word | '"' word {whitespace word} '"'
word ::= /* A string containing no whitespace */
whitespace ::= /* A string of only whitespace*/
between ::= propname ['NOT'] 'BETWEEN' lowerBound ['EXCLUSIVE']
'AND' upperBound ['EXCLUSIVE']
lowerBound ::= value
upperBound ::= value
Limit ::= 'LIMIT' count [ 'OFFSET' offset ]
count ::= /* Positive integer value */
offset ::= /* Non-negative integer value */
4.3 XPath
The JCR 1.0 specification uses the XPath query language because node structures in JCR are very
analogous to the structure of an XML document. Thus, XPath provides a useful language for selecting and
searching workspace content. And since JCR 1.0 defines a mapping between XML and a workspace view
called the "document view", adapting XPath to workspace content is quite natural.
A JCR XPath query specifies the subset of nodes in a workspace that satisfy the constraints defined in the
query. Constraints can limit the nodes in the results to be those nodes with a specific (primary or mixin) node
type, with properties having particular values, or to be within a specific subtree of the workspace. The query
also defines how the nodes are to be returned in the result sets using column specifiers and ordering
specifiers.
ModeShape offers a bit more functionality in the "jcr:contains(...)" clauses than required by the
specification. In particular, the second parameter specifies the search expression, and for these ModeShape
accepts full-text search language expressions, including wildcard support.
As an aside, ModeShape actually implements XPath queries by transforming them into the
equivalent JCR-SQL2 representation. And the JCR-SQL2 language, although often more verbose,
is much more capable of representing complex queries with multiple combinations of type,
property, and path constraints.
4.3.1 Column Specifiers

JCR 1.0 specifies that support is required only for returning column values based upon single-valued,
non-residual properties that are declared on or inherited by the node types specified in the type constraint.
ModeShape follows this requirement, and does not support specifying residual properties. However, ModeShape
does allow multi-valued properties to be specified as result columns. And as per the specification,
ModeShape always returns the "jcr:path" and "jcr:score" pseudo-columns.
ModeShape uses the last location step with an attribute axis to specify the properties that are to be returned
as result columns. Multiple properties are specified with a union. For example, the following table shows
several XPath queries and how they map to JCR-SQL2 queries.
XPath:    //*
JCR-SQL2: SELECT * FROM [nt:base]

XPath:    //element(*,my:type)
JCR-SQL2: SELECT * FROM [my:type]

XPath:    //element(*,my:type)/@my:title
JCR-SQL2: SELECT [my:title] FROM [my:type]

XPath:    //element(*,my:type)/(@my:title | @my:text)
JCR-SQL2: SELECT [my:title], [my:text] FROM [my:type]

XPath:    //element(*,my:type)/(@my:title union @my:text)
JCR-SQL2: SELECT [my:title], [my:text] FROM [my:type]

Specifying result set columns
4.3.2 Type Constraints

JCR 1.0 specifies that support is required only for specifying constraints of one primary type, and it is
optional to support specifying constraints on one (or more) mixin types. The specification also defines that
the XPath element test be used to test against node types, and that it is optional to support element tests
on location steps other than the last one. Type constraints are inherently inheritance-sensitive, in that a
constraint against a particular node type 'X' will be satisfied by nodes explicitly declared to be of type 'X' or of
subtypes of 'X'.
ModeShape does support using the element test to test against primary or mixin types. ModeShape also
only supports using an element test on the last location step. For example, the following table shows
several XPath queries and how they map to JCR-SQL2 queries.
XPath:    //*
JCR-SQL2: SELECT * FROM [nt:base]

XPath:    //element(*,my:type)
JCR-SQL2: SELECT * FROM [my:type]

XPath:    /jcr:root/nodes/element(*,my:type)
JCR-SQL2: SELECT * FROM [my:type]
          WHERE PATH([my:type]) LIKE '/nodes/%'
          AND DEPTH([my:type]) = CAST(2 AS LONG)

XPath:    /jcr:root/nodes//element(*,my:type)
JCR-SQL2: SELECT * FROM [my:type]
          WHERE PATH([my:type]) LIKE '/nodes/%'

XPath:    /jcr:root/nodes//element(ex:nodeName,my:type)
JCR-SQL2: SELECT * FROM [my:type]
          WHERE PATH([my:type]) LIKE '/nodes/%'
          AND NAME([my:type]) = 'ex:nodeName'

Specifying type constraints
Note that the JCR-SQL2 language supported by ModeShape is far more capable of joining multiple sets of
nodes with different type, property and path constraints.
4.3.3 Property Constraints

JCR 1.0 specifies that attribute tests on the last location step are required, but that predicate tests on any
other location steps are optional.
ModeShape does support using attribute tests on the last location step to specify property constraints, as
well as supporting axis and filter predicates on other location steps. For example, the following table shows
several XPath queries and how they map to JCR-SQL2 queries.
XPath:    //*[@prop1]
JCR-SQL2: SELECT * FROM [nt:base]
          WHERE [nt:base].prop1 IS NOT NULL

XPath:    //element(*,my:type)[@prop1]
JCR-SQL2: SELECT * FROM [my:type]
          WHERE [my:type].prop1 IS NOT NULL

XPath:    //element(*,my:type)[@prop1=xs:boolean('true')]
JCR-SQL2: SELECT * FROM [my:type]
          WHERE [my:type].prop1 = CAST('true' AS BOOLEAN)

XPath:    //element(*,my:type)[@id<1 and @name='john']
JCR-SQL2: SELECT * FROM [my:type]
          WHERE id < 1 AND name = 'john'

XPath:    //element(*,my:type)[a/b/@id]
JCR-SQL2: SELECT * FROM [my:type]
          JOIN [nt:base] AS nodeSet1
            ON ISCHILDNODE(nodeSet1,[my:type])
          JOIN [nt:base] AS nodeSet2
            ON ISCHILDNODE(nodeSet2,nodeSet1)
          WHERE (NAME(nodeSet1) = 'a'
            AND NAME(nodeSet2) = 'b')
            AND nodeSet2.id IS NOT NULL

XPath:    //element(*,my:type)[./*/*/@id]
JCR-SQL2: SELECT * FROM [my:type]
          JOIN [nt:base] AS nodeSet1
            ON ISCHILDNODE(nodeSet1,[my:type])
          JOIN [nt:base] AS nodeSet2
            ON ISCHILDNODE(nodeSet2,nodeSet1)
          WHERE nodeSet2.id IS NOT NULL

XPath:    //element(*,my:type)[.//@id]
JCR-SQL2: SELECT * FROM [my:type]
          JOIN [nt:base] AS nodeSet1
            ON ISDESCENDANTNODE(nodeSet1,[my:type])
          WHERE nodeSet1.id IS NOT NULL

Specifying property constraints

Section 6.6.3.3 of the JCR 1.0 specification contains an in-depth description of property value constraints
using various comparison operators.
4.3.4 Path Constraints

JCR 1.0 specifies that exact, child node, and descendants-or-self path constraints be supported on the
location steps in an XPath query.
ModeShape supports these kinds of path constraints. For example, the following table shows several
XPath queries and how they map to JCR-SQL2 queries.
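XPath:    /jcr:root/a[1]/b[2]
JCR-SQL2: SELECT * FROM [nt:base]
          WHERE PATH([nt:base]) = '/a[1]/b[2]'

XPath:    /jcr:root/a/b[*]
JCR-SQL2: SELECT * FROM [nt:base]
          WHERE PATH([nt:base]) = '/a[%]/b[%]'

XPath:    /jcr:root/a[1]/b[*]
JCR-SQL2: SELECT * FROM [nt:base]
          WHERE PATH([nt:base]) = '/a[%]/b[%]'

XPath:    /jcr:root/a[2]/b
JCR-SQL2: SELECT * FROM [nt:base]
          WHERE PATH([nt:base]) = '/a[2]/b[%]'

XPath:    /jcr:root/a/b[2]//c[4]
JCR-SQL2: SELECT * FROM [nt:base]
          WHERE PATH([nt:base]) = '/a[%]/b[2]/c[4]'
          OR PATH([nt:base]) LIKE '/a[%]/b[2]/%/c[4]'

XPath:    /jcr:root/a/b//c//d
JCR-SQL2: SELECT * FROM [nt:base]
          WHERE PATH([nt:base]) = '/a[%]/b[%]/c[%]/d[%]'
          OR PATH([nt:base]) LIKE '/a[%]/b[%]/%/c[%]/d[%]'
          OR PATH([nt:base]) LIKE '/a[%]/b[%]/c[%]/%/d[%]'
          OR PATH([nt:base]) LIKE '/a[%]/b[%]/%/c[%]/%/d[%]'

XPath:    //element(*,my:type)[@id<1 and @name='john']
JCR-SQL2: SELECT * FROM [my:type]
          WHERE id < 1 AND name = 'john'

XPath:    /jcr:root/a/b//element(*,my:type)
JCR-SQL2: SELECT * FROM [my:type]
          WHERE PATH([my:type]) = '/a[%]/b[%]/%'

Specifying path constraints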

Note that the JCR-SQL2 language supported by ModeShape is capable of representing a wider combination
of path constraints, although the XPath expressions are easier to understand and significantly shorter.
Also, path constraints in XPath do not need to specify wildcards for the same-name-sibling (SNS) indexes,
as XPath should naturally find all nodes regardless of the SNS index, unless the SNS index is explicitly
specified. In other words, any path segment that does not have an explicit SNS index (or that has an SNS
index of '[%]' or '[_]') will match all SNS index values. However, any segments in the path expression that
have an explicit numeric SNS index will require an exact match. Thus this path constraint:
/a/b/c[2]/d[%]/%/e[_]
will effectively be converted into
/a[%]/b[%]/c[2]/d[%]/%/e[_]
This behavior is very different from how JCR-SQL and JCR-SQL2 path constraints are handled, since these
languages interpret a lack of an SNS index as equating to '[1]'. To achieve the XPath-like matching, a query
written in JCR-SQL or JCR-SQL2 would need to explicitly include '[%]' in each path segment where an SNS
index literal is not already specified.
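An application that builds JCR-SQL or JCR-SQL2 path criteria from XPath-style paths can add the '[%]' wildcards mechanically. The following is a minimal sketch of that conversion, not ModeShape's own translator. The class SnsPathSketch and its method are hypothetical, and the sketch assumes that a bare '%' segment is a wildcard for intermediate segments and must be left unchanged.

```java
/** Hypothetical helper that adds same-name-sibling (SNS) wildcards to a path pattern. */
final class SnsPathSketch {

    /** Appends "[%]" to every path segment that has no explicit SNS index. */
    static String withSnsWildcards(String path) {
        StringBuilder result = new StringBuilder();
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue; // skip the empty string before the leading '/'
            }
            result.append('/').append(segment);
            // Segments that already end in an index such as [2], [%] or [_] are kept as they are,
            // and so is a bare '%' segment, which stands for any intermediate segments.
            if (!segment.endsWith("]") && !segment.equals("%")) {
                result.append("[%]");
            }
        }
        // The root path has no segments at all.
        return result.length() == 0 ? "/" : result.toString();
    }

    public static void main(String[] args) {
        System.out.println(withSnsWildcards("/a/b/c[2]/d[%]/%/e[_]"));
    }
}
```

Applied to the path constraint shown above, the helper produces the converted form given in the text.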
4.3.5 Ordering Specifiers

JCR 1.0 extends the XPath grammar to add support for ordering the results according to the natural ordering
of the values of one or more properties on the nodes.
ModeShape does support zero or more ordering specifiers, including whether each specifier is ascending or
descending. If no ordering specifiers are defined, the ordering of the results is not predefined and may vary
(though ordering by score may be used by default). For example, the following table shows several XPath
queries and how they map to JCR-SQL2 queries.
XPath:    //element(*,*) order by @title
JCR-SQL2: SELECT nodeSet1.title
          FROM [nt:base] AS nodeSet1
          ORDER BY nodeSet1.title

XPath:    //element(*,*) order by jcr:score()
JCR-SQL2: SELECT *
          FROM [nt:base] AS nodeSet1
          ORDER BY SCORE(nodeSet1)

XPath:    //element(*,my:type) order by jcr:score(my:type)
JCR-SQL2: SELECT *
          FROM [my:type] AS nodeSet1
          ORDER BY SCORE(nodeSet1)

XPath:    //element(*,*) order by @jcr:path
JCR-SQL2: SELECT jcr:path
          FROM [nt:base] AS nodeSet1
          ORDER BY PATH(nodeSet1)

XPath:    //element(*,*) order by @title, @jcr:score
JCR-SQL2: SELECT nodeSet1.title
          FROM [nt:base] AS nodeSet1
          ORDER BY nodeSet1.title,
                   SCORE(nodeSet1)

Specifying result ordering

Note that the JCR-SQL2 language supported by ModeShape has a far richer ORDER BY clause, allowing the
use of any kind of dynamic operand, including ordering upon arithmetic operations of multiple dynamic
operands.
4.3.6 Miscellaneous
JCR 1.0 defines a number of other optional and required features, and these are summarized in this section.
Only abbreviated XPath syntax is supported.
Only the child axis (the default axis, represented by '/' in abbreviated syntax),
descendant-or-self axis (represented by '//' in abbreviated syntax), self axis (represented by '.'
in abbreviated syntax), and attribute axis (represented by '@' in abbreviated syntax) are supported.
The text() node test is not supported.
The element() node test is supported.
The jcr:like() function is supported.
The jcr:contains() function is supported.
The jcr:score() function is supported.
The jcr:deref() function is not supported.
4.4 JQOM
JCR 2.0 introduces a new API for programmatically constructing a query. This API allows the client to
construct the lower-level objects for each part of the query, and is a great fit for applications that would
otherwise need to dynamically generate query expressions using fairly complicated string manipulation.
Using this API is a matter of getting the QueryObjectModelFactory from the session's QueryManager,
and using the factory to create the various components, starting with the lowest-level components. Then
these lower-level components can be passed to other factory methods to create the higher-level
components, and so on, until finally the createQuery(...) method is called to return the
QueryObjectModel.
Although the JCR-SQL2 and Query Object Model API construct queries in very different ways,
executing queries for the two languages is done in nearly the same way. The only difference is that
a JCR-SQL2 query expression must be parsed into an abstract syntax tree (AST), whereas with
the Query Object Model API your application is programmatically creating objects that effectively
are the AST. From that point on, however, all subsequent processing is done in an identical
manner for all the query languages.
However, please do not consider using the QOM API just to get a performance benefit. The
JCR-SQL2 parser is very efficient, and your application code will be far easier to understand and
maintain. Where possible, use JCR-SQL2 query expressions.
Here is a simple example that shows how this is done for the simple query "SELECT * FROM
[nt:unstructured] AS unstructNodes":
// The QOM types used below are in the javax.jcr.query.qom package.
// Obtain the query manager for the session (an existing javax.jcr.Session) ...
javax.jcr.query.QueryManager queryManager = session.getWorkspace().getQueryManager();
// Create a query object model factory ...
QueryObjectModelFactory factory = queryManager.getQOMFactory();
// Create the FROM clause: a selector for the [nt:unstructured] nodes ...
Selector source = factory.selector("nt:unstructured","unstructNodes");
// Create the SELECT clause (we want all columns defined on the node type) ...
Column[] columns = null;
// Create the WHERE clause (we have none for this query) ...
Constraint constraint = null;
// Define the orderings (we have none for this query)...
Ordering[] orderings = null;
// Create the query ...
QueryObjectModel query = factory.createQuery(source,constraint,orderings,columns);
// Execute the query and get the results ...
// (This is the same as before.)
javax.jcr.query.QueryResult result = query.execute();
Obviously this is a lot more code than would be required to just submit the fixed query expression, but the
purpose of the example is to show how to use the Query Object Model API to build a query that you can
easily understand. In fact, most Query Object Model queries will create the columns, orderings, and
constraints using the QueryObjectModelFactory, whereas the example above just assumes all of the
columns, no orderings, and no constraints.
Once your application executes the query, processing the QueryResult is exactly the same as when
using the JCR Query API. This is because all of the query languages are represented internally and executed
in exactly the same manner. For the sake of completeness, here's the code to process the results by iterating
over the nodes:
javax.jcr.NodeIterator nodeIter = result.getNodes();
while ( nodeIter.hasNext() ) {
    javax.jcr.Node node = nodeIter.nextNode();
    ...
}
or iterating over the rows in the results:

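String[] columnNames = result.getColumnNames();
javax.jcr.query.RowIterator rowIter = result.getRows();
while ( rowIter.hasNext() ) {
    javax.jcr.query.Row row = rowIter.nextRow();
    // Iterate over the column values in each row ...
    for ( javax.jcr.Value value : row.getValues() ) {
        ...
    }
    // Or access the column values by name ...
    for ( String columnName : columnNames ) {
        javax.jcr.Value value = row.getValue(columnName);
        ...
    }
}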
4.5 Full text search

There are times when a formal structured query language is overkill, and the easiest way to find the right
content is to perform a search, like you would with a search engine such as Google or Yahoo! This is where
ModeShape's full-text search language comes in, because it allows you to use the JCR query API but with
a far simpler, Google-style search grammar.
This query language is actually defined by the JCR 2.0 specification as the full-text search expression
grammar used in the second parameter of the CONTAINS(...) function of the JCR-SQL2 language. We
just pulled it out and made it available as a first-class query language, such that a full-text search query
supplied by the user, full-text-query, is equivalent to executing this JCR-SQL2:
SELECT * FROM [nt:base] WHERE CONTAINS([nt:base],'full-text-query')
This language allows a JCR client to construct a query to find nodes with property values that match the
supplied terms. Nodes that "best" match the terms are returned before nodes that have a lesser match. Of
course, ModeShape uses a complex system to analyze the node content and the query terms, and may
perform a number of optimizations, such as (but not limited to) eliminating stop words (e.g., "the", "a", "and",
etc.), treating terms independent of case, and converting words to base forms using a process called
stemming (e.g., "running" into "run", "customers" into "customer").
Search terms can also include phrases by simply wrapping the phrase with double-quotes. For example, the
search term 'table "customer invoice"' would rank higher those nodes with properties containing the
phrase "customer invoice" than nodes with properties containing just "customer" or "invoice".
Terms in the query are implicitly AND-ed together, meaning that the matches occur when a node has property
values that match all of the terms. However, it is also possible to put an "OR" in between two terms where
either of those terms may occur.
By default, all terms are assumed to be positive terms, in the sense that the occurrence of the term will
increase the rank of any nodes containing the value. However, it is possible to specify that terms should not
appear in the results. This is called a negative term, and it reduces the rank of any node whose property
values contain the value. To specify a negative term, simply prefix the term with a hyphen ('-').
Each term may also contain wildcards to specify the pattern to be matched (or negated). ModeShape
supports two different sets of wildcards:
'*' matches zero or more characters, and '?' matches any single character; and
'%' matches zero or more characters, and '_' matches any single character.
The former are wildcards that are more commonly used in various systems (including older JCR repository
implementations), while the latter are the wildcards used in LIKE expressions in both JCR-SQL and
JCR-SQL2. Both families are supported for convenience, and you can also mix and match the various
wildcards, such as 'ta*bl_' and 'ta?_ble*'. (Of course, placing multiple '*' or '%' characters next to each
other offers no real benefit, as it is equivalent to a single '*' or '%'.)
If you want to use these characters literally in a term and do not want them to be treated as wildcards, they
must be escaped by prefixing them with a '\' character. For example, this full-text search expression:
table\* "customer invoice\?"
would rank higher those nodes with properties containing 'table*' (where the asterisk is matched literally
rather than as a wildcard) and those containing the phrase "customer invoice?" (where the question mark is
matched literally). To use a literal backslash character, simply escape it as well.
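The wildcard and escaping rules can be illustrated by translating a single search term into a regular expression. This is only a sketch of the matching rules described above, not how ModeShape evaluates full-text terms (it uses its internal indexes). The class FullTextWildcardSketch and its methods are hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration of the full-text wildcard and escape rules for a single term. */
final class FullTextWildcardSketch {

    /** Translates one search term into a regular expression, honoring backslash escapes. */
    static Pattern toPattern(String term) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c == '\\' && i + 1 < term.length()) {
                // Escaped character: match the next character literally.
                regex.append(Pattern.quote(String.valueOf(term.charAt(++i))));
            } else if (c == '*' || c == '%') {
                regex.append(".*"); // zero or more characters
            } else if (c == '?' || c == '_') {
                regex.append('.');  // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns true if the whole word matches the term. */
    static boolean matches(String term, String word) {
        return toPattern(term).matcher(word).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("ta*bl_", "table"));     // mixed wildcard families
        System.out.println(matches("table\\*", "table*"));  // escaped asterisk is literal
        System.out.println(matches("table\\*", "tables"));  // so it no longer acts as a wildcard
    }
}
```

Note that in Java source the backslash itself must be doubled, so the term table\* is written as "table\\*".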
When using this query language, the QueryResult always contains the "jcr:path" and "jcr:score"
columns.
ModeShape handles leading and trailing wildcards in very different ways. When trailing wildcards
are used, even a few characters preceding the wildcard can be used to quickly narrow down the
potential results using the internal reverse indexes. However, when terms start with a wildcard
ModeShape cannot use the internal reverse indexes to help narrow the results. Thus, a search with a
leading wildcard must be performed inefficiently, in a process analogous to a relational database's
table scan. Where possible, avoid using leading wildcards in your search terms.
4.5.1 Grammar
The grammar for this full-text search language is specified in Section 6.7.19 of the JCR 2.0 specification, but
it is also included here as a convenience.
The grammar is presented using the same EBNF nomenclature as used in the JCR 2.0
specification. Terms surrounded by matching square brackets ('[' and ']') are optional
and appear zero or one times. Terms surrounded by matching braces ('{' and '}')
appear zero or more times. Parentheses are used to identify groups, and are
often used to surround possible values.
FulltextSearch ::= Disjunct {Space 'OR' Space Disjunct}

Disjunct ::= Term {Space Term}
Term ::= ['-'] SimpleTerm
SimpleTerm ::= Word | '"' Word {Space Word} '"'
Word ::= NonSpaceChar {NonSpaceChar}
Space ::= SpaceChar {SpaceChar}
NonSpaceChar ::= Char - SpaceChar /* Any Char except SpaceChar */
SpaceChar ::= ' '
Char ::= /* Any character */
As you can see, this is a pretty simple and straightforward query language. But this language makes it
extremely easy to find all the nodes in the repository that match a set of terms.
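To make the grammar concrete, the following Java sketch splits an expression into disjuncts and terms. It is only an illustration under stated assumptions: FullTextExpression and Term are hypothetical names, phrases are delimited by double quotes as in the grammar, and backslash escapes are not handled. It is not the parser ModeShape uses.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the grammar; not ModeShape's parser. */
final class FullTextExpression {

    /** One Term from the grammar: a word or quoted phrase, optionally negated with '-'. */
    static final class Term {
        final String text;
        final boolean negated;

        Term(String text, boolean negated) {
            this.text = text;
            this.negated = negated;
        }
    }

    /** Returns one list of terms per Disjunct (the alternatives separated by OR). */
    static List<List<Term>> parse(String expression) {
        List<List<Term>> disjuncts = new ArrayList<>();
        List<Term> current = new ArrayList<>();
        int i = 0;
        int n = expression.length();
        while (i < n) {
            if (expression.charAt(i) == ' ') {  // Space ::= SpaceChar {SpaceChar}
                i++;
                continue;
            }
            boolean negated = expression.charAt(i) == '-';  // Term ::= ['-'] SimpleTerm
            if (negated) i++;
            boolean quoted = i < n && expression.charAt(i) == '"';
            String text;
            if (quoted) {
                int close = expression.indexOf('"', i + 1);
                if (close < 0) throw new IllegalArgumentException("Unterminated phrase");
                text = expression.substring(i + 1, close);
                i = close + 1;
            } else {
                int end = expression.indexOf(' ', i);
                if (end < 0) end = n;
                text = expression.substring(i, end);
                i = end;
            }
            if (!negated && !quoted && text.equals("OR")) {
                // An unquoted OR separates two disjuncts.
                if (current.isEmpty()) throw new IllegalArgumentException("OR needs a term on its left");
                disjuncts.add(current);
                current = new ArrayList<>();
            } else {
                if (text.isEmpty()) throw new IllegalArgumentException("Empty term");
                current.add(new Term(text, negated));
            }
        }
        if (current.isEmpty()) throw new IllegalArgumentException("Expression must end with a term");
        disjuncts.add(current);
        return disjuncts;
    }
}
```

For example, parsing table -chair OR "customer invoice" yields two disjuncts: the first holds the positive term table and the negative term chair, and the second holds the phrase customer invoice.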
5 Built-in node types

The JCR 2.0 specification requires that repositories have a number of node types immediately available for
use by client applications, and ModeShape defines a number of additional node types that are installed into
every repository. None of these node types can be changed or modified.
Standard node types
ModeShape built-in node types
5.1 Standard node types

The following is the CND representation of the standard JCR built-in node types:
<jcr='http://www.jcp.org/jcr/1.0'>
// ------------------------------------------------------------------------
// Pre-defined Node Types
// ------------------------------------------------------------------------
[nt:base] abstract
- jcr:primaryType (name) mandatory autocreated protected compute
- jcr:mixinTypes (name) protected multiple compute
[nt:unstructured]
orderable
- * (undefined)
[mix:created] mixin
[nt:linkedFile] > nt:hierarchyNode
- jcr:content (reference) primary mandatory
[mix:referenceable] mixin
- jcr:uuid (string) mandatory autocreated protected initialize
[mix:mimeType] mixin
- jcr:mimeType (string)
- jcr:encoding (string)
[mix:lastModified] mixin
- jcr:lastModified (date)
- jcr:lastModifiedBy (string)
[nt:nodeType]
- jcr:nodeTypeName (name) mandatory protected copy
- jcr:supertypes (name) multiple protected copy
- jcr:isAbstract (boolean) mandatory protected copy
- jcr:isMixin (boolean) mandatory protected copy
- jcr:isQueryable (boolean) mandatory protected copy
- jcr:hasOrderableChildNodes (boolean) mandatory protected copy
- jcr:primaryItemName (name) protected copy
+ jcr:propertyDefinition (nt:propertyDefinition) = nt:propertyDefinition sns protected copy
+ jcr:childNodeDefinition (nt:childNodeDefinition) = nt:childNodeDefinition sns protected copy
[nt:propertyDefinition]
- jcr:name (name) protected
- jcr:autoCreated (boolean) mandatory protected
- jcr:mandatory (boolean) mandatory protected
- jcr:isFullTextSearchable (boolean) mandatory protected
- jcr:isQueryOrderable (boolean) mandatory protected
- jcr:onParentVersion (string) mandatory protected
< 'COPY', 'VERSION', 'INITIALIZE', 'COMPUTE',
'IGNORE', 'ABORT'
- jcr:protected (boolean) mandatory protected
- jcr:requiredType (string) mandatory protected
< 'STRING', 'URI', 'BINARY', 'LONG', 'DOUBLE', 'DECIMAL', 'BOOLEAN',
'DATE', 'NAME', 'PATH', 'REFERENCE', 'WEAKREFERENCE', 'UNDEFINED'
- jcr:valueConstraints (string) multiple protected
- jcr:availableQueryOperators (name) mandatory multiple protected
- jcr:defaultValues (undefined) multiple protected
- jcr:multiple (boolean) mandatory protected
[nt:childNodeDefinition]
- jcr:name (name) protected
- jcr:autoCreated (boolean) mandatory protected
- jcr:mandatory (boolean) mandatory protected
- jcr:onParentVersion (string) mandatory protected
< 'COPY', 'VERSION', 'INITIALIZE', 'COMPUTE',
'IGNORE', 'ABORT'
- jcr:protected (boolean) mandatory protected
- jcr:requiredPrimaryTypes (name) = 'nt:base' mandatory protected multiple
- jcr:defaultPrimaryType (name) protected
- jcr:sameNameSiblings (boolean) mandatory protected
[nt:versionHistory] > mix:referenceable
- jcr:versionableUuid (string) mandatory autocreated protected abort
- jcr:copiedFrom (weakreference) protected abort < 'nt:version'
+ jcr:rootVersion (nt:version) = nt:version mandatory autocreated protected abort
+ jcr:versionLabels (nt:versionLabels) = nt:versionLabels mandatory autocreated protected abort
+ * (nt:version) = nt:version protected abort
[nt:versionLabels]
- * (reference) protected abort < 'nt:version'
[nt:version] > mix:referenceable
- jcr:created (date) mandatory autocreated protected abort
- jcr:predecessors (reference) protected multiple abort < 'nt:version'
- jcr:successors (reference) protected multiple abort < 'nt:version'
- jcr:activity (reference) protected abort < 'nt:activity'
+ jcr:frozenNode (nt:frozenNode) protected abort
[nt:frozenNode] > mix:referenceable
orderable
- jcr:frozenPrimaryType (name) mandatory autocreated protected abort
- jcr:frozenMixinTypes (name) protected multiple abort
- jcr:frozenUuid (string) mandatory autocreated protected abort
- * (undefined) protected abort
- * (undefined) protected multiple abort
+ * (nt:base) protected sns abort
[nt:versionedChild]
- jcr:childVersionHistory (reference) mandatory autocreated protected abort <
'nt:versionHistory'
[nt:query]
- jcr:statement (string)
- jcr:language (string)
[nt:activity] > mix:referenceable
- jcr:activityTitle (string) mandatory autocreated protected
[mix:simpleVersionable] mixin
- jcr:isCheckedOut (boolean) = 'true' mandatory autocreated protected ignore
[mix:versionable] > mix:simpleVersionable, mix:referenceable mixin
- jcr:versionHistory (reference) mandatory protected ignore < 'nt:versionHistory'
- jcr:baseVersion (reference) mandatory protected ignore < 'nt:version'
- jcr:predecessors (reference) mandatory protected multiple ignore < 'nt:version'
- jcr:mergeFailed (reference) protected multiple abort
- jcr:activity (reference) protected < 'nt:version'
- jcr:configuration (reference) protected ignore < 'nt:configuration'
[nt:configuration] > mix:versionable
- jcr:root (reference) mandatory autocreated protected
[nt:address]
- jcr:protocol (string)
- jcr:host (string)
- jcr:port (string)
- jcr:repository (string)
- jcr:workspace (string)
- jcr:path (path)
- jcr:id (weakreference)
[nt:naturalText]
- jcr:text (string)
- jcr:messageId (string)
// ------------------------------------------------------------------------
// Pre-defined Mixins
// ------------------------------------------------------------------------
[mix:etag] mixin
- jcr:etag (string) protected autocreated
[mix:lockable] mixin
- jcr:lockOwner (string) protected ignore
- jcr:lockIsDeep (boolean) protected ignore
[mix:lifecycle] mixin
- jcr:lifecyclePolicy (reference) protected initialize
- jcr:currentLifecycleState (string) protected initialize
[mix:managedRetention] > mix:referenceable mixin
- jcr:hold (string) protected multiple
- jcr:isDeep (boolean) protected multiple
- jcr:retentionPolicy (reference) protected
[mix:shareable] > mix:referenceable mixin
[mix:title] mixin
- jcr:title (string)
- jcr:description (string)
[mix:language] mixin
- jcr:language (string)
5.2 ModeShape built-in node types

The following is the CND representation of the ModeShape-specific built-in node types. Note that many of
these outline the structure of nodes under the '/jcr:system' area of the repository and are protected
(meaning clients can view but not directly modify their content).
//------------------------------------------------------------------------------
// N A M E S P A C E S
//------------------------------------------------------------------------------
<jcr = "http://www.jcp.org/jcr/1.0">
<nt = "http://www.jcp.org/jcr/nt/1.0">
<mix = "http://www.jcp.org/jcr/mix/1.0">
<mode = "http://www.modeshape.org/1.0">
//------------------------------------------------------------------------------
// N O D E   T Y P E S
//------------------------------------------------------------------------------
[mode:namespace] > nt:base
- mode:uri (string) primary protected version
- mode:generated (boolean) protected version
[mode:namespaces] > nt:base

+ * (mode:namespace) = mode:namespace protected version
[mode:nodeTypes] > nt:base
+ * (nt:nodeType) = nt:nodeType protected version
[mode:lock] > nt:base
- mode:lockedKey (string) protected ignore
- jcr:lockOwner (string) protected ignore
- mode:lockingSession (string) protected ignore
- mode:expirationDate (date) protected ignore
- mode:sessionScope (boolean) protected ignore
- jcr:isDeep (boolean) protected ignore
- mode:isHeldBySession (boolean) protected ignore
- mode:workspace (string) protected ignore
[mode:locks] > nt:base
+ * (mode:lock) = mode:lock protected ignore
[mode:versionHistoryFolder] > nt:base
+ * (nt:versionHistory) = nt:versionHistory protected ignore
+ * (mode:versionHistoryFolder) protected ignore
[mode:versionStorage] > mode:versionHistoryFolder
[mode:system] > nt:base
+ mode:namespaces (mode:namespaces) = mode:namespaces autocreated mandatory protected abort
+ mode:locks (mode:locks) = mode:locks autocreated mandatory protected abort
+ jcr:nodeTypes (mode:nodeTypes) = mode:nodeTypes autocreated mandatory protected abort
+ jcr:versionStorage (mode:versionStorage) = mode:versionStorage autocreated mandatory protected abort
[mode:root] > nt:base, mix:referenceable orderable
- * (undefined) multiple version
- * (undefined) version
+ jcr:system (mode:system) = mode:system autocreated mandatory protected ignore
// This is the same as 'nt:resource' (which should generally be used instead)...
[mode:resource] > nt:base, mix:mimeType, mix:lastModified
[mode:share] > mix:referenceable
// Used for non-original shared nodes, but never really exposed to JCR clients
- mode:sharedUuid (reference) mandatory protected initialize
[mode:hashed] mixin
- mode:sha1 (string)
// A marker node type that can be used to denote areas into which files can be published.
// Published areas have optional titles and descriptions.
[mode:publishArea] > mix:title mixin
[mode:derived] mixin
- mode:derivedFrom (path) // the location of the original information from which this was derived
- mode:derivedAt (date) // the timestamp of the last change to the original information from which this was derived
6 Built-in sequencers
ModeShape comes with a number of ready-to-use sequencers. All you have to do is configure them and be
ready to work with the generated output. This section of the documentation describes each of ModeShape's
built-in sequencers.
6.1 Compact Node Type (CND) files

This sequencer processes JCR Compact Node Definition (CND) files to extract the node definitions with their
property definitions, and inserts these into the repository using aliases of the JCR built-in types. The node
structure generated by this sequencer is equivalent to the node structure used in
/jcr:system/jcr:nodeTypes.
As an example, the CND file below:
<mode = "http://www.modeshape.org/1.0">
[mode:example] mixin
- mode:name (string) multiple copy
+ mode:child (mode:example) = mode:example version
The resulting graph structure contains the node type information from the CND file above. Note that
comments are not sequenced.
<mode:example jcr:primaryType=cnd:nodeType
cnd:isQueryable=true
cnd:hasOrderableChildNodes=false
cnd:nodeTypeName=mode:example
cnd:supertypes=[]
cnd:isAbstract=false
cnd:isMixin=true/>
<cnd:propertyDefinition cnd:requiredType=STRING
jcr:primaryType=cnd:propertyDefinition
cnd:multiple=true
cnd:autoCreated=false
cnd:onParentVersion=COPY
cnd:mandatory=false
cnd:defaultValues=[]
cnd:isFullTextSearchable=true
cnd:isQueryOrderable=true
cnd:name=mode:name
cnd:availableQueryOperators=[]
cnd:protected=false
cnd:valueConstraints=[] />
<cnd:childNodeDefinition jcr:primaryType=cnd:childNodeDefinition
cnd:sameNameSiblings=false
cnd:autoCreated=false
cnd:onParentVersion=VERSION
cnd:defaultPrimaryType=mode:example
cnd:mandatory=false
cnd:name=mode:child
cnd:protected=false
cnd:requiredPrimaryTypes=[mode:example] />
This sequencer can be added to the repository configuration like so:
{
"name" : "CNDSequencer Test Repository",
"sequencing" : {
"sequencers" : {
"CND Sequencer" : {
"description" : "CND Sequencer Same Location",
"classname" : "CNDSequencer",
"pathExpressions" : [ "default://(*.cnd)/jcr:content[@jcr:data]" ]
}
}
}
}
As with other sequencers, you may want to use a more restrictive input path expression. For example, if you
only want to sequence the CND files stored anywhere under the "/global/nodeTypes/cnd" area in the
"metadata" workspace, then the path expression might be this:
metadata:/global/nodeTypes/cnd//(*.cnd)/jcr:content[@jcr:data]
6.2 DDL files

The DDL file sequencer included in ModeShape is capable of parsing the more important DDL statements
from SQL-92, Oracle, Derby, and PostgreSQL, and constructing a graph structure containing a structured
representation of these statements. The resulting graph structure is largely the same for all dialects, though
some dialects have non-standard additions to their grammar, and thus require dialect-specific additions to
the graph structure.
The sequencer is designed to behave as intelligently as possible with as little configuration as possible. Thus, the
sequencer automatically determines the dialect used by a given DDL stream. This can be tricky, of course,
since most dialects are very similar and the distinguishing features of a dialect may only be apparent in
some of the statements.
To get around this, the sequencer uses a "best fit" algorithm: run the DDL stream through the parser for
each of the dialects, and determine which parser was able to successfully read the greatest number of
statements and tokens.
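The "best fit" selection can be sketched in a few lines of Java. This is a minimal illustration, not the sequencer's implementation: BestFitDialect, choose and keywordScorer are hypothetical names, and the keyword-counting scorer is only a stand-in for a real parser that would report how many statements and tokens it read successfully.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.ToIntFunction;

/** Hypothetical sketch of a "best fit" choice between dialects; not ModeShape code. */
final class BestFitDialect {

    /**
     * Runs every scorer over the DDL and returns the name with the highest score.
     * On a tie, the dialect that comes first in the map's iteration order wins.
     */
    static String choose(Map<String, ToIntFunction<String>> scorers, String ddl) {
        String best = null;
        int bestScore = Integer.MIN_VALUE;
        for (Map.Entry<String, ToIntFunction<String>> entry : scorers.entrySet()) {
            int score = entry.getValue().applyAsInt(ddl);
            if (score > bestScore) {
                best = entry.getKey();
                bestScore = score;
            }
        }
        return best;
    }

    /** Stand-in scorer (assumption): counts occurrences of the given upper-case keywords. */
    static ToIntFunction<String> keywordScorer(String... keywords) {
        return ddl -> {
            String upper = ddl.toUpperCase(Locale.ROOT);
            int score = 0;
            for (String keyword : keywords) {
                if (keyword.isEmpty()) continue;
                int from = 0;
                while ((from = upper.indexOf(keyword, from)) >= 0) {
                    score++;
                    from += keyword.length();
                }
            }
            return score;
        };
    }

    public static void main(String[] args) {
        Map<String, ToIntFunction<String>> scorers = new LinkedHashMap<>();
        scorers.put("sql92", keywordScorer("CREATE TABLE"));
        scorers.put("oracle", keywordScorer("CREATE TABLE", "VARCHAR2"));
        System.out.println(choose(scorers, "CREATE TABLE t (c VARCHAR2(10))"));
    }
}
```

Using an insertion-ordered map means the order in which dialects are registered decides ties, which loosely mirrors the idea of listing grammars in the order they should be used.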
It is possible to define which DDL dialects (or grammars) should be considered during sequencing
using the "grammars" property in the sequencer configuration. Set the values of this property to the
names of the grammars (e.g., "oracle", "postgres", "sql92", or "derby"), specified in the order they
should be used. To use a custom DDL parser not provided by ModeShape, simply provide the
fully-qualified class name of the implementation class. If this custom parser implementation is not
found on the default classpath, additional classpath URLs can be specified using the "classpath"
property of the sequencer.
Although only a subset of the (more common) DDL statements is parsed in detail, the sequencer still adds
every statement to the output graph; statements it does not understand are recorded with little more than
the statement text and the position in the DDL file. Thus, if a DDL file contains both statements the
sequencer understands and statements it does not, the graph still contains all of them, and those
understood by the sequencer have full detail. Since the underlying parsers are able to operate upon a
single statement, it is possible to go back later (after the parsers have been enhanced to support
additional DDL statements) and re-parse only those incomplete statements in the graph.
At this time, the sequencer supports SQL-92 standard DDL as well as dialects from Oracle, Derby, and
PostgreSQL. It supports:
Detailed parsing of CREATE SCHEMA, CREATE TABLE and ALTER TABLE.
Partial parsing of DROP statements.
General parsing of the remaining schema definition statements (e.g., CREATE VIEW, CREATE DOMAIN).
Note that the sequencer does not perform detailed parsing of SQL statements (e.g., SELECT, INSERT,
UPDATE).
The DDL sequencer is being included as a Technology Preview. It is fully functional for the dialects listed
above and may indeed work on certain DDL files that use other dialects. But we would like to have feedback
from users, test against more DDL examples, support additional dialects, and support more kinds of DDL
statements.
Below is an example DDL schema definition statement containing table and view definition statements.
CREATE SCHEMA hollywood

CREATE TABLE films (title varchar(255), release date, producerName varchar(255))
CREATE VIEW winners AS SELECT title, release FROM films WHERE producerName IS NOT NULL;
The resulting graph structure contains the raw statement expression, pertinent table, column and key
reference information and position of the statement in the text stream (e.g., line number, column number and
character index) so the statement can be tied back to the original DDL:
<nt:unstructured jcr:name="statements"
                 jcr:mixinTypes="mode:derived"
                 ddl:parserId="POSTGRES">
    <nt:unstructured jcr:name="hollywood" jcr:mixinTypes="ddl:createSchemaStatement"
                     ddl:startLineNumber="1"
                     ddl:startColumnNumber="1"
                     ddl:expression="CREATE SCHEMA hollywood"
                     ddl:startCharIndex="0">
        <nt:unstructured jcr:name="films" jcr:mixinTypes="ddl:createTableStatement"
                         ddl:expression="CREATE TABLE films (title varchar(255), release date, producerName varchar(255))"
                         ddl:startCharIndex="28">
            <nt:unstructured jcr:name="title" jcr:mixinTypes="ddl:columnDefinition"
                             ddl:datatypeName="VARCHAR"
                             ddl:datatypeLength="255"/>
            <nt:unstructured jcr:name="release" jcr:mixinTypes="ddl:columnDefinition"
                             ddl:datatypeName="DATE"/>
            <nt:unstructured jcr:name="producerName" jcr:mixinTypes="ddl:columnDefinition"
                             ddl:datatypeName="VARCHAR"
                             ddl:datatypeLength="255"/>
        </nt:unstructured>
        <nt:unstructured jcr:name="winners" jcr:mixinTypes="ddl:createViewStatement"
                         ddl:expression="CREATE VIEW winners AS SELECT title, release FROM films WHERE producerName IS NOT NULL;"
                         ddl:queryExpression="SELECT title, release FROM films WHERE producerName IS NOT NULL"
                         ddl:startCharIndex="113"/>
    </nt:unstructured>
</nt:unstructured>
Note that all nodes are of type nt:unstructured while the type of statement is identified using mixins.
Also, each of the nodes representing a statement contains: a ddl:expression property with the exact
statement as it appeared in the original DDL stream; a ddl:startLineNumber and
ddl:startColumnNumber property defining the position in the original DDL stream of the first character in
the expression; and a ddl:startCharIndex property that defines the integral index of the first character
in the expression as found in the DDL stream. All of these properties make sure the statement can be traced
back to its location in the original DDL.
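The relationship between a character index and a line/column position can be sketched as follows. This is an illustration only: DdlPosition and lineAndColumn are hypothetical names, and the sketch assumes a 0-based character index, 1-based line and column numbers, and lines separated by '\n'. How the sequencer itself treats other line terminators or tabs is not specified here.

```java
/** Hypothetical helper relating a character index to a line and column; not ModeShape code. */
final class DdlPosition {

    /**
     * Returns {line, column}, both 1-based, for a 0-based character index into the DDL text.
     * Assumption: lines are separated by '\n'.
     */
    static int[] lineAndColumn(String ddl, int charIndex) {
        if (charIndex < 0 || charIndex > ddl.length()) {
            throw new IndexOutOfBoundsException("charIndex out of range: " + charIndex);
        }
        int line = 1;
        int lineStart = 0;  // index of the first character of the current line
        for (int i = 0; i < charIndex; i++) {
            if (ddl.charAt(i) == '\n') {
                line++;
                lineStart = i + 1;
            }
        }
        return new int[] { line, charIndex - lineStart + 1 };
    }

    public static void main(String[] args) {
        // Assumed layout: the second statement is indented by four spaces on the second line.
        String ddl = "CREATE SCHEMA hollywood\n    CREATE TABLE films (title varchar(255))";
        int[] position = lineAndColumn(ddl, 28);
        System.out.println("line " + position[0] + ", column " + position[1]);
    }
}
```

With the layout assumed in main, index 0 maps to line 1, column 1, and index 28 is the first character of the CREATE TABLE statement on the second line.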
To use this sequencer, simply include the modeshape-sequencer-ddl JAR in your application and
configure the repository to use this sequencer using something similar to:
{
"name" : "DdlSequencer Test Repository",
"sequencing" : {
"sequencers" : [
{
"description" : "Ddl sequencer test",
"classname" : "DdlSequencer",
"pathExpressions" : [ "default://(*.ddl)/jcr:content[@jcr:data] => default:/ddl"
]
}
]
}
}
This will use all of the built-in grammars (e.g., "sql92", "oracle", "postgres", and "derby"). To specify a
different order or subset of the grammars, use the grammars parameter. Here's an example that just uses
the Standard grammar followed by the PostgreSQL grammar:
{
    "sequencing" : {
        "sequencers" : [
            {
                "grammars" : ["sql92", "postgres"]
            }
        ]
    }
}
And, to use a custom implementation simply use the fully-qualified name of the implementation class (which
must have a no-arg constructor) as the name of the grammar:
{
    "sequencing" : {
        "sequencers" : {
            "DDL Sequencer" : {
                "grammars" : ["sql92", "postgres", "org.example.ddl.MyCustomDdlParser"]
            }
        }
    }
}
6.3 Image files

This sequencer extracts metadata from JPEG, GIF, BMP, PCX, PNG, IFF, RAS, PBM, PGM, PPM and PSD
image files. This sequencer extracts the file format, image resolution, number of bits per pixel and optionally
number of images, comments and physical resolution, and then writes this information into the repository
using the following structure:
image:metadata node of type image:metadata
jcr:encoding - optional string property for the encoding of the image
image:formatName - string property for the name of the format
image:width - optional integer property for the image's width in pixels
image:height - optional integer property for the image's height in pixels
image:bitsPerPixel - optional integer property for the number of bits per pixel
image:progressive - optional boolean property specifying whether the image is stored in a progressive (i.e., interlaced) form
image:numberOfImages - optional integer property for the number of images stored in the file; defaults to 1
image:physicalWidthDpi - optional integer property for the physical width of the image in dots per inch
image:physicalHeightDpi - optional integer property for the physical height of the image in dots per inch
image:physicalWidthInches - optional double property for the physical width of the image in inches
image:physicalHeightInches - optional double property for the physical height of the image in inches
jcr:mimeType - optional string property for the mime type of the image
This structure could be extended in the future to add EXIF and IPTC metadata as child nodes. For example,
EXIF metadata is structured as tags in directories, where the directories form something like namespaces
and are used by different camera vendors to store custom metadata. This structure could be mapped with
each directory (e.g., "EXIF", "Nikon Makernote", or "IPTC") as the name of a child node, with the EXIF
tag values stored as either properties or child nodes.
To use this sequencer, simply include the modeshape-sequencer-images JAR in your application and
configure the repository in a similar fashion to:
{
"name" : "Image Sequencer Config",
"sequencing" : {
"sequencers" : {
"Image Sequencer" : {
"description" : "Images sequencer",
"classname" : "ImageSequencer",
"pathExpressions" : [ "default://(*.(gif|png|pict|jpg))/jcr:content[@jcr:data]"
]
}
}
}
}
6.4 Java source and class files

The Java File and the Class File sequencers are a pair of sequencers which parse .java or .class files. They
are both located in the modeshape-sequencer-java module.
6.4.1 Node Structure

Both sequencers produce the same structure, based on the following type definition:
<class='http://www.modeshape.org/sequencer/javaclass/1.0'>
[class:annotationMember]
- class:name (string) mandatory
- class:value (string)
[class:annotation]
+ * (class:annotationMember) = class:annotationMember
[class:annotations]
+ * (class:annotation) = class:annotation
[class:field]
- class:typeClassName (string) mandatory
- class:visibility (string) mandatory < 'public', 'protected', 'package', 'private'
- class:static (boolean) mandatory
- class:final (boolean) mandatory
- class:transient (boolean) mandatory
- class:volatile (boolean) mandatory
+ class:annotations (class:annotations) = class:annotations
[class:fields]
+ * (class:field) = class:field
[class:interfaces]
- * (string)
[class:method]
- class:returnTypeClassName (string) mandatory
- class:static (boolean) mandatory
- class:abstract (boolean) mandatory
- class:strictFp (boolean) mandatory
- class:native (boolean) mandatory
- class:synchronized (boolean) mandatory
- class:parameters (string) multiple
+ class:methodParameters (class:parameters) = class:parameters
[class:methods]
+ * (class:method) = class:method
[class:constructors]
+ * (class:method) = class:method
[class:parameters]
+ * (class:parameter) = class:parameter
[class:parameter]
- class:typeClassName (string) mandatory
[class:class]
- class:sequencedDate (date)
- class:superClassName (string)
- class:abstract (boolean) mandatory
- class:interface (boolean) mandatory
- class:strictFp (boolean) mandatory
- class:interfaces (string) multiple
+ class:constructors (class:constructors) = class:constructors
+ class:methods (class:methods) = class:methods
+ class:fields (class:fields) = class:fields
[class:enum] > class:class
- class:enumValues (string) mandatory multiple
which means that given either a .java or a .class input, the output will be:
<nt:unstructured jcr:name="packageSegment1">
...
<nt:unstructured jcr:name="packageSegmentN">
<class:class jcr:name="ClassName">
<class:annotations jcr:name="class:annotations">
<class:annotation jcr:name="AnnotationName1"/>
...
<class:annotation jcr:name="AnnotationNameN"/>
</class:annotations>
<class:constructors jcr:name="class:constructors">
<class:constructor jcr:name="constructor parameters">
...
</class:constructor>
</class:constructors>
<class:methods jcr:name="class:methods">
<class:method jcr:name="methodName(parameters)">
...
</class:method>
</class:methods>
<class:fields jcr:name="class:fields">
<class:field jcr:name="fieldName">
...
</class:field>
</class:fields>
</class:class>
</nt:unstructured>
...
</nt:unstructured>
6.4.2 Java Source File Sequencer

This sequencer parses Java source code added to the repository and extracts the basic structure of the
classes and enumerations defined in the code. This structure includes the package structures, class
declarations, class and member attribute declarations, class and member method declarations with
signature (but not implementation logic), enumerations with each enumeration literal value, and annotations
(including annotations with RetentionPolicy.SOURCE). After extracting this information from the source
code, the sequencer then writes this structure into the repository, where it can be further processed,
analyzed, searched, navigated, or referenced.
The org.modeshape.sequencer.javafile.JavaFileSequencer class provides a JavaBean
property that can be used to specify a custom
org.modeshape.sequencer.javafile.SourceFileRecorder implementation to use to map the
extracted metadata to an output location:
Property                      Description
sourceFileRecorderClassName   Optional property that, if set, provides the name of a class that provides a
                              custom implementation of the SourceFileRecorder interface. This class
                              must have a no-argument, public constructor. If set, an instance of this
                              class will be created immediately and reused for all subsequent sequencing
                              activity for this sequencer. If this property is set to null, a default
                              implementation (org.modeshape.sequencer.javafile.ClassSourceFileRecorder)
                              will be used. The default value of this property is null.
In ModeShape 2.x this sequencer also had a different recorder implementation,
OriginalFormatSourceFileRecorder, but the structure it produced did not match the structure produced
by the ClassFileSequencer, which the default recorder follows. Therefore, the
OriginalFormatSourceFileRecorder class has been removed in ModeShape 3.x and later.
To use this sequencer, simply include the modeshape-sequencer-java JAR (plus all of the JARs that it is
dependent upon) in your application and configure your repository similar to:
{
"name" : "Java Sequencers Test Repository",
"sequencing" : {
"sequencers" : [
{
"description" : "Java Sequencer",
"classname" : "javasourcesequencer",
"pathExpressions" : [ "default://(*.java)/jcr:content[@jcr:data] => /java" ]
}
]
}
}
6.4.3 Java Class File Sequencer

The Java class file sequencer parses Java class files to extract metadata for the class, its methods, its fields,
and its annotations. The output of the sequencer can be customized by using the classFileRecorder or
classFileRecorderClassName properties to provide a custom implementation of the
org.modeshape.sequencer.classfile.ClassFileRecorder interface. A default implementation (
org.modeshape.sequencer.classfile.DefaultClassFileRecorder) is provided that records all
extracted metadata to the output location.
As noted previously, the org.modeshape.sequencer.classfile.ClassFileSequencer class
provides a pair of JavaBean properties that can be used to specify a custom
org.modeshape.sequencer.classfile.ClassFileRecorder implementation to use to map the
extracted metadata to an output location:
Property                     Description
classFileRecorder            Optional property that, if set, provides an instance of the
                             org.modeshape.sequencer.classfile.ClassFileRecorder
                             interface that will be used for all subsequent sequencing activity for
                             this sequencer. If this property is set to null, a default implementation
                             will be used. The default value of this property is null.
classFileRecorderClassName   Optional property that, if set, provides the name of a class that
                             provides a custom implementation of the
                             org.modeshape.sequencer.classfile.ClassFileRecorder
                             interface. This class must have a no-argument, public constructor. If
                             set, an instance of this class will be created immediately and reused
                             for all subsequent sequencing activity for this sequencer. If this
                             property is set to null, a default implementation will be used. The
                             default value of this property is null.
To use this sequencer, simply include the modeshape-sequencer-java JAR in your application and
configure your repository similar to:
{
"name" : "Java Sequencers Test Repository",
"sequencing" : {
"sequencers" : {
"Class File Sequencer" : {
"classname" : "ClassSequencer",
"pathExpression" : "default://(*.class)/jcr:content[@jcr:data] => /classes"
}
}
}
}
6.5 Microsoft Office files

The Microsoft Sequencer is not supported in ModeShape 3.3 due to a critical flaw in the Apache
POI libraries (see https://issues.apache.org/bugzilla/show_bug.cgi?id=54682 for more information).
However, it was added back in with ModeShape 3.4.0.Final.
The Microsoft Office sequencer is included in ModeShape and processes Microsoft Office documents,
including Word documents, Excel spreadsheets, and PowerPoint presentations. With documents, the
sequencer attempts to infer the internal structure from the heading styles. With presentations, the sequencer
extracts the slides, titles, text and slide thumbnails. With spreadsheets, the sequencer extracts the names of
the sheets and text of each sheet. For all files, the sequencer also extracts general file information,
including the name of the author, title, keywords, subject, comments, and various dates.
Example
This sequencer generates a simple graph structure containing a variety of metadata from the Office
document. The example below provides example output (in the JCR document view) from a Word document
sequenced into /document.
<document jcr:primaryType="msoffice:metadata"
jcr:mixinTypes="mode:derived"
msoffice:title="My Word Document"
msoffice:subject="My Subject"
msoffice:author="James Joyce"
msoffice:keywords="essay english term paper"
msoffice:comment="This is my English 101 term paper"
msoffice:template="term_paper.dot"
msoffice:last_saved_by="jjoyce"
msoffice:revision="42"
msoffice:total_editing_time="1023"
msoffice:last_printed="2011-05-12T14:33Z"
msoffice:created="2011-05-10T20:07Z"
msoffice:saved="2011-05-12T14:32Z"
msoffice:pages="14"
msoffice:words="3025"
msoffice:characters="12420"
msoffice:creating_application="MSWORD.EXE"
msoffice:thumbnail="..." />
As indicated in the CND below, sequencing Excel spreadsheets will add a msoffice:xlssheet child node
for each sheet containing the name (msoffice:sheet_name) and the text (msoffice:text) of that
sheet. Sequencing PowerPoint presentations adds a child node for each slide containing the title
(msoffice:title), slide text (msoffice:text), and thumbnail image (msoffice:thumbnail) of that
slide.
[msoffice:metadata] > nt:unstructured, mix:mimeType

- msoffice:title (string)
- msoffice:subject (string)
- msoffice:author (string)
- msoffice:keywords (string)
- msoffice:comment (string)
- msoffice:template (string)
- msoffice:last_saved_by (string)
- msoffice:revision (string)
- msoffice:total_editing_time (long)
- msoffice:last_printed (date)
- msoffice:created (date)
- msoffice:saved (date)
- msoffice:pages (long)
- msoffice:words (long)
- msoffice:characters (long)
- msoffice:creating_application (string)
- msoffice:thumbnail (binary)
//Word specific data
+ msoffice:heading (msoffice:heading) sns
// PowerPoint specific data
+ msoffice:slide (msoffice:pptslide) sns
// Excel specific data
- msoffice:full_content (string)
+ msoffice:sheet (msoffice:xlssheet) sns
[msoffice:pptslide]
- msoffice:title (string)
- msoffice:text (string)
- msoffice:notes (string)
- msoffice:thumbnail (binary)
[msoffice:xlssheet]
- msoffice:sheet_name (string)
- msoffice:text (string)
[msoffice:heading]
- msoffice:heading_name (string)
- msoffice:heading_level (long)
To use this sequencer, simply include the modeshape-sequencer-msoffice JAR and all of the POI
JARs in your application and configure the repository to use this sequencer using something similar to:
{
"name" : "MS Office Test Repository",
"sequencing" : {
"sequencers" : {
"MS Office Sequencer" : {
"description" : "Office sequencer",
"classname" : "msoffice",
"pathExpressions" : [ "default://(*.(xls|doc|ppt))/jcr:content[@jcr:data] => /output/$1" ]
}
}
}
}
6.6 MP3 files

Another sequencer that is included in ModeShape is the modeshape-sequencer-mp3 sequencer project.
This sequencer processes MP3 audio files added to a repository and extracts the ID3 metadata for the file,
including the track's title, author, album name, year, and comment. After extracting this information from the
audio files, the sequencer then writes this structure into the repository, where it can be further processed,
analyzed, searched, navigated, or referenced.
If the sequencing target is an existing node, this sequencer generates a node named mp3:metadata below
it. If the sequencing target is a new node, the sequencer uses that node as the root of
the sequencing output.
<mp3:metadata jcr:primaryType="mp3:metadata"
mode:derivedAt="2011-05-13T13:12:03.925Z"
mode:derivedFrom="/files/LOP.mp3"
mp3:title="Livin' on a Prayer"
mp3:author="Bon Jovi"
mp3:album="Slippery When Wet"
mp3:year="1986"
mp3:comment="Rock 'n' roll!" />
The CND used by this sequencer is provided below.
[mp3:metadata] > nt:unstructured, mix:mimeType

- mp3:title (string)
- mp3:author (string)
- mp3:album (string)
- mp3:year (long)
- mp3:comment (string)
To use this sequencer, simply include the modeshape-sequencer-mp3 JAR and the JAudioTagger library
in your application and configure the repository to use this sequencer using something similar to:
{
"name" : "Mp3 Sequencers Test Repository",
"sequencing" : {
"sequencers" : {
"MP3 Sequencer" : {
"description" : "Mp3s in the same location",
"classname" : "mp3",
"pathExpressions" : [ "default://(*.mp3)/jcr:content[@jcr:data]" ]
}
}
}
}
6.7 Teiid Relational Models

Not yet ported from 2.x. See the 2.x documentation for reference: Teiid Relational Model
Sequencer
6.8 Teiid Virtual Database (VDB) files

Not yet ported from 2.x. See the 2.x documentation for reference: Teiid VDB Sequencer
6.9 Text Files

The text sequencers extract data from text streams. There are separate sequencers for character-delimited
sequencing and fixed-width sequencing, but both treat the incoming text stream as a series of rows
(separated by line terminators, as defined in BufferedReader.readLine()), with each row consisting of one or
more columns. Each text sequencer provides its own mechanism for splitting a row into
columns.
The AbstractTextSequencer class provides a number of JavaBean properties that are common to both
of the concrete text sequencer classes:
Property               Description
commentMarker          Optional property that, if set, indicates that any line beginning with exactly
                       this string should be treated as a comment and should not be processed
                       further. If this value is null, then all lines will be sequenced. The default
                       value for this property is null.
maximumLinesToRead     Optional property that, if set, limits the number of lines that will be read
                       during sequencing. Additional lines will be ignored. If this value is
                       non-positive, all lines will be read and sequenced. Comment lines are not
                       counted towards this total. The default value of this property is -1
                       (indicating that all lines should be read and sequenced).
rowFactoryClassName    Optional property that, if set, provides the fully qualified name of a class
                       that provides a custom implementation of the RowFactory interface. This class
                       must have a no-argument, public constructor. If set, an instance of this class
                       will be created each time that the sequencer sequences an input stream and
                       will be used to provide the output structure of the graph. If this property is
                       set to null, a default implementation will be used. The default value of this
                       property is null.
Abstract Text Sequencer properties
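The interaction between commentMarker and maximumLinesToRead described in the table can be sketched with plain JDK classes. This is only an illustration of the documented semantics, not ModeShape's implementation; the class name TextRowReaderSketch and the method readRows are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the ModeShape API. */
final class TextRowReaderSketch {

    /**
     * Returns the lines that would be treated as rows.
     *
     * @param text               the full text content
     * @param commentMarker      lines starting with this string are skipped; null disables skipping
     * @param maximumLinesToRead limit on non-comment lines; a non-positive value means no limit
     */
    static List<String> readRows(String text, String commentMarker, int maximumLinesToRead) {
        List<String> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            // readLine() defines the line terminators, as the section above notes
            while ((line = reader.readLine()) != null) {
                if (commentMarker != null && line.startsWith(commentMarker)) {
                    continue; // comment lines are skipped and do not count towards the limit
                }
                if (maximumLinesToRead > 0 && rows.size() >= maximumLinesToRead) {
                    break; // additional lines are ignored
                }
                rows.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }
}
```

For example, calling readRows with the marker "#" and a limit of 2 on a file whose first and third lines are comments returns the first two non-comment lines.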
The default row factory creates one node in the output location for each row sequenced from the source and
adds each column within the row as a child node of the row node. The output graph takes the following form
(all nodes have primary type nt:unstructured):
<graph root jcr:mixinTypes = mode:derived,
            mode:derivedAt="2011-05-13T13:12:03.925Z",
            mode:derivedFrom="/files/foo.dat">
   + text:row[1]
   |    + text:column[1] (jcr:mixinTypes = text:column, text:data = <column1 data>)
   |    + ...
   |    + text:column[n] (jcr:mixinTypes = text:column, text:data = <columnN data>)
   + ...
   + text:row[m]
        + text:column[1] (jcr:mixinTypes = text:column, text:data = <column1 data>)
        + ...
        + text:column[n] (jcr:mixinTypes = text:column, text:data = <columnN data>)
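As a worked illustration of this layout, the sketch below lists the relative paths that rows of already-split columns would occupy under the output location. The class RowGraphSketch and its describe method are hypothetical names used only for this example, and the "@text:data" notation is just a readable way of printing the property.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the ModeShape API. */
final class RowGraphSketch {

    /** Returns one line per column in the form "text:row[r]/text:column[c]/@text:data = value". */
    static List<String> describe(List<List<String>> rows) {
        List<String> lines = new ArrayList<>();
        for (int r = 0; r < rows.size(); r++) {
            List<String> columns = rows.get(r);
            for (int c = 0; c < columns.size(); c++) {
                // Same-name-sibling indexes in JCR paths start at 1
                lines.add("text:row[" + (r + 1) + "]/text:column[" + (c + 1)
                        + "]/@text:data = " + columns.get(c));
            }
        }
        return lines;
    }
}
```

Two rows of two columns each therefore produce four lines, from text:row[1]/text:column[1] through text:row[2]/text:column[2].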
Delimited Text Sequencer

The DelimitedTextSequencer splits rows into columns based on a regular expression pattern. Although
the default pattern is a comma, any regular expression can be provided allowing for more sophisticated
splitting patterns.
The DelimitedTextSequencer class provides an additional JavaBean property to override the default
regular expression pattern:
Property               Description
splitPattern           Optional property that, if set, sets the regular expression pattern that is
                       used to split each row into columns. This property may not be set to null
                       and defaults to ",".
DelimitedTextSequencer properties
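The effect of splitPattern can be sketched with java.util.regex. This is a minimal illustration, not the sequencer's code: the class DelimitedSplitSketch is hypothetical, and keeping trailing empty columns (the negative limit) is an assumption of this sketch rather than documented ModeShape behavior.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the ModeShape API. */
final class DelimitedSplitSketch {

    /** Splits one row into columns using the given regular expression. */
    static List<String> splitRow(String row, String splitPattern) {
        // Assumption: a negative limit keeps trailing empty columns, so "a,b," has three columns
        return Arrays.asList(Pattern.compile(splitPattern).split(row, -1));
    }
}
```

With the default pattern ",", the row "a,b,c" yields three columns. A pattern such as "\\s*;\\s*" splits on semicolons and also discards the whitespace around them.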
To use this sequencer, simply include the modeshape-sequencer-text JAR in your application and
configure the repository to use this sequencer using something similar to:
{
"name" : "Text Sequencers Test Repository",
"sequencing" : {
"sequencers" : [
{
"name" : "Delimited text sequencer",
"classname" : "delimitedtext",
"pathExpression" : "default:/(*.csv)/jcr:content[@jcr:data] => /delimited",
"commentMarker" : "#"
}
]
}
}
Fixed Width Text Sequencer
The FixedWidthTextSequencer splits rows into columns based on predefined positions. The default
setting is to have a single column per row. It also provides an additional JavaBean property to override the
default start positions for each column.
Property               Description
columnStartPositions   Optional property that, if set, specifies an array of integers where each
                       value represents the start position of each column after the first (the start
                       position for the first column never needs to be specified, since it is always
                       '0'). The default value is an empty array, implying that each row should be
                       treated as a single column. This property may not be set to null.
FixedWidthTextSequencer properties
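The way columnStartPositions divides a row can be sketched as follows. The class FixedWidthSplitSketch is hypothetical, the positions are assumed to be in ascending order, and the handling of rows shorter than a start position (the remaining text becomes the last column) is an assumption of this sketch, not documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the ModeShape API. */
final class FixedWidthSplitSketch {

    /** Splits one row at the given start positions (ascending, excluding the implicit 0). */
    static List<String> splitRow(String row, int[] columnStartPositions) {
        List<String> columns = new ArrayList<>();
        int start = 0; // the first column always starts at position 0
        for (int next : columnStartPositions) {
            if (next >= row.length()) {
                break; // assumption: a short row simply ends with fewer columns
            }
            columns.add(row.substring(start, next));
            start = next;
        }
        columns.add(row.substring(start)); // the last column runs to the end of the row
        return columns;
    }
}
```

With start positions [3,6], the row "abcdefghi" is split into "abc", "def", and "ghi". With an empty array, the whole row is returned as a single column, matching the default described in the table.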
To use this sequencer, simply include the modeshape-sequencer-text JAR in your application and configure
the repository to use this sequencer using something similar to:
{
"name" : "Text Sequencers Test Repository",
"sequencing" : {
"sequencers" : {
"Fixed Width Text Sequencer" : {
"classname" : "fixedwidthtext",
"pathExpressions" : [ "default:/(*.txt)/jcr:content[@jcr:data] => /fixed" ],
"columnStartPositions" : [3,6],
"commentMarker" : "#"
}
}
}
}
6.10 Web Service Definition Language (WSDL) files

The WSDL sequencer included in ModeShape can parse WSDL files that adhere to the W3C's Web Service
Definition Language (WSDL) 1.1 specification, and output a representation of the WSDL file's messages,
port types, bindings, services, types (including embedded XML Schemas), documentation, and extension
elements (including HTTP, SOAP and MIME bindings). This derived information is intended to mirror the
structure and semantics of the actual WSDL files while also making it possible for ModeShape users to
easily navigate, query and search over this derived information. This sequencer captures the namespace
and names of all referenced components, and will resolve references to components appearing within the
same file.
The design of this sequencer and its output structure have been influenced by the SOA Repository Artifact
Model and Protocol (S-RAMP) draft specification, which is currently under development by an OASIS
Technical Committee. S-RAMP defines a model for a variety of file types, including WSDL and XSD. This
sequencer's output was designed to mirror that model, and thus some of the properties and node types used
are defined within the "sramp" namespace. However, the structure derived by the ModeShape WSDL
sequencer is a superset of that defined by S-RAMP.
The WSDL specification allows for a fair amount of variation in WSDL files, and consequently this variation is
reflected in the derived output structure.
Example
Let's look at an example WSDL file from the WSDL 1.1 specification:
<?xml version="1.0"?>
<definitions name="StockQuote"
targetNamespace="http://example.com/stockquote.wsdl"
xmlns:tns="http://example.com/stockquote.wsdl"
xmlns:xsd1="http://example.com/stockquote.xsd"
xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
xmlns="http://schemas.xmlsoap.org/wsdl/">
<types>
<schema targetNamespace="http://example.com/stockquote.xsd"
xmlns="http://www.w3.org/2001/XMLSchema">
<element name="TradePriceRequest">
<complexType>
<all>
<element name="tickerSymbol" type="string"/>
</all>
</complexType>
</element>
<element name="TradePrice">
<complexType>
<all>
<element name="price" type="float"/>
</all>
</complexType>
</element>
</schema>
</types>
<message name="GetLastTradePriceInput">
<part name="body" element="xsd1:TradePriceRequest"/>
</message>
<message name="GetLastTradePriceOutput">
<part name="body" element="xsd1:TradePrice"/>
</message>
<portType name="StockQuotePortType">
<operation name="GetLastTradePrice">
<input message="tns:GetLastTradePriceInput"/>
<output message="tns:GetLastTradePriceOutput"/>
</operation>
</portType>
<binding name="StockQuoteSoapBinding" type="tns:StockQuotePortType">
<soap:binding style="document"
transport="http://schemas.xmlsoap.org/soap/http"/>
<operation name="GetLastTradePrice">
<soap:operation
soapAction="http://example.com/GetLastTradePrice"/>
<input>
<soap:body use="literal"/>
</input>
<output>
<soap:body use="literal"/>
</output>
</operation>
</binding>
<service name="StockQuoteService">
<documentation>My first service</documentation>
<port name="StockQuotePort" binding="tns:StockQuoteBinding">
<soap:address location="http://example.com/stockquote"/>
</port>
</service>
</definitions>
This WSDL definition includes an embedded XML Schema that defines the structure of two XML elements
used in the web service messages, and it defines a 'StockQuotePortType' port type with input and output
messages, a SOAP binding, and a SOAP service. The WSDL sequencer will derive from this file the
following content:
<stockQuote.wsdl jcr:primaryType=wsdl:wsdlDocument jcr:mixinTypes=[mode:derived]

- jcr:uuid=d69d9fac-c5b5-42fc-ae70-0947d5986744
- sramp:contentSize=2210
- sramp:contentType="application/wsdl"
<wsdl:schema jcr:primaryType=xs:schemaDocument
- jcr:uuid=8e0b8a17-11d2-4611-bc83-ef067526329c
- sramp:contentType="application/xsd"
- targetNamespace="http://example.com/stockquote.xsd"
- xmlns:xmlns="http://www.w3.org/2001/XMLSchema"
<TradePriceRequest jcr:primaryType=xs:elementDeclaration
- jcr:uuid=8407370d-9c6a-43ad-84ee-53f480524432
- xs:abstract=false
- xs:form="qualified"
- xs:namespace=http://example.com/stockquote.xsd
- xs:ncName="TradePriceRequest"
- xs:nillable=false
- xs:typeNamespace=http://example.com/stockquote.xsd
<xs:complexType jcr:primaryType=xs:complexTypeDefinition
- jcr:uuid=5afc2fe3-e6c3-4cc2-8667-e5a9faf8963d
- xs:abstract=false
- xs:baseTypeName="anyType"
- xs:baseTypeNamespace="http://www.w3.org/2001/XMLSchema"
- xs:method="restriction"
- xs:mixed=false
<xs:all jcr:primaryType=xs:all
- jcr:uuid=e491f657-c20a-43e7-99b7-e5f76778c11e
- xs:maxOccurs=1
- xs:minOccurs=1
<tickerSymbol jcr:primaryType=xs:elementDeclaration
- jcr:uuid=22c26a7f-e9fa-4c44-a346-7df3cf436c7a
- id="string"
- name="string"
- xs:abstract=false
- xs:maxOccurs=1
- xs:minOccurs=1
- xs:ncName="tickerSymbol"
- xs:nillable=false
- xs:typeName="string"
- xs:typeNamespace=http://www.w3.org/2001/XMLSchema
<TradePrice jcr:primaryType=xs:elementDeclaration
- jcr:uuid=5667cfcc-d87e-4ef3-811c-4e64dc27f263
- xs:abstract=false
- xs:ncName="TradePrice"
- xs:nillable=false
- xs:typeNamespace=http://example.com/stockquote.xsd
<xs:complexType jcr:primaryType=xs:complexTypeDefinition
- jcr:uuid=b2eb5936-4a12-4d2f-854c-ca4b251c6a74
- xs:abstract=false
- xs:mixed=false
<xs:all jcr:primaryType=xs:all jcr:uuid=57d8f62f-71b1-44c7-8807-a1faac3582a4
- xs:maxOccurs=1
- xs:minOccurs=1
<price jcr:primaryType=xs:elementDeclaration
- jcr:uuid=049a905c-1c1d-4122-aa2f-7d2fe7d45bef
- id="float"
- name="float"
- xs:abstract=false
- xs:maxOccurs=1
- xs:minOccurs=1
- xs:ncName="price"
- xs:nillable=false
- xs:typeName="float"
<wsdl:messages jcr:primaryType=wsdl:messages
- jcr:uuid=3ae584b3-2807-4022-b1fb-c7d39d0cfc48
<GetLastTradePriceInput jcr:primaryType=wsdl:message
- jcr:uuid=6eac84de-e7e3-4e12-ac5e-d5a8dfe11c7f
- wsdl:namespace=http://example.com/stockquote.wsdl
- wsdl:ncName="GetLastTradePriceInput"
<body jcr:primaryType=wsdl:part
- jcr:uuid=28d5bc74-f21c-49c2-9850-a9992cbbf88e
- wsdl:elementName="TradePriceRequest"
- wsdl:elementNamespace=http://example.com/stockquote.xsd
- wsdl:ncName="body"
<GetLastTradePriceOutput jcr:primaryType=wsdl:message
- jcr:uuid=1be232c8-898b-49ce-90c7-8e31b20f991f
- wsdl:ncName="GetLastTradePriceOutput"
<body jcr:primaryType=wsdl:part
- jcr:uuid=06feaf78-f1ce-4f6c-a8e3-65eda3d600da
- wsdl:elementName="TradePrice"
- wsdl:elementNamespace=http://example.com/stockquote.xsd
- wsdl:ncName="body"
<wsdl:portTypes jcr:primaryType=wsdl:portTypes
- jcr:uuid=44afcb97-9b19-4dd0-98ca-191ca14495b2
<StockQuotePortType jcr:primaryType=wsdl:portType
- jcr:uuid=3e81f0fd-7759-445a-b540-1253605ce0fd
- wsdl:ncName="StockQuotePortType"
<GetLastTradePrice jcr:primaryType=wsdl:operation
- jcr:uuid=bd5d2f23-5454-4de2-9962-93c30b1be6d9
- wsdl:ncName="GetLastTradePrice"
<wsdl:input jcr:primaryType=wsdl:operationInput
- jcr:uuid=fba4398b-84c8-4ebe-8eb8-f83ce867329b
- wsdl:message=6eac84de-e7e3-4e12-ac5e-d5a8dfe11c7f
- wsdl:messageName="GetLastTradePriceInput"
- wsdl:messageNamespace="http://example.com/stockquote.wsdl"
- wsdl:ncName="GetLastTradePriceRequest"
<wsdl:output jcr:primaryType=wsdl:operationOutput
- jcr:uuid=aa7a2ef8-883e-4598-a822-15283c0b63d4
- wsdl:message=1be232c8-898b-49ce-90c7-8e31b20f991f
- wsdl:messageName="GetLastTradePriceOutput"
- wsdl:messageNamespace="http://example.com/stockquote.wsdl"
- wsdl:ncName="GetLastTradePriceResponse"
<wsdl:bindings jcr:primaryType=wsdl:bindings
- jcr:uuid=f736166e-cf40-45ec-b4a4-23243e241205
<StockQuoteBinding jcr:primaryType=wsdl:binding
- jcr:uuid=b224c1f5-d223-483b-ab43-479ceef3e015
- wsdl:ncName="StockQuoteBinding"
<StockQuoteSoapBinding jcr:primaryType=wsdl:binding
- jcr:uuid=cd65da16-bc97-479c-bb27-c9766ee5c946
- wsdl:ncName="StockQuoteSoapBinding"
- wsdl:type=3e81f0fd-7759-445a-b540-1253605ce0fd
- wsdl:typeName="StockQuotePortType"
- wsdl:typeNamespace="http://example.com/stockquote.wsdl"
<GetLastTradePrice jcr:primaryType=wsdl:bindingOperation
- jcr:uuid=949919a7-23c4-4994-853a-5a14b1fd04ed
- wsdl:ncName="GetLastTradePrice"
<wsdl:input jcr:primaryType=wsdl:bindingOperationInput
- jcr:uuid=76069b3a-c73e-4c23-be3d-b0ee6f874e7a
- wsdl:input="fba4398b-84c8-4ebe-8eb8-f83ce867329b"
- wsdl:inputName="GetLastTradePriceRequest"
- wsdl:ncName="GetLastTradePriceRequest"
<wsdl:soapBody jcr:primaryType=wsdl:soapBody
- jcr:uuid=22bd5f19-5450-4720-ab23-e4d97c8adee5
- wsdl:use="literal"
<wsdl:output jcr:primaryType=wsdl:bindingOperationOutput
- jcr:uuid=03b70411-d992-41db-ade1-de70ddd7822a
- wsdl:ncName="GetLastTradePriceResponse"
- wsdl:output="aa7a2ef8-883e-4598-a822-15283c0b63d4"
- wsdl:outputName="GetLastTradePriceResponse"
<wsdl:soapBody jcr:primaryType=wsdl:soapBody
- jcr:uuid=5d9d8127-8617-4947-b142-6d31e0b84c03
- wsdl:use="literal"
<wsdl:soapOperation jcr:primaryType=wsdl:soapOperation
- jcr:uuid=52ce3adf-b018-4148-a679-64822b870908
- wsdl:soapAction=http://example.com/GetLastTradePrice
<wsdl:soapBinding jcr:primaryType=wsdl:soapBinding
- jcr:uuid=659102a6-206e-4ebc-8d51-9b21e5dcc431
- wsdl:style="document"
- wsdl:transport=http://schemas.xmlsoap.org/soap/http
<wsdl:services jcr:primaryType=wsdl:services
- jcr:uuid=3dbd2a54-9d2d-4223-98a2-8362369e8f0d
<StockQuoteService jcr:primaryType=wsdl:service
- jcr:uuid=72420bcb-dd3f-4a5e-ba13-811af5a98bd5
- sramp:description="My first service"
- wsdl:ncName="StockQuoteService"
<StockQuotePort jcr:primaryType=wsdl:port
- jcr:uuid=24779c9f-ebe6-4030-b9cd-3f0e623b94fa
- wsdl:binding=b224c1f5-d223-483b-ab43-479ceef3e015
- wsdl:ncName="StockQuotePort"
<wsdl:soapAddress jcr:primaryType=wsdl:soapAddress
- jcr:uuid=d015a2ee-fbae-4b28-bda8-16a8295d8e02
- wsdl:soapLocation=http://example.com/stockquote
The first thing to note is that the sequencer produces a node of type wsdl:wsdlDocument that includes the
mode:derived information and information about the WSDL file itself. If the WSDL file contained
documentation elements directly under the root element, the content of those elements would have been
placed inside an sramp:description property.
Secondly, the WSDL file contains an embedded XML Schema document, and this XSD was sequenced
also. See the XML Schema sequencer documentation for the structure of the XML Schema documents. Any
references to the XSD components in the embedded schema(s) will be captured as REFERENCE properties
as well as properties containing the local name and namespace of the components.
Thirdly, there are several "container" nodes underneath the top-level wsdl:wsdlDocument node, which are
named wsdl:messages, wsdl:portTypes, wsdl:bindings, and wsdl:services. These container
nodes serve to separate out the various kinds of definitions, since per the WSDL 1.1 specification the name
scope of each kind of component is distinct from the other kinds.
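The idea of a separate name scope per kind of component can be sketched with one map per kind. This is a simplified illustration under stated assumptions: the class WsdlNameScopesSketch, its Kind enum, and the node identifiers are hypothetical, and names are treated as plain local names within a single target namespace.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry for illustration; not part of the ModeShape API. */
final class WsdlNameScopesSketch {

    enum Kind { MESSAGE, PORT_TYPE, BINDING, SERVICE }

    // One independent name scope per kind, mirroring the four container nodes
    private final Map<Kind, Map<String, String>> scopes = new EnumMap<>(Kind.class);

    /** Registers a component; returns false if the name is already used within the same kind. */
    boolean register(Kind kind, String name, String nodeId) {
        return scopes.computeIfAbsent(kind, k -> new HashMap<>()).putIfAbsent(name, nodeId) == null;
    }

    /** Returns the node identifier registered for the name within the given kind, or null. */
    String lookup(Kind kind, String name) {
        Map<String, String> scope = scopes.get(kind);
        return scope == null ? null : scope.get(name);
    }
}
```

Because each kind has its own map, a message and a port type may share the name "StockQuote" without conflict, while a second message with that name is rejected.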
Within the wsdl:messages container node are all of the messages. In this case, there are two: the
"GetLastTradePriceInput" input message and the "GetLastTradePriceOutput" output message for the
"GetLastTradePrice" operation defined a bit later in the structure. Note how these messages contain the
name, namespace URI, and REFERENCE to the corresponding element node in the embedded schema
content. (If the element reference could not be resolved, the REFERENCE property would not be set.)
Within the wsdl:portTypes container node are all of the port types. In this example, there is just one: the
"StockQuotePortType" that contains a single "GetLastTradePrice" operation. Here, the operation's
input and output reference the corresponding message nodes via the name, namespace URI, and
REFERENCE property. Again, the REFERENCE property would not be set if the input and/or output use a
message that is not in this WSDL file.
Within the wsdl:bindings container node are all of the bindings defined in the WSDL. In this example,
there is just a single binding that uses SOAP extensions, which describe all of the SOAP-specific information
for the port type. The sequencer also supports HTTP and MIME extensions. Also note how the input, output,
and faults of each binding operation reference (using the name, namespace URI, and REFERENCE
properties) the corresponding input, output, and fault (respectively) in the correct port type.
Finally, within the wsdl:services container node are all of the services defined in the WSDL. In this
example, there is just a single SOAP service that references the "StockQuotePortType" port type.
This example shows the basic structure this sequencer derives from WSDL 1.1 files. Not only does this
structure mirror that of the actual WSDL file, but it is also easy to navigate, search, and query,
especially because it includes the names and namespace URIs of the referenced components (and sets
REFERENCE properties to the referenced components where possible).
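The name, namespace URI, and optional REFERENCE described above can be sketched with the JDK's javax.xml.namespace.QName. This is an illustration only: the class WsdlReferenceSketch, its methods, and the idea of an in-memory index from qualified name to node identifier are hypothetical and do not describe how the sequencer is implemented.

```java
import java.util.Map;
import java.util.Optional;
import javax.xml.namespace.QName;

/** Hypothetical helper for illustration; not part of the ModeShape API. */
final class WsdlReferenceSketch {

    /** Splits "prefix:local" and looks the prefix up in the in-scope namespace declarations. */
    static QName resolve(String lexicalName, Map<String, String> namespacesByPrefix) {
        int colon = lexicalName.indexOf(':');
        String prefix = colon < 0 ? "" : lexicalName.substring(0, colon);
        String local = lexicalName.substring(colon + 1);
        String namespace = namespacesByPrefix.getOrDefault(prefix, "");
        return new QName(namespace, local, prefix);
    }

    /** Returns the identifier of the referenced component if it is defined in the same file. */
    static Optional<String> findReference(QName name, Map<QName, String> componentsInFile) {
        // QName equality uses only the namespace URI and local part, never the prefix
        return Optional.ofNullable(componentsInFile.get(name));
    }
}
```

For the example WSDL, resolving "xsd1:TradePriceRequest" gives the namespace "http://example.com/stockquote.xsd" and the local name "TradePriceRequest". An empty Optional corresponds to the case where the name and namespace are still recorded but no REFERENCE property is set.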
Node Types
The WSDL 1.1 sequencer follows JCR best-practices by defining all nodes to have a primary
type that allows any single or multi-valued property, meaning it's possible and valid for any node to have any
property (with single or multiple values). This sequencer doesn't add any such properties or nodes, but you
are free to annotate the structure as needed.
The compact node definitions for the "wsdl" namespace are as follows:
//-----------------------------------------------------------------------------
// N A M E S P A C E S
//-----------------------------------------------------------------------------
<jcr='http://www.jcp.org/jcr/1.0'>
<sramp = "http://s-ramp.org/xmlns/2010/s-ramp">
<xs = "http://www.w3.org/2001/XMLSchema">
<wsdl = "http://schemas.xmlsoap.org/wsdl/">
//-----------------------------------------------------------------------------
// N O D E   T Y P E S
//-----------------------------------------------------------------------------
[wsdl:wsdlExtension] > sramp:derivedArtifactType
- wsdl:ncName (string)
- wsdl:namespace (uri)
[wsdl:wsdlDerivedArtifactType] > sramp:derivedArtifactType abstract
- wsdl:namespace (uri)
+ * (wsdl:wsdlExtension)
[wsdl:namedWsdlDerivedArtifactType] > wsdl:wsdlDerivedArtifactType
- wsdl:ncName (string)
/*
* Messages and parts
*/
[wsdl:part] > wsdl:namedWsdlDerivedArtifactType
- wsdl:element (reference) < 'xs:elementDeclaration'
- wsdl:elementName (string)
- wsdl:elementNamespace (uri)
- wsdl:type (reference) < 'xs:simpleTypeDefinition'
- wsdl:typeName (string)
- wsdl:typeNamespace (uri)
[wsdl:message] > wsdl:namedWsdlDerivedArtifactType
+ * (wsdl:part) = wsdl:part multiple
/*
* Port types, operations, inputs, outputs, and faults
*/
[wsdl:operationInput] > wsdl:namedWsdlDerivedArtifactType
- wsdl:message (reference) < 'wsdl:message'
[wsdl:operationOutput] > wsdl:namedWsdlDerivedArtifactType
- wsdl:message (reference) < 'wsdl:message'
[wsdl:fault] > wsdl:namedWsdlDerivedArtifactType
- wsdl:message (reference) mandatory < 'wsdl:message'
[wsdl:operation] > wsdl:namedWsdlDerivedArtifactType
- wsdl:parameterOrder (string) multiple
+ wsdl:input (wsdl:operationInput) = wsdl:operationInput
+ wsdl:output (wsdl:operationOutput) = wsdl:operationOutput
+ wsdl:fault (wsdl:fault) = wsdl:fault sns
[wsdl:portType] > wsdl:namedWsdlDerivedArtifactType
+ * (wsdl:operation) sns
/*
* Bindings, binding operations, inputs, outputs
*/
[wsdl:bindingOperationOutput] > wsdl:namedWsdlDerivedArtifactType
- wsdl:output (reference) < 'wsdl:operationOutput'
- wsdl:outputName (string)
[wsdl:bindingOperationInput] > wsdl:namedWsdlDerivedArtifactType
- wsdl:input (reference) < 'wsdl:operationInput'
- wsdl:inputName (string)
[wsdl:bindingOperationFault] > wsdl:namedWsdlDerivedArtifactType
[wsdl:bindingOperation] > wsdl:namedWsdlDerivedArtifactType
+ wsdl:input (wsdl:bindingOperationInput) = wsdl:bindingOperationInput
+ wsdl:output (wsdl:bindingOperationOutput) = wsdl:bindingOperationOutput
+ wsdl:fault (wsdl:bindingOperationFault) = wsdl:bindingOperationFault sns
[wsdl:binding] > wsdl:namedWsdlDerivedArtifactType
- wsdl:type (reference) < 'wsdl:portType'
+ * (wsdl:bindingOperation) sns
/*
* Ports and services
*/
[wsdl:port] > wsdl:namedWsdlDerivedArtifactType
- wsdl:binding (reference) < 'wsdl:binding'
- wsdl:bindingName (string)
- wsdl:bindingNamespace (uri)
[wsdl:service] > wsdl:namedWsdlDerivedArtifactType
+ * (wsdl:port) sns
/*
* Types, schemas, and schema references
*/
[wsdl:referencedXsd] > sramp:derivedArtifactType abstract
- xs:id (string)
- xs:schemaLocation (string)
- * (undefined)
[wsdl:importedXsd] > wsdl:referencedXsd
- xs:namespace (uri) mandatory
[wsdl:includedXsd] > wsdl:referencedXsd
[wsdl:redefinedXsd] > wsdl:referencedXsd
/*
* The containers for the different kinds of components within WSDL documents.
* Strictly speaking, the containers should not allow SNS, but these components'
* names in WSDL are QNames, and we're only using the local part for the node name.
* Therefore, two components might have the same local part but different namespaces.
* (This is probably not a common occurrence.)
*/
[wsdl:container] > sramp:derivedArtifactType abstract
- * (string)
- * (string) multiple
[wsdl:messages] > wsdl:container
+ * (wsdl:message) = wsdl:message sns
[wsdl:portTypes] > wsdl:container
+ * (wsdl:portType) = wsdl:portType sns
[wsdl:bindings] > wsdl:container
+ * (wsdl:binding) = wsdl:binding sns
[wsdl:services] > wsdl:container
+ * (wsdl:service) = wsdl:service sns
/*
* WSDL documents
*/
[wsdl:wsdlDocument] > sramp:xmlDocument
- wsdl:importedXsds (weakreference) multiple < 'xs:schemaDocument'
- wsdl:includedXsds (weakreference) multiple < 'xs:schemaDocument'
- wsdl:redefinedXsds (weakreference) multiple < 'xs:schemaDocument'
- wsdl:importedWsdls (weakreference) multiple < 'wsdl:wsdlDocument'
+ wsdl:schema (xs:schemaDocument) = xs:schemaDocument sns
+ wsdl:importedXsd (wsdl:importedXsd) sns
+ wsdl:includedXsd (wsdl:includedXsd) sns
+ wsdl:redefinedXsd (wsdl:redefinedXsd) sns
+ wsdl:messages (wsdl:messages) = wsdl:messages
+ wsdl:portTypes (wsdl:portTypes) = wsdl:portTypes
+ wsdl:bindings (wsdl:bindings) = wsdl:bindings
+ wsdl:services (wsdl:services) = wsdl:services
// ------------------------------------------------------
// HTTPWSDL Model
// ------------------------------------------------------
[wsdl:httpExtension] > wsdl:wsdlExtension
[wsdl:httpAddress] > wsdl:httpExtension
- wsdl:location (uri) mandatory
[wsdl:httpBinding] > wsdl:httpExtension
- wsdl:verb (string) mandatory
[wsdl:httpOperation] > wsdl:httpExtension
- wsdl:location (uri) mandatory
[wsdl:httpUrlEncoded] > wsdl:httpExtension
[wsdl:httpUrlReplacement] > wsdl:httpExtension
// ------------------------------------------------------
// SOAPWSDL Model
// ------------------------------------------------------
[wsdl:soapExtension] > wsdl:wsdlExtension
[wsdl:soapAddress] > wsdl:soapExtension
- wsdl:soapLocation (uri) mandatory
[wsdl:soapBinding] > wsdl:soapExtension
- wsdl:style (string)
- wsdl:transport (uri)
[wsdl:soapOperation] > wsdl:soapExtension
- wsdl:style (string)
- wsdl:soapAction (uri)
[wsdl:soapBody] > wsdl:soapExtension
- wsdl:encodingStyle (uri) multiple
- wsdl:parts (string) multiple
- wsdl:use (string) < 'literal','encoded'
[wsdl:soapFault] > wsdl:soapExtension
[wsdl:soapHeader] > wsdl:soapExtension
- wsdl:message (string)
- wsdl:part (string)
+ * (wsdl:soapHeaderFault) = wsdl:soapHeaderFault
[wsdl:soapHeaderFault] > wsdl:soapExtension
// ------------------------------------------------------
// SOAPMIME Model
// ------------------------------------------------------
[wsdl:mimeExtension] > wsdl:wsdlExtension
[wsdl:mimeMultipartRelated] > wsdl:mimeExtension
+ wsdl:mimePart (wsdl:mimePart) sns
[wsdl:mimePart] > wsdl:mimeExtension
+ * (wsdl:mimeExtension) sns
[wsdl:mimeContent] > wsdl:mimeExtension
- wsdl:mimeType (string)
- wsdl:mimePart (string)
[wsdl:mimeXml] > wsdl:mimeExtension
- wsdl:mimePart (string)
Configuration
To use this sequencer, simply include the appropriate version of the Maven artifact with an "org.modeshape"
group ID and "modeshape-sequencer-wsdl" artifact ID and configure your repository similar to:
{
"name" : "WSDL Sequencer Test Repository",
"sequencing" : {
"sequencers" : {
"WSDL Sequencer" : {
"classname" : "wsdlsequencer",
"pathExpressions" : [ "default:/(*.wsdl)/jcr:content[@jcr:data] => /wsdl" ]
}
}
}
}
6.11 XML files

This sequencer stores the structure and data of an XML file in the repository. DTD declarations, entities,
comments, and other content are preserved by the sequencer in the output structure.
Example
For this XML document:

<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
<!ENTITY % RH-ENTITIES SYSTEM "Common_Config/rh-entities.ent">
<!ENTITY versionNumber "0.1">
<!ENTITY copyrightYear "2008">
<!ENTITY copyrightHolder "Red Hat Middleware, LLC.">]>
<?target content ?>
<?target2 other stuff ?>
<Cars xmlns:jcr="http://www.jcp.org/jcr/1.0">
<!-- This is a comment -->
<Hybrid>
<car jcr:name="Toyota Prius"/>
</Hybrid>
<Sports>
</Sports>
</Cars>
The sequencer will generate this content (assuming its output is redirected to xml/myxml):
<xml jcr:primaryType="nt:unstructured">
<myxml jcr:primaryType="modexml:document"
mode:derivedAt="2011-05-13T13:12:03.925Z"
mode:derivedFrom="/files/docForReferenceGuide.xml"
modedtd:name="book"
modedtd:publicId="-//OASIS//DTD DocBook XML V4.4//EN"
modedtd:systemId="http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<modedtd:entity jcr:primaryType="modedtd:entity"
modedtd:name="%RH-ENTITIES"
modedtd:systemId="Common_Config/rh-entities.ent" />
<modedtd:entity[2] jcr:primaryType="modedtd:entity"
modedtd:name="versionNumber"
modedtd:value="0.1" />
<modedtd:entity[3] jcr:primaryType="modedtd:entity"
modedtd:name="copyrightYear"
modedtd:value="2008" />
<modedtd:entity[4] jcr:primaryType="modedtd:entity"
modedtd:name="copyrightHolder"
modedtd:value="Red Hat Middleware, LLC." />
<modexml:processingInstruction jcr:primaryType="modexml:processingInstruction"
modexml:processingInstructionContent="content"
modexml:target="target" />
<modexml:processingInstruction[2] jcr:primaryType="modexml:processingInstruction"
modexml:processingInstructionContent="other stuff"
modexml:target="target2" />
<Cars jcr:primaryType="modexml:element">
<modexml:comment jcr:primaryType="modexml:comment"
modexml:commentContent="This is a comment" />
<Hybrid jcr:primaryType="modexml:element">
<car jcr:primaryType="modexml:element" />
</Hybrid>
<Sports jcr:primaryType="modexml:element" />
</Cars>
</myxml>
</xml>
The CND used by this sequencer is provided below. Note that the XML sequencer will parse CDATA into its
own node in the sequenced output even though the example above does not explicitly demonstrate this.
//-----------------------------------------------------------------------------
// N A M E S P A C E S
<modexml='http://www.modeshape.org/xml/1.0'>
<modedtd='http://www.modeshape.org/dtd/1.0'>
//-----------------------------------------------------------------------------
// N O D E   T Y P E S
//-----------------------------------------------------------------------------
[modexml:document] > nt:unstructured, mix:mimeType
- modexml:cDataContent (string)
[modexml:comment] > nt:unstructured
- modexml:commentContent (string)
[modexml:element] > nt:unstructured
[modexml:elementContent] > nt:unstructured
- modexml:elementContent (string)
[modexml:cData] > nt:unstructured
- modexml:cDataContent (string)
[modexml:processingInstruction] > nt:unstructured
- modexml:processingInstruction (string)
- modexml:target (string)
[modedtd:entity] > nt:unstructured
- modexml:name (string)
- modexml:value (string)
- modexml:publicId (string)
- modexml:systemId (string)
To use this sequencer, include modeshape-sequencer-xml.jar in your classpath and configure your
repository similar to:
{
"name" : "XML Sequencer Test Repository",
"sequencing" : {
"sequencers" : {
"XML Sequencer" : {
"classname" : "xmlsequencer",
"pathExpressions" : [ "default:/(*.xml)/jcr:content[@jcr:data] => /xml" ]
}
}
}
}
6.12 XML Schema Document (XSD) files

The XSD sequencer included in ModeShape can parse XML Schema Documents that adhere to the W3C's
XML Schema Part 1 and Part 2 specifications, and output a representation of the XSD's attribute
declarations, element declarations, simple type definitions, complex type definitions, import statements,
include statements, attribute group declarations, annotations, other components, and even attributes with a
non-schema namespace. This derived information is intended to accurately reflect the structure and
semantics of the XSD files while also making it possible for ModeShape users to easily navigate, query and
search over this derived information. This sequencer captures the namespace and names of all referenced
components, and will resolve references to components appearing within the same files.
The design of this sequencer and its output structure have been influenced by the SOA Repository Artifact
Model and Protocol (S-RAMP) draft specification, which is currently under development by an OASIS
Technical Committee. S-RAMP defines a model for a variety of file types, including WSDL and XSD. This
sequencer's output was designed to mirror that model, and thus some of the properties and node types used
are defined within the "sramp" namespace.
The XML Schema specification is powerful, flexible, rich, and complicated. This means that many XML
Schema Documents themselves are complicated. But it also means that there is a lot of variation in XSDs,
and consequently there is a lot of variation in the output structure that this sequencer derives from XSD files.
Example
So before we get too far, let's look at an example XML Schema Document taken from the XML Schema
Primer:
<?xml version="1.0" encoding="ISO-8859-1" ?>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:annotation>
<xsd:documentation xml:lang="en">
Purchase order schema for Example.com.
Copyright 2000 Example.com. All rights reserved.
</xsd:documentation>
</xsd:annotation>
<xsd:element name="purchaseOrder" type="PurchaseOrderType"/>
<xsd:element name="comment" type="xsd:string"/>
<xsd:complexType name="PurchaseOrderType">
<xsd:sequence>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="items" type="Items"/>
</xsd:sequence>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
<xsd:complexType name="USAddress">
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
<xsd:element name="street" type="xsd:string"/>
<xsd:element name="city" type="xsd:string"/>
<xsd:element name="state" type="xsd:string"/>
<xsd:element name="zip" type="xsd:decimal"/>
</xsd:sequence>
<xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
</xsd:complexType>
<xsd:complexType name="Items">
<xsd:sequence>
<xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="quantity">
<xsd:simpleType>
<xsd:restriction base="xsd:positiveInteger">
<xsd:maxExclusive value="100"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element name="USPrice" type="xsd:decimal"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="partNum" type="SKU" use="required"/>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>

<xsd:simpleType name="SKU">
<xsd:restriction base="xsd:string">
<xsd:pattern value="\d{3}-[A-Z]{2}"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:schema>
This schema defines the structure of several XML elements used to represent purchase orders, and
describes an XML document such as the following:
<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
<shipTo country="US">
<name>Alice Smith</name>
<street>123 Maple Street</street>
<city>Mill Valley</city>
<state>CA</state>
<zip>90952</zip>
</shipTo>
<billTo country="US">
<name>Robert Smith</name>
<street>8 Oak Avenue</street>
<city>Old Town</city>
<state>PA</state>
<zip>95819</zip>
</billTo>
<comment>Hurry, my lawn is going wild!</comment>
<items>
<item partNum="872-AA">
<productName>Lawnmower</productName>
<quantity>1</quantity>
<USPrice>148.95</USPrice>
<comment>Confirm this is electric</comment>
</item>
<item partNum="926-AA">
<productName>Baby Monitor</productName>
<quantity>1</quantity>
<USPrice>39.98</USPrice>
<shipDate>1999-05-21</shipDate>
</item>
</items>
</purchaseOrder>
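The "SKU" simple type in the schema restricts "partNum" values with the pattern \d{3}-[A-Z]{2}. As a rough illustration of what that facet accepts, the following sketch checks values with plain java.util.regex. It is not part of ModeShape or the XSD sequencer, the class and method names are made up for this example, and XSD and Java regular expression dialects differ in some details (for instance, Java's \d matches only ASCII digits by default).

```java
import java.util.regex.Pattern;

public class SkuPatternSketch {

    // Pattern copied from the xsd:pattern facet of the SKU simple type.
    private static final Pattern SKU = Pattern.compile("\\d{3}-[A-Z]{2}");

    /** Returns true when the whole value matches the SKU pattern. */
    public static boolean isValidSku(String value) {
        // XSD patterns apply to the entire value, so whole-string matching
        // via matches() is the closest Java analogue.
        return SKU.matcher(value).matches();
    }

    public static void main(String[] args) {
        for (String partNum : new String[] {"872-AA", "926-AA", "87-AAA"}) {
            System.out.println(partNum + " valid=" + isValidSku(partNum));
        }
    }
}
```

Both "partNum" values in the purchase order above ("872-AA" and "926-AA") satisfy the pattern; a value such as "87-AAA" does not.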
The XSD sequencer will derive the following content from the above XSD:
<po.xsd jcr:primaryType=xs:schemaDocument jcr:mixinTypes=[mode:derived]

- jcr:uuid=ca46f972-6875-481d-b9e1-cfb64ae76f74
- sramp:contentEncoding="UTF-8"
- sramp:contentType="application/xsd"
- sramp:description="Purchase order schema for Example.com.
Copyright 2000 Example.com. All rights reserved.">
<purchaseOrder jcr:primaryType=xs:elementDeclaration
- jcr:uuid=eff3bcfb-42d1-4d55-805b-5133279e15eb
- xs:abstract=false
- xs:ncName="purchaseOrder"
- xs:nillable=false
- xs:type=5088dc05-ad30-4d7d-8d24-3edc548a777f
- xs:typeName="PurchaseOrderType"/>
<comment jcr:primaryType=xs:elementDeclaration
- jcr:uuid=2daaa747-01f1-41f3-b5c2-ec218d8a7290
- xs:abstract=false
- xs:ncName="comment"
- xs:nillable=false
- xs:typeNamespace=http://www.w3.org/2001/XMLSchema/>
<PurchaseOrderType jcr:primaryType=xs:complexTypeDefinition
- jcr:uuid=5088dc05-ad30-4d7d-8d24-3edc548a777f
- xs:abstract=false
- xs:mixed=false
- xs:ncName="PurchaseOrderType"/>
<xs:sequence jcr:primaryType=xs:sequence
- jcr:uuid=1b87d92d-4d59-44ac-859f-2a51c3a48eb2
- xs:maxOccurs=1
- xs:minOccurs=1>
<shipTo jcr:primaryType=xs:elementDeclaration
- jcr:uuid=994ba18b-c389-4635-8ce3-27d3a81cf97d
- xs:abstract=false
- xs:maxOccurs=1
- xs:minOccurs=1
- xs:ncName="shipTo"
- xs:nillable=false
- xs:type=dd683707-83bb-4893-aa6e-f3ce81237e76
- xs:typeName="USAddress"/>
<billTo jcr:primaryType=xs:elementDeclaration
- jcr:uuid=e260c1aa-5a5a-4db5-a962-b02576359ee7
- xs:abstract=false
- xs:maxOccurs=1
- xs:minOccurs=1
- xs:ncName="billTo"
- xs:nillable=false
- xs:type=dd683707-83bb-4893-aa6e-f3ce81237e76
- xs:typeName="USAddress"/>
<comment jcr:primaryType=xs:elementDeclaration
- jcr:uuid=a7796d20-0e7b-4833-96b6-16e0ac6676ca
- xs:abstract=false
- xs:maxOccurs=1
- xs:minOccurs=0
- xs:nillable=false
- xs:ref=2daaa747-01f1-41f3-b5c2-ec218d8a7290
- xs:refName="comment"/>
<items jcr:primaryType=xs:elementDeclaration
- jcr:uuid=02ab83d1-ea1a-4a7b-b66d-a1974f13ca63
- xs:abstract=false
- xs:maxOccurs=1
- xs:minOccurs=1
- xs:ncName="items"
- xs:nillable=false
- xs:type=7543bf0f-1753-4813-9a31-f2bbed34fd11
- xs:typeName="Items"/>
<orderDate jcr:primaryType=xs:attributeDeclaration
- jcr:uuid=8b23e048-c683-4d6d-8835-faf81df6912d
- xs:ncName="orderDate"
- xs:typeName="date"
- xs:use="optional"/>
<USAddress jcr:primaryType=xs:complexTypeDefinition
- jcr:uuid=dd683707-83bb-4893-aa6e-f3ce81237e76
- xs:abstract=false
- xs:mixed=false
- xs:ncName="USAddress"/>
<xs:sequence jcr:primaryType=xs:sequence
- jcr:uuid=82411c47-7f1a-4b11-9778-acc310c9e51c
- xs:maxOccurs=1
- xs:minOccurs=1>
<name jcr:primaryType=xs:elementDeclaration
- jcr:uuid=40dcb6fc-386c-4d3a-841b-dab478348d74
- xs:abstract=false
- xs:maxOccurs=1
- xs:minOccurs=1
- xs:ncName="name"
- xs:nillable=false
<street jcr:primaryType=xs:elementDeclaration
- jcr:uuid=a3ff1a2d-38e7-442a-a46b-141fa1ac4442
- xs:abstract=false
- xs:maxOccurs=1
- xs:minOccurs=1
- xs:ncName="street"
- xs:nillable=false
- xs:typeNamespace=http://www.w3.org/2001/XMLSchema />
<city jcr:primaryType=xs:elementDeclaration
- jcr:uuid=30d4215f-cd44-4857-9589-3df127e42cf3
- xs:abstract=false
- xs:maxOccurs=1
- xs:minOccurs=1
- xs:ncName="city"
- xs:nillable=false
<state jcr:primaryType=xs:elementDeclaration
- jcr:uuid=061a58d9-94fd-4dca-84e2-6ced7fe523fe
- xs:abstract=false
- xs:maxOccurs=1
- xs:minOccurs=1
- xs:ncName="state"
- xs:nillable=false
<zip jcr:primaryType=xs:elementDeclaration
- jcr:uuid=100dc3cc-b59f-4835-b14e-243b9e7a2ecf
- xs:abstract=false
- xs:maxOccurs=1
- xs:minOccurs=1
- xs:ncName="zip"
- xs:nillable=false
- xs:typeName="decimal"
<country jcr:primaryType=xs:attributeDeclaration
- jcr:uuid=f323219f-bea0-4d6f-9ad5-f51cf8409f13
- xs:ncName="country"
- xs:typeName="NMTOKEN"
- xs:use="optional"/>
<Items jcr:primaryType=xs:complexTypeDefinition
- jcr:uuid=7543bf0f-1753-4813-9a31-f2bbed34fd11
- xs:abstract=false
- xs:mixed=false
- xs:ncName="Items"/>
<xs:sequence jcr:primaryType=xs:sequence
- jcr:uuid=d907da56-f370-40e3-b06e-e3a5ae957f4d
- xs:maxOccurs=1
- xs:minOccurs=1/>
<item jcr:primaryType=xs:elementDeclaration
- jcr:uuid=87cc1352-2f90-49f4-9f36-3db7b9ffcf26
- xs:abstract=false
- xs:minOccurs=0
- xs:ncName="item"
- xs:nillable=false/>
<SKU jcr:primaryType=xs:simpleTypeDefinition
- jcr:uuid=4127108d-a699-461e-8210-3bb40c923318
- xs:baseTypeName="string"
- xs:baseTypeNamespace=http://www.w3.org/2001/XMLSchema
- xs:ncName="SKU"
- xs:pattern="\d{3}-[A-Z]{2}"/>
The first thing to note is that the sequencer produces a node of type xs:schemaDocument that includes the
mode:derived information, information about the XSD itself, plus an sramp:description property
containing the documentation content from any annotations directly under the schema element in the XSD.
Secondly, there is a node for each top-level element declaration, namely "purchaseOrder" and "comment",
with properties capturing the element's name, namespace (not shown since there is no target namespace
for the schema), and XSD type name, namespace and reference. The "comment" element declaration has a
base type of "xs:string", whereas the "purchaseOrder" element declaration has a type of
"PurchaseOrderType" (defined later in the XSD and in the derived content). Each node is
"mix:referenceable" and has a jcr:uuid property, allowing the "purchaseOrder" element declaration
to have an "xs:type" REFERENCE property pointing to the "PurchaseOrderType" complex type definition
node.
There are also nodes representing each of the global type definitions: the complex types
"PurchaseOrderType", "USAddress", and "Items", and the simple type "SKU". Each complex type node has
properties representing the complex type's features (such as abstract, mixed, name, etc.), as well as
child nodes that represent the definition of the complex type's content (e.g., sequence, choice, all,
simple content, complex content, etc.).
This example shows some of the structure that this sequencer derives from the XML Schema Documents.
Our goal for this sequencer was to output content that reflected as accurately as possible the structure of the
XML Schema Documents while also making the content easy to navigate, search and query.
Node Types
The XSD sequencer follows JCR best practices by defining all nodes to have a primary type that allows any
single- or multi-valued property, meaning it's possible and valid for any node to have any property (with single
or multiple values). In fact, this feature is used when XSD files contain attributes with non-schema
namespaces, which are then mapped onto properties with the attribute's name and possibly-empty
namespace. However, it is still useful to capture the metadata about what that node represents, and so the
sequencer uses explicit node type definitions and mixins for this.
The compact node definitions for the "xs" namespace are as follows:
//------------------------------------------------------------------------------
// N A M E S P A C E S
//------------------------------------------------------------------------------
<sramp = "http://s-ramp.org/xmlns/2010/s-ramp">
<xs = "http://www.w3.org/2001/XMLSchema">

//------------------------------------------------------------------------------
// N O D E T Y P E S
//------------------------------------------------------------------------------
[xs:component] > sramp:derivedArtifactType abstract
- xs:id (string)
- * (undefined)
[xs:namespaced] mixin
- xs:namespace (uri)
[xs:located] mixin
- xs:schemaLocation (string)
[xs:import] > xs:component, xs:located, xs:namespaced
[xs:include] > xs:component, xs:located
[xs:redefine] > xs:component, xs:located
[xs:named] > xs:namespaced mixin
- xs:ncName (string)
[xs:typeDefinition] > xs:component

// A mixin representing a reference to an 'xs:typeDefinition'
[xs:typed] mixin
- xs:typeName (string)
- xs:typeNamespace (uri)
- xs:type (weakreference) < 'xs:typeDefinition'
// Attribute wildcard
[xs:anyAttribute] > xs:component
- xs:minOccurs (long) < '[0,)'
- xs:maxOccurs (long) < '[0,)'
- xs:namespace (uri) multiple
- xs:processContents (string) = 'strict' < 'lax', 'strict', 'skip'
//
// The 'group', 'all', 'sequence' and 'choice' components
//
[xs:modelGroup] > xs:component abstract
- xs:refName (string)
- xs:refNamespace (uri)
- xs:ref (weakReference) < 'xs:modelGroup'
+ * (xs:elementDeclaration)
[xs:group] > xs:modelGroup
+ 'xs:anyAttribute' (xs:anyAttribute)
[xs:all] > xs:modelGroup
[xs:sequence] > xs:modelGroup
+ 'xs:sequence' (xs:sequence)
+ 'xs:choice' (xs:choice)
+ 'xs:all' (xs:all)
[xs:choice] > xs:modelGroup
+ 'xs:all' (xs:all)
//
// The 'simpleContent' and 'complexContent' components
//
[xs:complexContent] > xs:component
- xs:method (string) < 'restriction', 'extension'
+ * (xs:attributeDeclaration)
+ * (xs:attributeGroup)
+ * (xs:group)
+ 'xs:all' (xs:all)
[xs:simpleContent] > xs:component
- xs:method (string) < 'restriction', 'extension'
- xs:minValueExclusive (*)
- xs:minValueInclusive (*)
- xs:maxValueExclusive (*)
- xs:maxValueInclusive (*)
- xs:totalDigits (long) < '[0,]'
- xs:fractionDigits (long) < '[0,]'
- xs:length (long)
- xs:maxLength (long) < '[0,]'
- xs:minLength (long) < '[0,]'
- xs:enumeratedValues (string) multiple
- xs:whitespace (string) < 'preserve','collapse','replace'
- xs:pattern (string)
+ * (xs:attributeDeclaration) sns
+ * (xs:attributeGroup) sns
+ * (xs:simpleTypeDefinition) sns
+ 'xs:anyAttribute' (xs:anyAttribute)
//
// Attribute Groups
//
[xs:attributeGroup] > xs:component
- xs:ncName (string)
- xs:namespace (uri)
- xs:ref (weakReference) < 'xs:attributeGroup'
+ * (xs:attributeDeclaration) sns
+ * (xs:attributeGroup) sns
//
// Complex and simple type definitions
//
[xs:complexTypeDefinition] > xs:typeDefinition, xs:named
- xs:abstract (boolean) = 'false'
- xs:mixed (boolean) = 'false'
- xs:block (string) multiple < 'restriction', 'extension', 'all'
- xs:final (string) multiple < 'restriction', 'extension', 'all'
+ * (xs:complexContent) sns
+ * (xs:simpleContent) sns
+ * (xs:group) sns
+ 'xs:all' (xs:all)
[xs:simpleTypeDefinition] > xs:typeDefinition, xs:named
- xs:baseTypeName (string)
- xs:baseTypeNamespace (uri)
- xs:baseType (weakreference) < 'xs:typeDefinition'
- xs:final (string) multiple < 'restriction', 'list', 'union', 'all'
//
// Attribute declaration
//
[xs:attributeDeclaration] > xs:component, xs:named, xs:typed
- xs:length (long)
- xs:maxLength (long)
- xs:minLength (long)
- xs:enumeratedValues (string) multiple
- xs:whitespace (string) < 'preserve','collapse','replace'
- xs:maxValueExclusive (*)
- xs:minValueExclusive (*)
- xs:maxValueInclusive (*)
- xs:minValueInclusive (*)
- xs:totalDigits (long)
- xs:fractionDigits (long)
- xs:pattern (string)
- xs:use (string)
//
// Identity constraint definition
//
[xs:selector] > xs:component
- xs:xpath (string) mandatory
[xs:field] > xs:component
- xs:xpath (string) mandatory
[xs:identityConstraintDefinition] > xs:component abstract
- xs:ncName (string) mandatory
+ 'selector' (xs:selector)
+ 'field' (xs:field) sns
[xs:unique] > xs:identityConstraintDefinition
[xs:key] > xs:identityConstraintDefinition
[xs:keyref] > xs:identityConstraintDefinition
- xs:refer (string) mandatory
//
// Element declaration
//
[xs:elementDeclaration] > xs:component, xs:named, xs:typed
- xs:abstract (boolean) = 'false'
- xs:nillable (boolean) = 'false'
- xs:final (string) multiple < 'all', 'extension', 'restriction'
- xs:block (string) multiple < 'all', 'extension', 'restriction', 'substitution'
- xs:default (string)
- xs:fixed (string)
- xs:form (string) < 'qualified', 'unqualified'
- xs:ref (weakReference) < 'xs:elementDeclaration'
- xs:substitutionGroupName (string)
- xs:substitutionGroup (weakReference) < 'xs:elementDeclaration'
+ * (xs:typeDefinition)
+ * (xs:identityConstraintDefinition)
//
// XML Schema Document
//
[xs:schemaDocument] > sramp:xmlDocument
- xs:id (string)
- xs:targetNamespace (uri)
- xs:version (string)
- xs:attributeFormDefault (string) = 'unqualified' < 'qualified', 'unqualified'
- xs:elementFormDefault (string) = 'unqualified' < 'qualified', 'unqualified'
- xs:finalDefault (string) multiple < 'all', 'extension', 'restriction', 'list', 'union'
- xs:blockDefault (string) multiple < 'all', 'extension', 'restriction', 'substitution'
- xs:importedXsds (weakreference) multiple < 'xs:xsdDocument'
- xs:includedXsds (weakreference) multiple < 'xs:xsdDocument'
- xs:redefinedXsds (weakreference) multiple < 'xs:xsdDocument'
- * (undefined)
+ * (xs:import) sns
+ * (xs:include) sns
+ * (xs:redefine) sns
// Technically need 'sns' because the attributes, elements, simple types, complex types,
// attribute groups, and groups don't share same name scopes
+ * (xs:elementDeclaration) sns
+ * (xs:group) sns
+ * (xs:simpleTypeDefinition) sns
+ * (xs:complexTypeDefinition) sns
These types use some of the node types and mixins defined in the "sramp" namespace.
Configuration
To use this sequencer, include the appropriate version of the Maven artifacts with the "org.modeshape"
group ID and the "modeshape-sequencer-xsd" and "modeshape-sequencer-sramp" artifact IDs. Or, if
you're using JAR files and manually setting up the classpath for your application, use the
"modeshape-sequencer-xsd-2.7.0.Final-jar-with-dependencies.jar" file. Then, define a
sequencing configuration in the ModeShape configuration, using something similar to:
{
"name" : "XSD Sequencer Test Repository",
"sequencing" : {
"sequencers" : {
"XSD Sequencer" : {
"classname" : "xsdsequencer",
"pathExpressions" : [ "default:/(*.xsd)/jcr:content[@jcr:data]" ]
}
}
}
}
6.13 ZIP files

The ZIP file sequencer extracts the files and folders contained in a ZIP archive file into the repository,
using JCR's built-in nt:file and nt:folder node types. The structure of the
output thus matches the logical structure of the contents of the ZIP file.
Example
This sequencer generates a graph structure that maps to the files and folders in the ZIP file. An example
(listed in the JCR document view) from sequencing a ZIP file written into /a/foo and containing one file,
/x/y/z.txt is provided below:
<foo jcr:primaryType="zip:file"
jcr:mixinTypes="mode:derived">
<x jcr:primaryType="nt:folder"
jcr:created="2011-05-12T20:07Z"
jcr:createdBy="currentJcrUser">
<y jcr:primaryType="nt:folder"
jcr:created="2011-05-12T20:09Z"
jcr:createdBy="currentJcrUser">
<z.txt jcr:primaryType="nt:file">
<jcr:content jcr:primaryType="nt:resource"
jcr:data="This is the file content"
jcr:lastModified="2011-05-12T20:12Z"
jcr:lastModifiedBy="currentJcrUser"
jcr:mimeType="text/plain" />
</z.txt>
</y>
</x>
</foo>
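To make the shape of that output easier to picture, here is a small sketch that walks a ZIP archive with the JDK's java.util.zip classes and reports which node type each path would correspond to under the mapping described above. It does not use the ZIP sequencer or any ModeShape API; the class name, the method names, and the in-memory sample archive are assumptions made purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipLayoutSketch {

    /** Builds a tiny in-memory ZIP containing the single file x/y/z.txt. */
    public static byte[] sampleZip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("x/y/z.txt"));
            zip.write("This is the file content".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Maps every path in the archive (including parent folders) to a JCR node type. */
    public static Map<String, String> describe(byte[] zipBytes) {
        Map<String, String> nodes = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                String[] segments = entry.getName().split("/");
                StringBuilder path = new StringBuilder();
                for (int i = 0; i < segments.length; i++) {
                    path.append('/').append(segments[i]);
                    // Only the last segment of a non-directory entry is a file.
                    boolean isFile = (i == segments.length - 1) && !entry.isDirectory();
                    nodes.putIfAbsent(path.toString(), isFile ? "nt:file" : "nt:folder");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return nodes;
    }

    public static void main(String[] args) {
        describe(sampleZip()).forEach((path, type) -> System.out.println(path + " -> " + type));
    }
}
```

For the sample archive, "/x" and "/x/y" are reported as nt:folder and "/x/y/z.txt" as nt:file, mirroring the nesting in the document view above.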
The CND for the zip:file node type is listed below.
[zip:file] > nt:folder, mix:mimeType
To use this sequencer, simply include the modeshape-sequencer-zip JAR in your application and
configure the repository similar to:
{
"name" : "ZIP Sequencer Test Repository",
"sequencing" : {
"sequencers" : {
"ZIP Sequencer" : {
"classname" : "zipsequencer",
"pathExpressions" : [ "default:/(*.zip)/jcr:content[@jcr:data] => /zip" ]
}
}
}
}
7 Built-in connectors
ModeShape comes with several ready-to-use connectors so that you can set up repositories that federate
data from external systems. See the introduction to federation to learn more about how federation works.
7.1 File system connector

This connector exposes files and folders on the file system as nt:file and nt:folder nodes in the
repository. To use, configure an external source for a given file system (or area of the repository); each
external source can be set up as read-only (to only expose the file system's existing files and folders) or as
writable (to allow JCR clients to create/update/delete files and folders on the file system).
The File System Connector maps nt:file and nt:folder properties directly to the attributes on the file
system's files and folders. By default, ModeShape will store these extra properties in the same Infinispan
cache where the normal content is stored, though such content will be lost if files and folders are moved or
renamed outside of ModeShape. Several other options are possible, including storing these extra properties
on the file system using "sidecar" files that are named similarly to and stored adjacent to the target file or
folder. See the extraPropertiesStorage attribute description below for more detail.
The connector does not currently monitor the file system for newly created files or folders, and therefore no
events are created. However, navigation will always expose the current files/folder nodes within a folder.
ModeShape can index the content so that the projected nt:file, nt:folder, and nt:resource nodes
can be queried, but this must be done manually via the Workspace API's "reindex" methods.
As of ModeShape 3.4.0.Final, the file system connector is pageable, which means it can efficiently
expose folders that contain large numbers of items. Paging is a tradeoff between loading the parent
node faster (by having smaller numbers of child references) and having to go back to the connector
more frequently. By default, the connector includes 20 items per page; the page size can
be adjusted to best suit your application's needs.
The connector classname is "org.modeshape.connector.filesystem.FileSystemConnector", and
there are several attributes that should be configured on each external source:
Attribute Name
Description
directoryPath
The path to the file or folder that is to be accessed by this connector.
extraPropertyStorage
An optional string flag that specifies how this source handles "extra"
properties that are not stored via file system attributes. The value should be
one of the following:
store - Any extra properties are stored in the same Infinispan cache where the content is stored. This is the default and is used if the actual value doesn't match any of the other accepted values.
json - Any extra properties are stored in a JSON file next to the file or directory.
legacy - Any extra properties are stored in a ModeShape 2.x-compatible file next to the file or directory. This is generally discouraged unless you were using ModeShape 2.x and have a directory structure that already contains these files.
none - An exception is thrown if the nodes contain any extra properties.
inclusionPattern
Optional property that specifies a regular expression that is used to help

determine which files and folders in the underlying file system are exposed
through this connector. The connector will expose only those files and folders
with a name that matches the provided regular expression (as long as they
also are not excluded by the exclusionPattern). If no inclusion pattern is
specified, then the connector will include all files and folders that are not
excluded via the exclusionPattern.
exclusionPattern
Optional property that specifies a regular expression that is used to help

determine which files and folders in the underlying file system are not
exposed through this connector. Files and folders with a name that matches
the provided regular expression will not be exposed by this source.
addMimeTypeMixin
A boolean flag that specifies whether this connector should add the
'mix:mimeType' mixin to the 'nt:resource' nodes to include the
'jcr:mimeType' property. If set to true, the MIME type is computed
immediately when the 'nt:resource' node is accessed, which might be
expensive for larger files. This is false by default.
readOnly
A boolean flag that specifies whether this source can create/modify/remove

files and directories on the file system to reflect changes in the JCR content.
By default, sources are not read-only.
cacheTtlSeconds
Optional property that specifies the default maximum number of seconds (i.e.,
time to live) that a node returned by this connector should be cached in the
workspace cache before being expired. By default, the connector will not set
a special value, and the repository will determine (via its workspace cache
configuration) how long the node is cached in the workspace cache. Notice
that these cached values are updated/purged any time a file is changed via
ModeShape. You may want to set this property if the files on the file system
are changing frequently.
isQueryable
Optional property that specifies whether or not the content exposed by this
connector should be indexed by the repository. This acts as a global flag,
allowing a specific connector to mark its entire content as non-queryable. By
default, all content exposed by a connector is queryable.
pageSize
(Added in 3.4.0.Final) Optional advanced property that controls the number

of children that the connector should include in a single page; the default is
20. For example, if a folder contains 200 items (e.g., files or folders) and the
page size is 20, then the connector will include in the document representing
this folder only the properties of the folder and the first 20 items (that are
readable, that satisfy the inclusion pattern, and that do not match the
exclusion pattern). As additional children are needed (e.g., as the
ModeShape client navigates or accesses the folder's child nodes),
ModeShape will request additional pages, each with up to 20 items.
contentBasedSha1
(Added in 3.6.0.Final) Optional advanced boolean property that controls

whether the binary value's hash values are SHA-1s based upon the file
contents. This property is "true" by default, and therefore has exactly the
same behavior as all other binary values within the repository. The connector
has to compute the SHA-1 every time a binary value is returned (including
every "jcr:data" property on the "jcr:content" children of "nt:file"
nodes). If the underlying files are changed by processes other than
ModeShape, the computed SHA-1 may not accurately represent the changed
file contents, though the time ModeShape caches the SHA-1 in the binary
value is controlled as part of the connector's cacheTtlSeconds property.
Also, computing the SHA-1 can be quite expensive and time consuming for
very large files and may thus introduce a lengthy and noticeable lag when
returning a "jcr:content" node until the SHA-1 is computed. If you are
using very large files, consider setting this "contentBasedSha1" property to
"false" so that the connector computes the SHA-1 based upon the URL to
the file on the file system. Such a SHA-1 can be computed very quickly,
eliminating the lag for very large files mentioned above. ModeShape still uses
these SHA-1s internally in a consistent fashion (two SHA-1s will be the same
only when they are for the same file), but the BinaryValue.getHash()
and BinaryValue.getHexHash() methods will return this
non-content-based SHA-1 value. (If you are dealing with very large binary
values and are not satisfied with the speed of ModeShape dynamically
computing the SHA-1, you can subclass the FileSystemConnector and
override the sha1(File) method to compute/cache/lookup the SHA-1 for a
given file using your preferred mechanism.)
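The cost difference behind the contentBasedSha1 setting can be sketched with the JDK's MessageDigest class. The code below is only an illustration under stated assumptions: it is not the connector's implementation, the class and method names are invented, and hashing the file's URI string merely stands in for "a SHA-1 based upon the URL to the file" as described above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha1Sketch {

    private static MessageDigest sha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to provide SHA-1.
            throw new IllegalStateException(e);
        }
    }

    private static String hex(byte[] digest) {
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Content-based hash: every byte is read, so cost grows with file size. */
    public static String contentSha1(Path file) {
        MessageDigest md = sha1();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            for (int n = in.read(buffer); n != -1; n = in.read(buffer)) {
                md.update(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hex(md.digest());
    }

    /** Location-based hash (assumed form): cheap, but independent of the contents. */
    public static String urlSha1(Path file) {
        String url = file.toUri().toString();
        return hex(sha1().digest(url.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sha1-sketch", ".txt");
        Files.write(file, "abc".getBytes(StandardCharsets.UTF_8));
        System.out.println("content-based: " + contentSha1(file));
        System.out.println("url-based:     " + urlSha1(file));
        Files.delete(file);
    }
}
```

Two files with identical bytes share a content-based hash but get different location-based hashes, which is consistent with the statement above that two such SHA-1s are the same only when they refer to the same file.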
By default, the file system connector will expose all of the files and folders that are underneath the specified
directory and readable by the Java process, and it will allow ModeShape clients using the JCR API to
change, remove, or even create new files and folders. Additionally, any "extra properties" (i.e., those other
than the properties that map directly to file system attributes, such as "jcr:primaryType", "jcr:created",
"jcr:lastModified", and "jcr:data") will be stored not on the file system but in the same Infinispan
cache in which the repository's own internal (non-federated) content is stored. The connector will also use pages
to efficiently work with folders with large numbers of items.
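The paging arithmetic is simple enough to sketch. The following example splits a folder's children into consecutive pages; it is a generic illustration rather than the connector's code, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class PagingSketch {

    /** Splits child names into consecutive pages of at most pageSize items. */
    public static List<List<String>> pages(List<String> children, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<List<String>> result = new ArrayList<>();
        for (int start = 0; start < children.size(); start += pageSize) {
            int end = Math.min(start + pageSize, children.size());
            result.add(new ArrayList<>(children.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> children = new ArrayList<>();
        for (int i = 1; i <= 200; i++) {
            children.add("file-" + i + ".txt");
        }
        List<List<String>> pages = pages(children, 20);
        System.out.println("pages: " + pages.size());
        System.out.println("first page size: " + pages.get(0).size());
    }
}
```

With the default page size of 20, a folder of 200 items yields 10 pages. A larger page size means fewer round trips to the connector but a larger document for the parent folder.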
If other behavior is desired, simply set the connector's properties to non-default values. For example, if
ModeShape clients are not allowed to modify, create, or remove file and folder nodes, then the connector
should be configured with "readOnly" set to true. Or, if only certain files and folders are to be exposed,
set the inclusionPattern and exclusionPattern to regular expressions that the connector can use to
know whether to include or exclude files and folders by name. Note that any file or folder will only be
exposed by the connector when the file/folder is readable and when its name satisfies the
inclusionPattern and does not satisfy the exclusion pattern.
The connector is often used to expose the existing files and folders on the file system as content in a
repository. Since the connector does not access any OS-specific file attributes, the connector simply maps
each existing file and folder as follows:
A folder is represented in ModeShape as a node with a primary type of "nt:folder", no mixin types,
and the "jcr:created" timestamp set to the last modified timestamp given by the file system. The
node will contain a child for each file and folder that is to be exposed (as discussed above).
A file is represented in ModeShape as a node with a primary type of "nt:file", no mixin types, and
the "jcr:created" timestamp set to the last modified timestamp given by the file system. The node
will contain a single child node named "jcr:content" that represents the content of the file, and
which has a primary type of "nt:resource" and the "jcr:lastModified" timestamp set to the file
system's last modified timestamp for the file. If the connector is configured with "addMimeTypeMixin"
set to true, then ModeShape will also attempt to determine the MIME type for the file's content and,
if determined, add the "mix:mimeType" mixin and the "jcr:mimeType" property to the
"jcr:content" node.
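The mapping just described can be mimicked with java.nio.file alone. This sketch only restates the rules above as code: it does not call ModeShape, the class and method names are hypothetical, and flattening the "jcr:content" child into keys with a "jcr:content/" prefix is a simplification chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

public class NodeMappingSketch {

    /** Returns the node properties the text describes for a file or folder. */
    public static Map<String, String> describe(Path path) {
        try {
            // The only timestamp used is the file system's last modified time.
            String lastModified = Files.getLastModifiedTime(path).toInstant().toString();
            Map<String, String> node = new LinkedHashMap<>();
            if (Files.isDirectory(path)) {
                node.put("jcr:primaryType", "nt:folder");
                node.put("jcr:created", lastModified);
            } else {
                node.put("jcr:primaryType", "nt:file");
                node.put("jcr:created", lastModified);
                // The file's content lives on a single child node named jcr:content.
                node.put("jcr:content/jcr:primaryType", "nt:resource");
                node.put("jcr:content/jcr:lastModified", lastModified);
            }
            return node;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("mapping-sketch");
        Path file = Files.createTempFile(dir, "example", ".txt");
        System.out.println(dir.getFileName() + " -> " + describe(dir));
        System.out.println(file.getFileName() + " -> " + describe(file));
        Files.delete(file);
        Files.delete(dir);
    }
}
```

Note that "jcr:created" and "jcr:lastModified" carry the same value here, because both are derived from the file system's last modified timestamp.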
Here is a sample configuration that projects the "/a/b/c" directory onto a node in the repository at "/files",
with the above (default) behavior:
{
...
"externalSources" : {
"local-git-repo" : {
"classname" : "org.modeshape.connector.filesystem.FileSystemConnector",
"directoryPath" : "/a/b/c/",
"projections" : [ "/files" ]
}
}
...
}
Here is a slightly different configuration that is read-only, that excludes any files or folders with names that
end with ".tmp" (and have at least one character before this suffix), and that includes the
automatically-detected MIME type:
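{
...
"externalSources" : {
"local-git-repo" : {
"classname" : "org.modeshape.connector.filesystem.FileSystemConnector",
"directoryPath" : "/a/b/c/",
"projections" : [ "/files" ],
"readOnly" : true,
"addMimeTypeMixin" : true,
"exclusionPattern" : ".+[.]tmp$"
}
}
...
}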
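To see what such patterns accept or reject, the following sketch applies an inclusion and an exclusion pattern to file names using java.util.regex. It is a stand-alone illustration, not the connector's filter: the class name is hypothetical, and it assumes the patterns are matched against the whole file name (for the ".+[.]tmp$" pattern, whole-name and partial matching give the same answer).

```java
import java.util.regex.Pattern;

public class NameFilterSketch {

    private final Pattern inclusion; // null means "include everything"
    private final Pattern exclusion; // null means "exclude nothing"

    public NameFilterSketch(String inclusionPattern, String exclusionPattern) {
        this.inclusion = inclusionPattern == null ? null : Pattern.compile(inclusionPattern);
        this.exclusion = exclusionPattern == null ? null : Pattern.compile(exclusionPattern);
    }

    /** A name is exposed when it is included and not excluded. */
    public boolean isExposed(String name) {
        if (exclusion != null && exclusion.matcher(name).matches()) {
            return false;
        }
        return inclusion == null || inclusion.matcher(name).matches();
    }

    public static void main(String[] args) {
        NameFilterSketch filter = new NameFilterSketch(null, ".+[.]tmp$");
        for (String name : new String[] {"report.tmp", ".tmp", "notes.txt"}) {
            System.out.println(name + " exposed=" + filter.isExposed(name));
        }
    }
}
```

With the exclusion pattern from the configuration above, "report.tmp" is hidden, while a file named just ".tmp" stays visible because ".+" requires at least one character before the suffix.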
Of course, some applications may want to set additional properties and/or mixins. When the connector is
writable (i.e., not read-only), the connector can store these properties in one of several places, based upon
the "extraPropertyStorage" configuration property. By default, these extra properties are stored in the
same Infinispan cache where the ModeShape repository stores the rest of its internal (non-federated)
content. This is convenient, but can lead to orphaned documents in the Infinispan cache should files and
folders be removed outside of ModeShape.
Alternatively, the connector can store these extra properties on the file system. Any extra properties on a file
or folder will be stored in a "sidecar" file next to the corresponding file or folder and named similarly to the
corresponding file or folder but with a special suffix. If stored as a JSON file, the suffix will be
".modeshape.json", or if stored as a text file the suffix will be ".modeshape". (The text format is the same
as that used in ModeShape 2.x, but is provided only for backward compatibility. Where possible, choose the
JSON format.) Extra properties on the "jcr:content" child of "nt:file" nodes are stored in a different
sidecar file, named similarly to the corresponding file but with the ".content.modeshape.json" or
".content.modeshape" suffix. Note that these sidecar files are never exposed as nodes by the connector.
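The sidecar naming scheme can be sketched in a few lines of Java. This is only an illustration under a stated assumption: that the suffix is appended to the complete file or folder name. The helper names are hypothetical and only the JSON suffixes are covered.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class SidecarNames {
    // Suffixes quoted in the text above (JSON format only).
    static final String JSON_SUFFIX = ".modeshape.json";
    static final String JSON_CONTENT_SUFFIX = ".content.modeshape.json";

    // Sidecar holding the extra properties of the file or folder node itself.
    // Assumption: the suffix is appended to the full name, extension included.
    static Path sidecarFor(Path fileOrFolder) {
        return fileOrFolder.resolveSibling(fileOrFolder.getFileName() + JSON_SUFFIX);
    }

    // Sidecar holding the extra properties of a file's "jcr:content" child.
    static Path contentSidecarFor(Path file) {
        return file.resolveSibling(file.getFileName() + JSON_CONTENT_SUFFIX);
    }

    // Sidecars are never exposed as nodes, so a directory listing would skip them.
    // Both JSON suffixes end with ".modeshape.json", so one check covers both.
    static boolean isSidecar(Path candidate) {
        return candidate.getFileName().toString().endsWith(JSON_SUFFIX);
    }

    public static void main(String[] args) {
        Path file = Paths.get("docs", "report.txt");
        System.out.println(sidecarFor(file));
        System.out.println(contentSidecarFor(file));
    }
}
```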
It is even possible to prevent updating or creating files and folders with extra properties. To do this, simply
configure the connector with the "extraPropertyStorage" property set to "none".
Here is another sample configuration for a connector that works the same as the earlier configuration except
that it is now storing extra properties in a JSON sidecar:
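{
    ...
    "externalSources" : {
        "local-git-repo" : {
            "classname" : "org.modeshape.connector.filesystem.FileSystemConnector",
            "directoryPath" : "/a/b/c/",
            "projections" : [ "/files" ],
            "readOnly" : true,
            "addMimeTypeMixin" : true,
            "exclusionPattern" : ".+[.]tmp$",
            "extraPropertyStorage" : "json"
        }
    }
    ...
}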
7.2 Git connector

This read-only connector exposes the branches, tags, and commits in a local Git repository as nodes within
a repository. The structure is pre-defined by the connector so that the branches, tags, commits, and their
files and folders are all accessible via navigation, via identifiers, or via query (if configured).
The connector classname is "org.modeshape.connector.git.GitConnector", and there are several
attributes that should be configured on each external source:
Attribute Name | Description
directoryPath | The path to the folder that is or contains the .git data structure that is to be accessed by this connector. This is required.
includeMimeType | A boolean flag denoting whether the MIME types for the files should be determined and included as a property on the node. This is 'false' by default.
remoteName | The alias used by the local Git repository for the remote repository. The default is "origin", which is common in Git repositories. If the value contains commas, the value is treated as an ordered list of remote aliases that should be accessed; the first one to match an existing remote will be used. The remote names are used to know which branches should be exposed: if at least one remote name is given, then only the branches in the remote(s) will be exposed; if no remotes are given, then all local branches will be exposed.
queryableBranches | An array with the names of the branches that should be queryable by the repository. By default, only the "master" branch is queryable. Set this to an empty array if no branches are to be queryable.
cacheTtlSeconds | The maximum number of seconds that a node exposed by this connector is held in the workspace cache before being expired. By default, the connector will not set a special value, and the repository will determine how long the node is cached in the workspace cache.
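The "remoteName" rule (an ordered, comma-separated list where the first alias that matches an existing remote wins) can be sketched as follows. This is an illustration rather than the connector's code: the class and method names are hypothetical, and the set of existing remotes is simply passed in instead of being read from a Git repository.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

final class RemoteNameResolver {
    // Splits a value such as "upstream,origin" into an ordered list of trimmed aliases.
    static List<String> parse(String remoteName) {
        return Arrays.stream(remoteName.split(","))
                     .map(String::trim)
                     .filter(alias -> !alias.isEmpty())
                     .collect(Collectors.toList());
    }

    // Returns the first configured alias that is also an existing remote, if any.
    static Optional<String> firstExisting(String remoteName, Set<String> existingRemotes) {
        return parse(remoteName).stream()
                                .filter(existingRemotes::contains)
                                .findFirst();
    }

    public static void main(String[] args) {
        Set<String> existing = new HashSet<>(Arrays.asList("origin", "backup"));
        // "upstream" does not exist locally, so the next alias in the list is chosen.
        System.out.println(firstExisting("upstream,origin", existing).orElse("(none)"));
    }
}
```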
Here is a sample configuration that projects the Git repository located at "/home/jsmith/git/MyRepo" on
the local file system into the repository under the "/git/MyRepo" node, which will have a primary type of
"git:root". The "master" and "2.x" branches will be included in the ModeShape indexes when the
content is reindexed, and MIME types will be included on all git:resource nodes (that is, the
"jcr:content" child of the "git:file" nodes). The list of branches and tags will include those on the
"upstream" and "origin" remotes.
{
    ...
    "externalSources" : {
        "local-git-repo" : {
            "classname" : "org.modeshape.connector.git.GitConnector",
            "directoryPath" : "/home/jsmith/git/MyRepo/",
            "remoteName" : "upstream,origin",
            "includeMimeType" : true,
            "queryableBranches" : ["master","2.x"],
            "projections" : [ "/git/MyRepo" ]
        }
    }
    ...
}
And here is a description of the repository structure:
Path | Description
/branches/{branchName} | The list of branches.
/tags/{tagName} | The list of tags.
/commits/{branchOrTagNameOrCommit}/{objectId} | The history of commits on the branch, tag or object ID named "{branchOrTagNameOrCommit}", where "{objectId}" is the object ID of the commit.
/commit/{branchOrTagNameOrCommit} | The information about a particular branch, tag or commit "{branchOrTagNameOrCommit}".
/tree/{branchOrTagOrObjectId}/{filesAndFolders}/... | The structure of the directories and files in the specified branch, tag or commit "{branchOrTagOrObjectId}".
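As a small illustration of how these paths combine with a projection, the sketch below builds the node path of a file in a given branch. It assumes the structure is rooted at the projected node ("/git/MyRepo" in the sample configuration); the helper class is hypothetical and does no escaping of branch or file names.

```java
final class GitNodePaths {
    // Builds "<projectionRoot>/tree/<branchOrTagOrObjectId>/<filePath>".
    // Assumption: the connector's structure hangs directly under the projected node.
    static String treePath(String projectionRoot, String branchOrTagOrObjectId, String filePath) {
        String root = projectionRoot.endsWith("/")
                ? projectionRoot.substring(0, projectionRoot.length() - 1)
                : projectionRoot;
        String relative = filePath.startsWith("/") ? filePath.substring(1) : filePath;
        String base = root + "/tree/" + branchOrTagOrObjectId;
        return relative.isEmpty() ? base : base + "/" + relative;
    }

    public static void main(String[] args) {
        // Path of a source folder on the "master" branch of the projected repository.
        System.out.println(treePath("/git/MyRepo", "master", "src/main/java"));
    }
}
```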
The node types used by the connector are specified here. Some of the more important node types include:
Node Type | Description
git:committed | A mixin that defines the git:objectId (SHA-1 hash), git:author, git:committer, git:committed (date), and git:title properties that appear on all "committed" nodes.
git:file | The primary node type for a node representing a file in a Git repository. Extends both nt:file and git:committed.
git:folder | The primary node type for a node representing a folder in a Git repository. Extends both nt:folder and git:committed.
git:resource | The primary node type for a node representing the "jcr:content" child of git:file nodes, where content-related information is placed. Extends both nt:resource and git:committed.
git:branch | The primary node type for a node representing a Git branch.
git:tag | The primary node type for a node representing a Git tag.
git:commit | The primary node type for a node representing a Git commit.
git:branches | The primary node type for the node that contains the list of git:branch nodes.
git:tags | The primary node type for the node that contains the list of git:tag nodes.
git:commits | The primary node type for the node that contains a list of git:commit nodes.
git:root | The primary node type for the top-level node of the repository.
7.3 CMIS connector

This connector exposes the content of a CMIS repository.
The Content Management Interoperability Services (CMIS) standard defines a domain model and Web
Services, RESTful AtomPub and browser (JSON) bindings that can be used by applications to work with one
or more Content Management repositories/systems.
The CMIS connector is designed to be layered on top of existing Content Management systems. It is
intended to use the Apache Chemistry API to access the services provided by a Content Management system
and incorporate those services into a ModeShape content repository.
The connector class name is "org.modeshape.connector.cmis.CmisConnector", and there are
several attributes that should be configured on each external source:
Attribute Name | Description
aclService | URL of the Access list service binding entry point. The ACL Services are used to discover and manage Access Control Lists.
discoveryService | URL of the Discovery service binding entry point. The Discovery service executes a CMIS query statement against the contents of the repository.
multifilingService | URL of the Multi-filing service binding entry point. The Multi-filing Services are used to file/un-file objects into/from folders.
navigationService | URL of the Navigation service binding entry point. The Navigation service gets the list of child objects contained in the specified folder.
objectService | URL of the Object service binding entry point. Creates a document object of the specified type (given by the cmis:objectTypeId property) in the (optionally) specified location.
policyService | URL of the Policy service binding entry point. Applies a specified policy to an object.
relationshipService | URL of the Relationship service binding entry point. Gets all or a subset of relationships associated with an independent object.
repositoryService | URL of the Repository service binding entry point. Returns a list of CMIS repositories available from this CMIS service endpoint.
versioningService | URL of the Versioning service binding entry point. Creates a private working copy (PWC) of the document.
readOnly | A boolean flag that specifies whether this source can create/modify/remove files and directories on the file system to reflect changes in the JCR content. By default, sources are not read-only.
cacheTtlSeconds | The maximum number of seconds that a node exposed by this connector is held in the workspace cache before being expired. By default, the connector will not set a special value, and the repository will determine how long the node is cached in the workspace cache.
isQueryable | Optional property that specifies whether or not the content exposed by this connector should be indexed by the repository. This acts as a global flag, allowing a specific connector to mark its entire content as non-queryable. By default, all content exposed by a connector is queryable.
Here is a sample configuration that projects the CMIS repository into the ModeShape repository under the
"/cmis/" node:
{
    ...
    "externalSources" : {
        "cmis" : {
            "classname" : "org.modeshape.connector.cmis.CmisConnector",
            "cacheTtlSeconds" : 5,
            "aclService" : "http://localhost:8090/services/ACLService?wsdl",
            "discoveryService" : "http://localhost:8090/services/DiscoveryService?wsdl",
            "multifilingService" : "http://localhost:8090/services/MultifilingService?wsdl",
            "navigationService" : "http://localhost:8090/services/NavigationService?wsdl",
            "objectService" : "http://localhost:8090/services/ObjectService?wsdl",
            "policyService" : "http://localhost:8090/services/PolicyService?wsdl",
            "relationshipService" : "http://localhost:8090/services/RelationshipService?wsdl",
            "repositoryService" : "http://localhost:8090/services/RepositoryService?wsdl",
            "versioningService" : "http://localhost:8090/services/VersioningService?wsdl",
            "repositoryId" : "A1",
            "projections" : [ "default:/cmis => /" ]
        }
    }
    ...
}
Here is the same configuration except that a variable is used so that the actual URLs can be set with a
system property:
{
    ...
    "externalSources" : {
        "cmis" : {
            "classname" : "org.modeshape.connector.cmis.CmisConnector",
            "cacheTtlSeconds" : 5,
            "aclService" : "${custom.cmis.services.url}/ACLService?wsdl",
            "discoveryService" : "${custom.cmis.services.url}/DiscoveryService?wsdl",
            "multifilingService" : "${custom.cmis.services.url}/MultifilingService?wsdl",
            "navigationService" : "${custom.cmis.services.url}/NavigationService?wsdl",
            "objectService" : "${custom.cmis.services.url}/ObjectService?wsdl",
            "policyService" : "${custom.cmis.services.url}/PolicyService?wsdl",
            "relationshipService" : "${custom.cmis.services.url}/RelationshipService?wsdl",
            "repositoryService" : "${custom.cmis.services.url}/RepositoryService?wsdl",
            "versioningService" : "${custom.cmis.services.url}/VersioningService?wsdl",
            "repositoryId" : "A1",
            "projections" : [ "default:/cmis => /" ]
        }
    }
    ...
}
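The effect of such a variable can be illustrated with a few lines of plain Java. This sketch is not ModeShape's own substitution code (ModeShape resolves variables itself when it reads the configuration, and its rules may be richer); it only shows the basic idea of replacing each ${name} with the value of the system property of that name. The class name is hypothetical, and leaving unknown variables untouched is an assumption of the sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VariableSubstitution {
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${name} with the system property "name".
    // Assumption of this sketch: variables without a matching property are left as-is.
    static String resolve(String value) {
        Matcher matcher = VARIABLE.matcher(value);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String replacement = System.getProperty(matcher.group(1), matcher.group(0));
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        // Normally supplied on the command line, e.g. -Dcustom.cmis.services.url=...
        System.setProperty("custom.cmis.services.url", "http://localhost:8090/services");
        System.out.println(resolve("${custom.cmis.services.url}/ACLService?wsdl"));
    }
}
```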
The repository structure is defined as follows:
Path | Description
/repository_info | The description of the CMIS repository
/filesAndFolders | The structure of the folders and files in the projected repository
The node types used by the connector are either defined by the JCR specification or imported from the CMIS
repository itself. The most important node types are as follows:
Node Type | Description
nt:folder | The primary node type for the node representing a CMIS folder
nt:file | The primary node type for the node representing a CMIS document
nt:resource | The primary node type for the node representing the binary content of the CMIS document
cmis:repository | The primary node type for the node representing information about the CMIS repository itself
7.4 JDBC Metadata Connector

This connector provides read-only access to the metadata (e.g., catalogs, schemas, table structures and
foreign keys) of a relational database. The connector yields a hierarchy of nodes that looks like this:
/ (root node)
   + <catalog name> - one node for each accessible catalog in the database.
       + <schema name> - one node for each accessible schema in the catalog.
           + tables - a single node that is the parent of all tables in the schema.
           |   + <table name> - one node for each table in the schema.
           |       + <column name> - one node for each column in the table.
           |       + foreignKeys - a single node that is the parent of all the foreign keys
           |       |               imported by the table
           |           + <foreign key name> - one node for each imported foreign key
           + procedures - a single node that is the parent of all procedures in the schema.
               + <procedure name> - one node for each procedure in the schema.
The root, table, column, foreign key and procedure nodes contain additional properties that correspond to
the metadata provided by the DatabaseMetaData class. In databases that do not support catalogs or
schemas (or that allow the empty string as a valid catalog or schema name), the value of the
defaultCatalogName and/or defaultSchemaName properties will be used instead when determining the
node name.
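The fallback to the default catalog and schema names can be sketched as follows. The helper is hypothetical and only mirrors the rule stated above: a missing or empty name reported by the database is replaced by the configured default, and the result is used as the node name in the hierarchy.

```java
final class MetadataNodeNames {
    // Returns the default when the database reports no name (null) or an empty name.
    static String nodeName(String nameFromDatabase, String defaultName) {
        return (nameFromDatabase == null || nameFromDatabase.isEmpty())
                ? defaultName
                : nameFromDatabase;
    }

    // Builds the path of a table node following the hierarchy shown above:
    // /<catalog name>/<schema name>/tables/<table name>
    static String tablePath(String catalog, String schema, String table,
                            String defaultCatalogName, String defaultSchemaName) {
        return "/" + nodeName(catalog, defaultCatalogName)
             + "/" + nodeName(schema, defaultSchemaName)
             + "/tables/" + table;
    }

    public static void main(String[] args) {
        // A database without catalogs: the configured default catalog name is used.
        System.out.println(tablePath(null, "PUBLIC", "CUSTOMER", "default", "default"));
    }
}
```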
This connector has currently been tested successfully against Oracle 10g, Oracle 11g, Microsoft
SQL Server 2008 (with the Microsoft JDBC driver), IBM DB2 v9, Sybase ASE 15, MySQL 5 (with
the InnoDB engine), PostgreSQL 8, HSQLDB and H2. As JDBC driver implementations of the
DatabaseMetaData interface tend to vary widely, other databases may or may not work with the
default MetadataCollector implementation.
As one example, the metadataCollectorClassName property must be set to
org.modeshape.connector.meta.jdbc.SqlServerMetadataConnector if the Microsoft
JDBC driver is used. This is to work around a known bug where that driver returns a list of users
from a call to DatabaseMetaData.getSchemas() instead of a list of schemas.
The org.modeshape.connector.meta.jdbc.JdbcMetadataConnector class provides a number of
JavaBean properties that control its behavior:
Property | Description
dataSourceJndiName | The JNDI name of the JDBC DataSource instance that should be used. If not specified, the other driver properties must be set.
password | The password that should be used when creating JDBC connections using the JDBC driver class. This is not required if the DataSource is found in JNDI.
url | The URL that should be used when creating JDBC connections using the JDBC driver class. This is not required if the DataSource is found in JNDI.
username | The username that should be used when creating JDBC connections using the JDBC driver class. This is not required if the DataSource is found in JNDI.
driverClassName | The name of the JDBC driver class. This is not required if the DataSource is found in JNDI, but is required otherwise.
defaultCatalogName | Optional property that defines the name to use for the catalog name if the database does not support catalogs or the database has a catalog with the empty string as a name. The default value is "default".
defaultSchemaName | Optional property that defines the name to use for the schema name if the database does not support schemas or the database has a schema with the empty string as a name. The default value is "default".
idleTimeInSecondsBeforeTestingConnections | Optional property that defines the number of seconds that a connection may remain in the pool before it is tested to ensure it is still valid. The default is 180 seconds (or 3 minutes).
maximumConnectionsInPool | Optional property that defines the maximum number of connections that may be in the connection pool. The default is "5".
maximumConnectionIdleTimeInSeconds | Optional property that defines the maximum number of seconds that a connection should remain in the pool before being closed. The default is "600" seconds (or 10 minutes).
maximumSizeOfStatementCache | Optional property that defines the maximum number of statements that should be cached. The default value is "100", but statement caching can be disabled by setting to "0".
metadataCollectorClassName | Advanced optional property that defines the name of a custom class to use for metadata collection, which is typically needed for JDBC drivers that don't properly support the standard DatabaseMetaData methods. The specified class must implement the MetadataCollector interface and must have a public no-argument constructor. If an empty string (or null) value is specified for this property, a default MetadataCollector implementation will be used that relies on the driver's DatabaseMetaData.
minimumConnectionsInPool | Optional property that defines the minimum number of connections that will be kept in the connection pool. The default is "0".
numberOfConnectionsToAcquireAsNeeded | The number of connections that should be added to the pool when there are not enough to be used. The default is "1".
The connector can either be configured from a standalone JSON configuration file:
{
    "name" : "Federated repository which uses a JDBC Metadata External Source",
    "jndiName" : "java:/testRepo",
    "workspaces" : {
        "predefined" : ["ws1", "ws2"]
    },
    "externalSources" : {
        "jdbc-meta" : {
            "classname" : "org.modeshape.connector.meta.jdbc.JdbcMetadataConnector",
            "dataSourceJndiName" : "java:/testDS",
            "maximumConnectionsInPool" : "${dataSource.maximumConnectionsInPool}",
            "minimumConnectionsInPool" : "${dataSource.minimumConnectionsInPool}",
            "maximumConnectionIdleTimeInSeconds" : "${dataSource.maximumConnectionIdleTimeInSeconds}",
            "maximumSizeOfStatementCache" : "${dataSource.maximumSizeOfStatementCache}",
            "numberOfConnectionsToAcquireAsNeeded" : "${dataSource.numberOfConnectionsToAcquireAsNeeded}",
            "retryLimit" : "${dataSource.retryLimit}"
        }
    }
}
or when deployed in EAP, via the ModeShape EAP subsystem:
<source name="jdbc-metadata"
classname="org.modeshape.connector.meta.jdbc.JdbcMetadataConnector"
module="org.modeshape.connector.jdbc.metadata"
dataSourceJndiName="java:jboss/datasources/ExampleDS">
<projection>default:/ModeShapeTestDb => /</projection>
</source>
8 Built-in text extractors

ModeShape comes with a single, ready-to-use text extractor. All you have to do is configure it and be ready
to work with the generated output. This section of the documentation describes each of ModeShape's built-in
text extractors.
8.1 Tika text extractor

This text extractor uses the Tika library to extract text from a variety of file formats. It will automatically
discover all of the Tika Parser implementations that are defined in
META-INF/services/org.apache.tika.parser.Parser text files accessible via the current
classloader and that contain the class names of the Parser implementations (one class name per line in
each file). In other words, simply ensure that the Tika libraries for the appropriate file formats are on the
classpath, and the text extractor will be able to use them all.
This text extractor can be configured in a ModeShape configuration by specifying several optional properties:
excludedMimeTypes - The comma- or whitespace-separated list of MIME types that should be
    excluded from text extraction, even if there is a Tika Parser available for that MIME type. By default,
    the MIME types for package files are excluded, though explicitly setting any excluded MIME types will
    override these defaults.
includedMimeTypes - The comma- or whitespace-separated list of MIME types that should be
    included in text extraction. This extractor will ignore any MIME types in this list that are not covered by
    Tika Parser implementations.
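How two such lists might be parsed and applied is sketched below. The source does not say how the two properties interact, so the rule used here is an assumption: an excluded type is never extracted, and if an included list is given, only the types in it are extracted. The class is hypothetical and is not the extractor's implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.stream.Collectors;

final class MimeTypeFilter {
    private final Set<String> included;
    private final Set<String> excluded;

    MimeTypeFilter(String includedMimeTypes, String excludedMimeTypes) {
        this.included = parse(includedMimeTypes);
        this.excluded = parse(excludedMimeTypes);
    }

    // Splits a comma- or whitespace-separated list, dropping empty tokens.
    static Set<String> parse(String list) {
        if (list == null) {
            return Collections.emptySet();
        }
        return Arrays.stream(list.split("[,\\s]+"))
                     .filter(token -> !token.isEmpty())
                     .collect(Collectors.toSet());
    }

    // Assumed rule: exclusions always win; an empty inclusion list means "no restriction".
    boolean shouldExtract(String mimeType) {
        if (excluded.contains(mimeType)) {
            return false;
        }
        return included.isEmpty() || included.contains(mimeType);
    }

    public static void main(String[] args) {
        MimeTypeFilter filter = new MimeTypeFilter("application/pdf text/plain", "application/zip");
        System.out.println(filter.shouldExtract("application/pdf"));
        System.out.println(filter.shouldExtract("application/zip"));
    }
}
```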
To use this extractor, simply ensure that the modeshape-extractor-tika JAR and the appropriate required
Tika JARs are on the classpath (or declared as Maven dependencies) and configure the repository in a similar fashion to:
{
    "name" : "Sample Config",
    "query" : {
        "textExtracting": {
            "extractors" : {
                "tikaExtractor":{
                    "name" : "General content-based extractor",
                    "classname" : "tika"
                }
            }
        }
    }
}
9 Extending ModeShape
This guide describes in detail the process for creating customized extensions to ModeShape. Doing so
requires writing Java code, and we'll also use Maven 3 for our build system.
9.1 Custom authentication providers

ModeShape can integrate with a custom authentication and authorization service, and it just takes a little bit
of code.
The AuthenticationProvider interface
The AuthorizationProvider interface
The AdvancedAuthorizationProvider interface
Putting it all together
Configure a repository to use your provider(s)
9.1.1 The AuthenticationProvider interface

ModeShape defines a simple interface for authenticating users. Each repository can have multiple providers,
and a client is authenticated as soon as one of the providers accepts the credentials. The interface is quite
simple:
/**
 * An interface used by a ModeShape Repository for authenticating users when they create new sessions
 * using Repository.login(javax.jcr.Credentials, String) and related methods.
 */
public interface AuthenticationProvider {

    /**
     * Authenticate the user that is using the supplied credentials. If the supplied credentials are
     * authenticated, this method should construct an ExecutionContext that reflects the authenticated
     * environment, including the context's valid SecurityContext security context that will be used for
     * authorization throughout.
     *
     * Note that each provider is handed a map into which it can place name-value pairs that will be
     * used in the Session attributes of the Session that results from this authentication attempt.
     * ModeShape will ignore any attributes if this provider does not authenticate the credentials.
     *
     * @param credentials the user's JCR credentials, which may be an AnonymousCredentials if
     *        authenticating as an anonymous user
     * @param repositoryName the name of the JCR repository; never null
     * @param workspaceName the name of the JCR workspace; never null
     * @param repositoryContext the execution context of the repository, which may be wrapped by this method
     * @param sessionAttributes the map of name-value pairs that will be placed into the Session
     *        attributes; never null
     * @return the execution context for the authenticated user, or null if this provider could not
     *         authenticate the user
     */
    ExecutionContext authenticate( Credentials credentials,
                                   String repositoryName,
                                   String workspaceName,
                                   ExecutionContext repositoryContext,
                                   Map<String, Object> sessionAttributes );
}
All the parameters are supplied by ModeShape and contain everything necessary to authenticate a client
attempting to create a new JCR Session.
Implementations are expected to return a new ExecutionContext instance for the user, and this can be
created from the repository's execution context by calling
repositoryContext.with(securityContext), where securityContext is a custom
implementation of the org.modeshape.jcr.security.SecurityContext interface that returns
information about the authenticated user:
/**
 * A security context provides a pluggable means to support disparate authentication and
 * authorization mechanisms that specify the user name and roles.
 *
 * A security context should only be associated with the execution context after authentication
 * has occurred.
 */
@NotThreadSafe
public interface SecurityContext {

    /**
     * Return whether this security context is an anonymous context.
     * @return true if this context represents an anonymous user, or false otherwise
     */
    boolean isAnonymous();

    /**
     * Returns the authenticated user's name
     * @return the authenticated user's name
     */
    String getUserName();

    /**
     * Returns whether the authenticated user has the given role.
     * @param roleName the name of the role to check
     * @return true if the user has the role and is logged in; false otherwise
     */
    boolean hasRole( String roleName );

    /**
     * Logs the user out of the authentication mechanism.
     * For some authentication mechanisms, this will be implemented as a no-op.
     */
    void logout();
}
Note that if you want to provide authorization functionality, then your SecurityContext implementation
must also implement AuthorizationProvider or AdvancedAuthorizationProvider.
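As an illustration, here is a minimal role-based security context. To keep the sketch compilable on its own, it declares a local stand-in interface that copies the four methods listed above; in a real project you would implement org.modeshape.jcr.security.SecurityContext instead. The class name and its in-memory role set are assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Local stand-in that mirrors the SecurityContext methods listed above, so the sketch
// compiles without ModeShape on the classpath. Use the real interface in practice.
interface SecurityContext {
    boolean isAnonymous();
    String getUserName();
    boolean hasRole(String roleName);
    void logout();
}

// Hypothetical implementation backed by a fixed set of role names.
final class RoleBasedSecurityContext implements SecurityContext {
    private final String userName;
    private final Set<String> roles;
    private boolean loggedIn = true;

    RoleBasedSecurityContext(String userName, Set<String> roles) {
        this.userName = userName;
        this.roles = new HashSet<>(roles); // defensive copy
    }

    @Override
    public boolean isAnonymous() {
        return false;
    }

    @Override
    public String getUserName() {
        return userName;
    }

    // True only while the user is logged in, as the Javadoc of hasRole requires.
    @Override
    public boolean hasRole(String roleName) {
        return loggedIn && roles.contains(roleName);
    }

    @Override
    public void logout() {
        loggedIn = false;
    }
}
```

An authentication provider could create one instance per successful login and pass it to repositoryContext.with(...), as described above.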
9.1.2 The AuthorizationProvider interface

ModeShape uses its org.modeshape.jcr.security.AuthorizationProvider interface to determine
whether a Session has the appropriate privileges to perform reads and writes.
/**
 * An interface that can authorize access to specific resources within repositories.
 */
public interface AuthorizationProvider {

    /**
     * Determine if the supplied execution context has permission for all of the named actions in the
     * named workspace. If not all actions are allowed, the method returns false.
     *
     * @param context the context in which the subject is performing the actions on the supplied workspace
     * @param repositoryName the name of the repository containing the workspace content
     * @param repositorySourceName <i>This is no longer used and will always be the same as the repositoryName</i>
     * @param workspaceName the name of the workspace in which the path exists
     * @param path the path on which the actions are occurring
     * @param actions the list of {@link ModeShapePermissions actions} to check
     * @return true if the subject has privilege to perform all of the named actions on the content at
     *         the supplied path in the given workspace within the repository, or false otherwise
     */
    boolean hasPermission( ExecutionContext context,
                           String repositoryName,
                           String repositorySourceName,
                           String workspaceName,
                           Path path,
                           String... actions );
}
Simply have your SecurityContext implementation also implement this interface, and return true
whenever the session is allowed to perform the requested operations.
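The "all of the named actions" rule is the part that is easiest to get wrong, so here is a small self-contained sketch of it. The mapping from role names to granted actions is purely an assumption for illustration, and the action names used are the standard JCR action strings ("read", "add_node", "set_property", "remove"); ModeShapePermissions may define additional ones.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class ActionCheck {
    // Hypothetical mapping from a role name to the actions it grants.
    static Set<String> actionsFor(String role) {
        switch (role) {
            case "readonly":
                return setOf("read");
            case "readwrite":
                return setOf("read", "add_node", "set_property", "remove");
            default:
                return Collections.emptySet();
        }
    }

    // Returns true only if every requested action is granted; one missing action fails the check.
    static boolean hasPermission(String role, String... actions) {
        return actionsFor(role).containsAll(Arrays.asList(actions));
    }

    private static Set<String> setOf(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        System.out.println(hasPermission("readonly", "read"));
        System.out.println(hasPermission("readonly", "read", "remove"));
    }
}
```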
9.1.3 The AdvancedAuthorizationProvider interface

ModeShape uses its org.modeshape.jcr.security.AdvancedAuthorizationProvider interface
to determine whether a Session has the appropriate privileges to perform reads and writes.
/**
 * An interface that can authorize access to specific resources within repositories. Unlike the more
 * basic and simple AuthorizationProvider, this interface allows an implementation to get at additional
 * information with each call to hasPermission(Context, Path, String...).
 *
 * In particular, the supplied Context instance contains the Session that is calling this provider,
 * allowing the provider implementation to access authorization-specific content within the repository
 * to determine permissions for other repository content.
 *
 * In these cases, calls to the session to access nodes will result in their own calls to
 * hasPermission(Context, Path, String...). Therefore, such implementations need to handle these
 * special authorization-specific content permissions in an explicit fashion. It is also advised that
 * such providers cache as much of the authorization-specific content as possible, as the
 * hasPermission(Context, Path, String...) method is called frequently.
 */
public interface AdvancedAuthorizationProvider {

    /**
     * Determine if the supplied execution context has permission for all of the named actions in the
     * given context. If not all actions are allowed, the method returns false.
     *
     * @param context the context in which the subject is performing the actions on the supplied workspace
     * @param absPath the absolute path on which the actions are occurring, or null if the permissions
     *        are at the workspace-level
     * @param actions the list of {@link ModeShapePermissions actions} to check
     * @return true if the subject has privilege to perform all of the named actions on the content at
     *         the supplied path in the given workspace within the repository, or false otherwise
     */
    boolean hasPermission( Context context,
                           Path absPath,
                           String... actions );
}
where Context is an interface nested in AdvancedAuthorizationProvider:
/**
 * The context in which the calling session is operating, and which contains session-related
 * information that a provider implementation may find useful.
 */
public static interface Context {

    /**
     * Get the execution context in which this session is running.
     *
     * @return the session's execution context; never null
     */
    public ExecutionContext getExecutionContext();

    /**
     * Get the session that is requesting this authorization provider to
     * {@link AdvancedAuthorizationProvider#hasPermission(Context, Path, String...) determine permissions}.
     * Provider implementations are free to use the session to access nodes <i>other</i> than those for
     * which permissions are being determined. For example, the implementation may access other
     * <i>authorization-related content</i> inside the same repository. Just be aware that such accesses
     * will generate additional calls to the
     * {@link AdvancedAuthorizationProvider#hasPermission(Context, Path, String...)} method.
     *
     * @return the session; never null
     */
    public Session getSession();

    /**
     * Get the name of the repository that is being accessed.
     *
     * @return the repository name; never null
     */
    public String getRepositoryName();

    /**
     * Get the name of the repository workspace that is being accessed.
     *
     * @return the workspace name; never null
     */
    public String getWorkspaceName();
}
Simply have your SecurityContext implementation also implement this interface, and return true
whenever the session is allowed to perform the requested operation.
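The Javadoc above gives two pieces of advice: handle the permissions on the authorization-specific content explicitly (because reading that content through the session triggers further hasPermission calls), and cache decisions because the method is called frequently. The sketch below shows one way to combine the two using only the JDK. It is not a ModeShape class: paths are plain strings, the "/system/acl" location used in the usage example is hypothetical, and the expensive lookup is passed in as a function.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiPredicate;

final class CachingPermissionChecker {
    private final String authContentPrefix;            // where the authorization content lives (assumed)
    private final BiPredicate<String, String> lookup;  // (path, action) -> allowed; the expensive part
    private final Map<String, Boolean> cache = new ConcurrentHashMap<>();

    CachingPermissionChecker(String authContentPrefix, BiPredicate<String, String> lookup) {
        this.authContentPrefix = authContentPrefix;
        this.lookup = lookup;
    }

    boolean hasPermission(String absPath, String... actions) {
        // Explicit rule for the authorization content itself: read-only, decided without
        // any lookup, so that reading it can never trigger another permission lookup.
        if (absPath != null && absPath.startsWith(authContentPrefix)) {
            for (String action : actions) {
                if (!"read".equals(action)) {
                    return false;
                }
            }
            return true;
        }
        // All named actions must be allowed; each (path, action) decision is cached.
        for (String action : actions) {
            String key = absPath + "|" + action;
            Boolean allowed = cache.get(key);
            if (allowed == null) {
                allowed = lookup.test(absPath, action);
                cache.put(key, allowed);
            }
            if (!allowed) {
                return false;
            }
        }
        return true;
    }
}
```

For example, new CachingPermissionChecker("/system/acl", (path, action) -> "read".equals(action)) allows reads everywhere and consults the lookup only once per path and action. A real provider would also need to invalidate the cache when the authorization content changes, which this sketch leaves out.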
9.1.4 Putting it all together

To have full control over the authentication & authorization process, you need to implement all the above
interfaces and then configure your repository to use the AuthenticationProvider implementation.
For example:
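import java.util.Map;
import javax.jcr.Credentials;
import org.modeshape.jcr.ExecutionContext;
import org.modeshape.jcr.security.AuthenticationProvider;
import org.modeshape.jcr.security.AuthorizationProvider;
import org.modeshape.jcr.security.SecurityContext;
import org.modeshape.jcr.value.Path;

public class SimpleTestSecurityProvider implements AuthenticationProvider, AuthorizationProvider, SecurityContext {

    // Accepts any credentials and installs this object as the security context.
    @Override
    public ExecutionContext authenticate( Credentials credentials,
                                          String repositoryName,
                                          String workspaceName,
                                          ExecutionContext repositoryContext,
                                          Map<String, Object> sessionAttributes ) {
        return repositoryContext.with(this);
    }

    // Grants every action on every path.
    @Override
    public boolean hasPermission( ExecutionContext context,
                                  String repositoryName,
                                  String repositorySourceName,
                                  String workspaceName,
                                  Path absPath,
                                  String... actions ) {
        return true;
    }

    @Override
    public boolean isAnonymous() {
        return false;
    }

    @Override
    public String getUserName() {
        return "test user";
    }

    @Override
    public boolean hasRole( String roleName ) {
        return true;
    }

    @Override
    public void logout() {
        // nothing to clean up
    }
}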
Note how the authenticate method in the above example places this as the security context. This
ensures that the hasPermission method will be called each time a repository operation is performed.
If you wanted more control, you could choose to implement AdvancedAuthorizationProvider instead
of AuthorizationProvider.
9.1.5 Configure a repository to use your provider(s)

Once you've implemented the interfaces and placed the classes on the classpath, all you have to do is then
configure your repositories to use your authentication providers. As noted in the configuration overview,
there is a nested document in the JSON configuration file in the "security" field, and this section lists the
authentication provider implementations in the order that they should be used. For example:
...
"security" : {
    "anonymous" : {
        "username" : "<anonymous>",
        "roles" : ["readonly","readwrite","admin"],
        "useOnFailedLogin" : false
    },
    "providers" : [
        {
            "name" : "My Custom Security Provider",
            "classname" : "com.example.MyAuthenticationProvider"
        },
        {
            "classname" : "JAAS",
            "policyName" : "modeshape-jcr"
        }
    ]
},
...
This configuration enables the use of anonymous logins (although it disables a failed authentication attempt
from downgrading to an anonymous session with the "useOnFailedLogin" as false), and configures two
authentication providers: the MyAuthenticationProvider implementation will be used first, and if that
does not authenticate the repository will delegate to the built-in JAAS provider. (Note that built-in providers
can be referenced with an alias in the "classname" field rather than the fully-qualified classname.)
Anonymous authentication is always performed last.
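The order of evaluation described above can be modeled in a few lines. This is a simplified sketch, not ModeShape's implementation: credentials are reduced to a single string token, each provider is a function that returns the authenticated user name or nothing, and a login without credentials is assumed to go straight to the anonymous step. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

final class ProviderChain {
    private final List<Function<String, Optional<String>>> providers;
    private final boolean anonymousEnabled;
    private final boolean useOnFailedLogin;

    ProviderChain(List<Function<String, Optional<String>>> providers,
                  boolean anonymousEnabled, boolean useOnFailedLogin) {
        this.providers = providers;
        this.anonymousEnabled = anonymousEnabled;
        this.useOnFailedLogin = useOnFailedLogin;
    }

    // Stand-in provider that accepts exactly one token and maps it to a user name.
    static Function<String, Optional<String>> acceptOnly(String token, String userName) {
        return credentials -> token.equals(credentials) ? Optional.of(userName) : Optional.empty();
    }

    // Providers are tried in configuration order; the first one that authenticates wins.
    // Anonymous access comes last, and only for logins without credentials unless
    // "useOnFailedLogin" permits downgrading a failed attempt.
    Optional<String> login(String credentials) {
        if (credentials != null) {
            for (Function<String, Optional<String>> provider : providers) {
                Optional<String> user = provider.apply(credentials);
                if (user.isPresent()) {
                    return user;
                }
            }
        }
        boolean mayUseAnonymous = credentials == null || useOnFailedLogin;
        return (anonymousEnabled && mayUseAnonymous) ? Optional.of("<anonymous>") : Optional.empty();
    }

    public static void main(String[] args) {
        ProviderChain chain = new ProviderChain(
                Arrays.asList(acceptOnly("custom-token", "alice"), acceptOnly("jaas-token", "bob")),
                true, false);
        System.out.println(chain.login("jaas-token"));
        System.out.println(chain.login("wrong-token"));
        System.out.println(chain.login(null));
    }
}
```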
Check out our custom security example for such a custom security provider.
9.2 Custom sequencers

Earlier in the introduction, we briefly described sequencers and how they work. In this section we go into
more detail about the framework and describe all the steps for developing your own custom sequencers.
The Sequencer framework

Creating a new sequencer
Create the Maven module
Create a Sequencer subclass
Create unit tests
Package your library
Deploy and configure
9.2.1 The Sequencer framework

A sequencer is actually just a plain old Java object (POJO). Creating a sequencer is pretty straightforward:
create a Java class that extends a single abstract class, called Sequencer:
package org.modeshape.jcr.api.sequencer;

import javax.jcr.NamespaceRegistry;
import javax.jcr.Node;
import javax.jcr.Property;
import javax.jcr.RepositoryException;
import org.modeshape.jcr.api.nodetype.NodeTypeManager;

public abstract class Sequencer {
    ...

    /**
     * Execute the sequencing operation on the specified property, which has recently
     * been created or changed.
     *
     * Each sequencer is expected to process the value of the property, extract information
     * from the value, and write a structured representation (in the form of a node or a
     * subgraph of nodes) using the supplied output node. Note that the output node
     * will either be:
     *
     *   1. the selected node, in which case the sequencer was configured to generate the
     *      output information directly under the selected input node; or
     *   2. a newly created node in a different location than the node being sequenced (in
     *      this case, the primary type of the new node will be 'nt:unstructured', but
     *      the sequencer can easily change that using Node.setPrimaryType(String) ).
     *
     * The implementation is expected to always clean up all resources that it acquired,
     * even in the case of exceptions.
     *
     * @param inputProperty the property that was changed and that should be used as
     *        the input; never null
     * @param outputNode the node that represents the output for the derived information;
     *        never null, and will either be a new node if the output is being placed
     *        outside of the selected node, or will not be new when the output is to be
     *        placed on the selected input node
     * @param context the context in which this sequencer is executing, and which may
     *        contain additional parameters useful when generating the output structure;
     *        never null
     * @return true if the sequencer's output should be saved, or false otherwise
     * @throws Exception if there was a problem with the sequencer that could not be handled.
     *         All exceptions will be logged automatically as errors by ModeShape.
     */
    public abstract boolean execute( Property inputProperty,
                                     Node outputNode,
                                     Context context ) throws Exception;

    /**
     * Initialize the sequencer. This is called automatically by ModeShape, and
     * should not be called by the sequencer.
     * <p>
     * By default this method does nothing, so it should be overridden by
     * implementations to do a one-time initialization of any internal components.
     * For example, sequencers can use the supplied 'registry' and
     * 'nodeTypeManager' objects to register custom namespaces and node types
     * required by the generated content.
     * </p>
     *
     * @param registry the namespace registry that can be used to register
     *        custom namespaces; never null
     * @param nodeTypeManager the node type manager that can be used to register
     *        custom node types; never null
     */
    public void initialize( NamespaceRegistry registry,
                            NodeTypeManager nodeTypeManager ) {
    }
}
The abstract class also contains fields and getters (not shown above) for the name, description, and path
expressions that are automatically set by ModeShape during repository initialization. The
initialize(...) method is run upon repository initialization and can be overridden by an implementation
to register (if required) any custom namespaces and node types required by the sequencer's generated
output.
The outputNode might belong to a different javax.jcr.Session object than the inputProperty, if the
input and output paths of the sequencer configuration specify different workspaces. Therefore, be
careful that all changes are made using the output node and its session.
The inputs to the sequencer depend on how it's configured, but often the inputProperty represents the
jcr:data BINARY property on the "jcr:content" child of an nt:file node. The outputNode,
however, will be one of two things:
1. If there is no output path in the path expression, then the sequenced output is to be placed directly
under the selected node, so the "outputNode" will be the existing node being sequenced. In this case,
the sequencer should place all content under the output node and is not allowed to change the
primary type.
2. Otherwise, the sequenced output is to be placed in a different location than the selected node. In this
case, ModeShape uses the name of the selected node and creates a new node under the output
path. This new node will have a primary type of "nt:unstructured", but sequencers are allowed to
change the primary type.
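The two cases can be summarized in a few lines of plain Java. This is only an illustration of the rule described above, not ModeShape's implementation: the OutputLocation class and its methods are hypothetical, paths are treated as simple strings, and the handling of a trailing slash on the output path is an assumption made for the sketch.

```java
// Hypothetical helper (not part of ModeShape) that mirrors the two cases described above.
class OutputLocation {

    /**
     * Returns the path of the node a sequencer would receive as its output node.
     * A null or empty outputPath stands for "no output path in the path expression".
     */
    static String resolve(String selectedNodePath, String outputPath) {
        if (outputPath == null || outputPath.isEmpty()) {
            // Case 1: the output is placed directly under the selected node.
            return selectedNodePath;
        }
        // Case 2: a new node, named after the selected node, is created under the output path.
        String name = selectedNodePath.substring(selectedNodePath.lastIndexOf('/') + 1);
        String parent = outputPath.endsWith("/") ? outputPath : outputPath + "/";
        return parent + name;
    }

    /** Only a newly created output node (case 2) may have its primary type changed. */
    static boolean mayChangePrimaryType(String outputPath) {
        return outputPath != null && !outputPath.isEmpty();
    }
}
```

For example, with a selected node at "/files/report.pdf" and an output path of "/derived", the sketch yields "/derived/report.pdf"; without an output path it yields the selected node itself.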
The final parameter to the execute(...) method is the SequencerContext object, which is an interface
containing some extra information often useful when sequencing files:
package org.modeshape.jcr.api.sequencer;

import java.util.Calendar;

/**
 * The sequencer context represents the complete context of a sequencer invocation.
 * Currently, this information includes the current time of execution.
 */
public interface SequencerContext {
    /**
     * Get the timestamp of the sequencing. This is always the timestamp of the
     * change event that is being processed.
     *
     * @return timestamp the "current" timestamp; never null
     */
    Calendar getTimestamp();
}
9.2.2 Creating a new sequencer

Create unit tests
9.3 Custom text extractors

Earlier in the Introduction to ModeShape, we briefly described text extractors and how they work. In this
section we go into more detail about the framework and describe all the steps for developing your own
custom text extractors.
The text extraction framework
Creating a new sequencer
Create unit tests
9.3.1 The text extraction framework

A text extractor is actually just a plain old Java object (POJO). Creating an extractor is pretty straightforward:
create a Java class that extends a single abstract class, called TextExtractor:
package org.modeshape.jcr.api.text;

import java.io.IOException;
import java.io.InputStream;
import javax.jcr.Node;
import javax.jcr.Property;
import javax.jcr.RepositoryException;
import org.modeshape.jcr.api.Binary;

public abstract class TextExtractor {
    ...

    /**
     * Determine if this extractor is capable of processing content with the supplied MIME type.
     *
     * @param mimeType the MIME type; never null
     * @return true if this extractor can process content with the supplied MIME type, or false
     *         otherwise.
     */
    public abstract boolean supportsMimeType( String mimeType );

    /**
     * Extract text from the given {@link Binary}, using the given output to record the results.
     *
     * @param binary the binary value that can be used in the extraction process; never
     *        <code>null</code>
     * @param output the output from the sequencing operation; never <code>null</code>
     * @param context the context for the sequencing operation; never <code>null</code>
     * @throws Exception if there is a problem during the extraction process
     */
    public abstract void extractFrom( Binary binary,
                                      TextExtractor.Output output,
                                      Context context ) throws Exception;

    /**
     * Allows subclasses to process the stream of binary value property in "safe" fashion,
     * making sure the stream is closed at the end of the operation.
     *
     * @param binary a {@link org.modeshape.jcr.api.Binary} that is expected to contain a
     *        non-null binary value.
     * @param operation a {@link org.modeshape.jcr.api.text.TextExtractor.BinaryOperation} which
     *        should work with the stream
     * @param <T> the return type of the binary operation
     * @return whatever type of result the stream operation returns
     * @throws Exception if there is an error processing the stream
     */
    protected final <T> T processStream( Binary binary,
                                         BinaryOperation<T> operation ) throws Exception {
        ...
    }

    /**
     * Interface which can be used by subclasses to process the input stream of a binary
     * property.
     *
     * @param <T> the return type of the binary operation
     */
    protected interface BinaryOperation<T> {
        T execute( InputStream stream ) throws Exception;
    }

    /**
     * Interface which provides additional information to the text extractors, during the
     * extraction operation.
     */
    public interface Context {
        String mimeTypeOf( String name,
                           Binary binaryValue ) throws RepositoryException, IOException;
    }

    /**
     * The interface passed to a TextExtractor to which the extractor should record all text
     * content.
     */
    public interface Output {
        /**
         * Record the text as being extracted. This method can be called multiple times during a
         * single extract.
         *
         * @param text the text extracted from the content.
         */
        void recordText( String text );
    }
}
The abstract class also contains fields and getters (not shown above) for the name and logger that are
automatically set by ModeShape during repository initialization.
There are two abstract methods that must be implemented: supportsMimeType(...) and
extractFrom(...). The first is fairly obvious: simply return true for all of the MIME types that the
extractor is capable of processing. The extractFrom method is the meat of the implementation, and
should process the BINARY value's contents and write the searchable text to the supplied Output object.
Note that the processStream(...) method is a utility that can be called by the extractFrom method; it
properly opens the BINARY value's stream, processes the content, and ensures that the stream is always
closed. Your implementation can therefore implement the extractFrom method as follows:
public void extractFrom( final Binary binary,
                         final TextExtractor.Output output,
                         final Context context ) throws Exception {
    processStream(binary, new BinaryOperation<Object>() {
        @Override
        public Object execute( InputStream stream ) throws Exception {
            // Custom logic to read the stream and write to 'output'
            return null;
        }
    });
}
This can make your implementation a little easier, but feel free to just implement the extractFrom method
and process the stream directly.
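To make the "custom logic" step more concrete, here is a minimal sketch of what a plain-text extractor might do with the stream. It is deliberately self-contained: the Output interface below is a local stand-in that copies the shape of TextExtractor.Output shown above, and PlainTextExtractorSketch is a hypothetical class that takes an InputStream directly rather than a Binary. The choice of UTF-8 and the decision to skip blank lines are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Local stand-in that mirrors the shape of TextExtractor.Output shown above.
interface Output {
    void recordText(String text);
}

// Hypothetical sketch; a real extractor would extend org.modeshape.jcr.api.text.TextExtractor
// and run this logic inside processStream(...) or directly in extractFrom(...).
class PlainTextExtractorSketch {

    /** Only plain text is handled in this sketch. */
    boolean supportsMimeType(String mimeType) {
        return "text/plain".equals(mimeType);
    }

    /** Reads the stream line by line (assuming UTF-8) and records each non-blank line. */
    void extractFrom(InputStream stream, Output output) throws IOException {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    // recordText may be called many times during a single extraction.
                    output.recordText(line);
                }
            }
        }
    }
}
```

Recording text in small pieces, as done here, avoids holding the whole content in memory, which is one reason the Output interface allows recordText to be called repeatedly.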
9.3.2 Creating a new sequencer

Create unit tests
9.4 Custom connectors

Earlier in the introduction, we briefly described federation and the connectors available out-of-the-box. In this
section we go into more detail about the framework and describe all the steps for developing your own
custom connectors.
The Connector framework

Documents
Read only connector
Properties, Paths, Names, and values
Standard connector properties
Writable connector
Extra properties
Pageable connectors
Creating a custom connector
Create a ReadOnlyConnector or WritableConnector subclass
Create unit tests
9.4.1 The Connector framework

A connector is actually just a plain old Java object (POJO), so creating a connector is pretty straightforward:
create a Java class that extends one of the following abstract classes:
ReadOnlyConnector - extend this class when ModeShape clients will never be able to manipulate,
create or remove any content exposed by the connector.
WritableConnector - extend this class when ModeShape clients may be able to manipulate,
create and/or remove content exposed by the connector. Note that each time this connector is
configured, it can still be made to be read-only.
A connector operates by accessing an external system and dynamically creating nodes that represent
information in that external system. The nodes must form a single tree, although how that tree is structured
and what the nodes actually look like is completely up to the connector implementation.
Documents
While a connector conceptually exposes nodes, technically it exchanges representations of nodes (and other
information, like sublists of children). These representations take the form of Java Document objects that
are semantically like JSON and BSON documents. The connector SPI does this for a number of reasons.
Firstly, ModeShape actually stores its own internal (non-federated) nodes as Documents, so
connectors are actually working with the same kind of internal Document
instances that ModeShape uses. Secondly, a Document is easily converted to and from JSON (and BSON),
making it potentially very easy to write a connector that accesses a remote system. Thirdly, constructs other
than nodes can be represented as documents; for example, a connector can be pageable, meaning it breaks
the list of child node references into multiple pages that are read with separate requests, allowing the
connector to efficiently expose large numbers of children under a single node. Finally, the node's identifier,
properties, child node references, and other ModeShape-specific information are stored in specific fields
within a Document, but additional fields can be used by the connector and hidden from ModeShape clients.
Though this makes little sense for a read-only connector, a writable connector might include such hidden
fields when reading nodes so that when the document comes back to the connector those hidden fields are
still available.
We'll see what these Documents look like in a little bit, but first let's look at the methods that your
connector implementation will need to implement.
Read only connector

The following code fragment shows the methods that a ReadOnlyConnector subclass must/should
implement.
package org.modeshape.jcr.federation.spi;

import java.io.IOException;
import java.util.Collection;
import javax.jcr.NamespaceRegistry;
import javax.jcr.RepositoryException;
import org.infinispan.schematic.document.Document;
import org.modeshape.jcr.api.nodetype.NodeTypeManager;

public abstract class ReadOnlyConnector extends Connector {
    ...

    /**
     * Initialize the connector. This is called automatically by ModeShape once for each
     * Connector instance, and should not be called by the connector. By the time this method
     * is called, ModeShape will have already set the {@code ExecutionContext}
     * ({@link #context}), {@code Logger}, connector name, repository name, and any fields
     * that match configuration properties for the connector.
     *
     * By default this method does nothing, so it should be overridden by implementations to do
     * a one-time initialization of any internal components. For example, connectors can use the
     * supplied {@code registry} and {@code nodeTypeManager} parameters to register custom
     * namespaces and node types used by the exposed nodes.
     *
     * This is also an excellent place for a connector to validate the connector-specific fields
     * set by ModeShape via reflection during instantiation.
     *
     * @param registry the namespace registry that can be used to register custom namespaces;
     *        never null
     * @param nodeTypeManager the node type manager that can be used to register custom node
     *        types; never null
     * @throws RepositoryException if operations on the {@link NamespaceRegistry} or
     *         {@link NodeTypeManager} fail
     * @throws IOException if any stream based operations fail (like importing cnd files)
     */
    public void initialize( NamespaceRegistry registry,
                            NodeTypeManager nodeTypeManager ) throws RepositoryException,
                                                                     IOException {
    }

    /**
     * Returns the id of an external node located at the given external path within the
     * connector's exposed tree of content.
     *
     * @param externalPath a non-null string representing an external path, or "/" for the
     *        top-level node exposed by the connector
     * @return either the id of the document or null
     */
    public abstract String getDocumentId( String externalPath );

    /**
     * Returns a Document instance representing the document with a given id. The document
     * should have a "proper" structure for it to be usable by ModeShape.
     *
     * @param id a {@code non-null} string
     * @return either a {@link Document} instance or {@code null}
     */
    public abstract Document getDocumentById( String id );

    /**
     * Return the path(s) of the external node with the given identifier. The resulting paths
     * are from the point of view of the connector. For example, the "root" node exposed by
     * the connector will have a path of "/".
     *
     * @param id a non-null string
     * @return the connector-specific path(s) of the node, or an empty document if there is no
     *         such document; never null
     */
    public abstract Collection<String> getDocumentPathsById( String id );

    /**
     * Checks if a document with the given id exists in the end-source.
     *
     * @param id a non-null string.
     * @return {@code true} if such a document exists, {@code false} otherwise.
     */
    public abstract boolean hasDocument( String id );

    ...
}
Not shown are fields, getters, and other implemented methods that your methods will almost certainly use.
For example, a Document is a read-only representation of a JSON document; one can be created by
calling the newDocument(id) method with the document's identifier, using the resulting DocumentWriter
to set/remove/add fields (and nested documents), and calling the writer's document() method to obtain the
read-only Document instance.
The DocumentWriter interface provides dozens of methods for getting and setting node properties and
child node references. Here's some code that uses a document writer to construct a node representation
with a few properties:
String id = ...
DocumentWriter writer = newDocument(id);
writer.setPrimaryType("lib:book");
writer.addMixinType("lib:tagged");
writer.addProperty("lib:isbn", "0486280616");
writer.addProperty("lib:format", "paperback");
writer.addProperty("lib:author", "Mark Twain");
writer.addProperty("lib:title", "The Adventures of Huckleberry Finn");
writer.addProperty("lib:tags", "fiction", "classic", "americana");
// Add a single child named 'tableOfContents' with its own identifier
writer.addChild(id + "/toc","tableOfContents");
Document doc = writer.document();
As you can see, creating documents is pretty straightforward.

Identifiers of documents are simple strings that are expected to uniquely and durably identify a document.
However, the content of that string is entirely up to the connector implementations. If the external system
already has the notion of unique identifiers, it might be easiest to simply reuse a string representation of
those identifiers. For example, a database might have a unique key within a given table, whereas a Git
repository uses SHA-1 hashes for identifiers of commits, branches, tags, etc. Some external systems (like
file systems) don't have a concept of unique identifiers, and in such cases the connector should devise its
own identifier mechanism that is durable and reliable.
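As an illustration of devising identifiers for a system that has none, the following sketch derives an identifier from a file's location relative to the directory a connector exposes. The FileIds class and its method names are hypothetical and are not taken from ModeShape's file system connector; the "/"-separated format is simply one convenient choice.

```java
import java.nio.file.Path;

// Hypothetical helper (not ModeShape code): uses the path relative to the exposed
// root directory as the document identifier, always with '/' as the separator.
class FileIds {

    /** Returns "/" for the root itself, or e.g. "/docs/a.txt" for a file below it. */
    static String idFor(Path root, Path file) {
        Path base = root.toAbsolutePath().normalize();
        Path relative = base.relativize(file.toAbsolutePath().normalize());
        if (relative.toString().isEmpty()) {
            return "/";
        }
        StringBuilder id = new StringBuilder();
        for (Path segment : relative) {
            id.append('/').append(segment.toString());
        }
        return id.toString();
    }

    /** Inverse of idFor: resolves an identifier back to a location under the root. */
    static Path fileFor(Path root, String id) {
        Path base = root.toAbsolutePath().normalize();
        return base.resolve(id.substring(1)).normalize();
    }
}
```

Identifiers built this way remain stable across restarts, but they change when a file is moved or renamed, so whether they are "durable" enough depends on how the external content is expected to evolve.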
Properties, Paths, Names, and values

Most of the time, you can use string property names and property values that are String, Calendar, URL, or
Numeric instances, and ModeShape will convert to an internal object representation. However, ModeShape
provides object definitions of JCR names, paths, values, and properties. These classes are often much
easier to work with than the String names and paths, and they're easy to create using ModeShape's
namespace-aware factories. The "ValueFactories" interface is a container for type-specific factories
accessible with various getter methods. Here's an example of creating a Path value from a string and then
using the Path methods to get at the already-parsed segments of the path:
String str = "/a/b/c/cust:d";
PathFactory pathFactory = factories().getPathFactory();
Path path = pathFactory.create(str);
for ( Segment segment : path ) {
    Name name = segment.getName();
    String localName = name.getLocalName();
    String namespaceUri = name.getNamespaceUri();
    if ( segment.hasIndex() ) {
        int snsIndex = segment.getIndex();
    }
}
Path parentPath = path.getParent();
...
The process of using a factory to create Name, Binary, DateTime, and all other JCR-compliant values is
similar.
Properties are slightly different, since they are a bit more structured. ModeShape provides a
PropertyFactory that can create single- or multi-valued Property instances given a name and one or
more values. Here's some simple code that shows how to create a single-valued property:
PropertyFactory propFactory = propertyFactory();
Name propName = nameFactory().create("lib:title");
String propValue = factories().getStringFactory().create("The Adventures of Huckleberry Finn");
Property prop = propFactory.create(propName,propValue);
All Property, Name, Path, DateTime, and Binary instances are immutable, meaning you can pass them
around without worrying about whether the receiver might modify them. Also, the factories will often pick
implementation classes that are tailored for the specific value. For example, there are separate
implementations for the root path, single-segment paths, paths created from a parent path, single-valued
properties, empty properties, and multi-valued properties.
Standard connector properties
Writable connector
The following code fragment shows the methods that a WritableConnector subclass must/should
implement.
import java.io.IOException;
import java.util.Collection;
import javax.jcr.NamespaceRegistry;
import javax.jcr.RepositoryException;
import org.infinispan.schematic.document.Document;
import org.modeshape.jcr.api.nodetype.NodeTypeManager;

public abstract class WritableConnector extends Connector {
    ...

    /**
     * Initialize the connector. This is called automatically by ModeShape once for each
     * Connector instance, and should not be called by the connector. By the time this method
     * is called, ModeShape will have already set the {@code ExecutionContext}
     * ({@link #context}), {@code Logger}, connector name, repository name, and any fields
     * that match configuration properties for the connector.
     *
     * By default this method does nothing, so it should be overridden by implementations to do
     * a one-time initialization of any internal components. For example, connectors can use the
     * supplied {@code registry} and {@code nodeTypeManager} parameters to register custom
     * namespaces and node types used by the exposed nodes.
     *
     * This is also an excellent place for a connector to validate the connector-specific fields
     * set by ModeShape via reflection during instantiation.
     *
     * @param registry the namespace registry that can be used to register custom namespaces;
     *        never null
     * @param nodeTypeManager the node type manager that can be used to register custom node
     *        types; never null
     * @throws RepositoryException if operations on the {@link NamespaceRegistry} or
     *         {@link NodeTypeManager} fail
     * @throws IOException if any stream based operations fail (like importing cnd files)
     */
    public void initialize( NamespaceRegistry registry,
                            NodeTypeManager nodeTypeManager ) throws RepositoryException,
                                                                     IOException {
    }

    /**
     * Returns the id of an external node located at the given external path within the
     * connector's exposed tree of content.
     *
     * @param externalPath a non-null string representing an external path, or "/" for the
     *        top-level node exposed by the connector
     * @return either the id of the document or null
     */
    public abstract String getDocumentId( String externalPath );

    /**
     * Returns a Document instance representing the document with a given id. The document
     * should have a "proper" structure for it to be usable by ModeShape.
     *
     * @param id a {@code non-null} string
     * @return either a {@link Document} instance or {@code null}
     */
    public abstract Document getDocumentById( String id );

    /**
     * Return the path(s) of the external node with the given identifier. The resulting paths
     * are from the point of view of the connector. For example, the "root" node exposed by
     * the connector will have a path of "/".
     *
     * @param id a non-null string
     * @return the connector-specific path(s) of the node, or an empty document if there is no
     *         such document; never null
     */
    public abstract Collection<String> getDocumentPathsById( String id );

    /**
     * Checks if a document with the given id exists in the end-source.
     *
     * @return {@code true} if such a document exists, {@code false} otherwise.
     */
    public abstract boolean hasDocument( String id );

    /**
     * Removes the document with the given id.
     *
     * @return {@code true} if the document was removed, or {@code false} if there was no
     *         document with the given id
     */
    public abstract boolean removeDocument( String id );

    /**
     * Stores the given document.
     *
     * @param document a non-null Document instance.
     * @throws DocumentAlreadyExistsException if there is already a new document with the same
     *         identifier
     * @throws DocumentNotFoundException if one of the modified documents was removed by another
     *         session
     */
    public abstract void storeDocument( Document document );

    /**
     * Updates a document using the provided changes.
     *
     * @param documentChanges a non-null DocumentChanges object which contains
     *        granular information about all the changes.
     */
    public abstract void updateDocument( DocumentChanges documentChanges );

    /**
     * Generates an identifier which will be assigned when a new document (aka. child) is
     * created under an existing document (aka. parent). This method should be implemented
     * only by connectors which support writing.
     *
     * @param parentId a non-null string which represents the identifier of the parent under
     *        which the new document will be created.
     * @param newDocumentName a non-null Name which represents the name that will be given
     *        to the child document
     * @param newDocumentPrimaryType a non-null Name which represents the child document's
     *        primary type.
     * @return either a non-null string which will be assigned as the new identifier, or null
     *         which means that no "special" id format is required. In this last case, the
     *         repository will auto-generate a random id.
     * @throws org.modeshape.jcr.cache.DocumentStoreException if the connector is readonly.
     */
    public abstract String newDocumentId( String parentId,
                                          Name newDocumentName,
                                          Name newDocumentPrimaryType );

    ...
}
A WritableConnector has to implement all of the read-related methods that a ReadOnlyConnector
must implement, plus a handful of write-related methods for removing, updating, and storing new documents
(nodes).
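The return-value conventions of the write-related methods are easy to get wrong, so here is a small in-memory sketch of them. It is not a real connector: InMemoryStoreSketch is a hypothetical class, a "document" is simplified to a map of fields with an "id" entry, names are plain strings, and an IllegalStateException stands in for the DocumentAlreadyExistsException mentioned in the Javadoc above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in; it only mimics the method contracts described above.
class InMemoryStoreSketch {
    private final Map<String, Map<String, Object>> documents = new HashMap<>();

    boolean hasDocument(String id) {
        return documents.containsKey(id);
    }

    /** Returns the stored document, or null if there is no document with the given id. */
    Map<String, Object> getDocumentById(String id) {
        return documents.get(id);
    }

    /** Stores a new document; fails if a document with the same identifier already exists. */
    void storeDocument(Map<String, Object> document) {
        String id = (String) document.get("id");
        if (documents.containsKey(id)) {
            throw new IllegalStateException("Document already exists: " + id);
        }
        documents.put(id, new HashMap<>(document)); // defensive copy
    }

    /** Returns true if the document was removed, or false if there was no such document. */
    boolean removeDocument(String id) {
        return documents.remove(id) != null;
    }

    /** Returning null signals that no special id format is needed for new children. */
    String newDocumentId(String parentId, String newDocumentName, String newDocumentPrimaryType) {
        return null;
    }
}
```

A connector backed by a system with its own key scheme would instead return a meaningful value from newDocumentId(...), for example one derived from the parent identifier and the new name.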
Just as ModeShape provides a DocumentWriter, there is also a DocumentReader that has methods to
easily read properties, primary type, mixin types, and child references. Using it is just as simple as using the
writer:
Document doc = ...
DocumentReader reader = readDocument(doc);
String id = reader.getDocumentId();
String primaryType = reader.getPrimaryTypeName();
Map<Name, Property> properties = reader.getProperties();
// Get the ordered list of child references ...
LinkedHashMap<String,Name> childReferences = reader.getChildrenMap();
for ( Map.Entry<String,Name> childRef : childReferences.entrySet() ) {
    String key = childRef.getKey();
    Name name = childRef.getValue();
}
Extra properties
ModeShape provides a framework for storing "extra properties" that cannot be stored in the external system.
For example, the file system connector can't naturally map arbitrary properties to file attributes, and instead
uses a variety of techniques to store these extra properties.
By default, ModeShape can store the extra properties inside the same Infinispan cache where the
repository's own internal (non-federated) content is stored. However, this may not be ideal, and so a
connector can provide its own implementation of the ExtraPropertiesStore interface:
/**
 * Store for extra properties, which a {@link Connector} implementation can use to store and
 * retrieve "extra" properties on a node that cannot be persisted in the external system.
 * Generally, a connector should store as much as possible in the external system. However,
 * not all systems are capable of persisting any and all properties that a JCR client may put
 * on a node. In such cases, the connector can store these "extra" properties (that it does
 * not persist) in this properties store.
 */
public interface ExtraPropertiesStore {

    static final Map<Name, Property> NO_PROPERTIES = Collections.emptyMap();

    /**
     * Store the supplied extra properties for the node with the supplied ID. This will
     * overwrite any properties that were previously stored for the node with the specified ID.
     *
     * @param id the identifier for the node; may not be null
     * @param properties the extra properties for the node that should be stored in this storage
     *        area, keyed by their name
     */
    void storeProperties( String id,
                          Map<Name, Property> properties );

    /**
     * Update the supplied extra properties for the node with the supplied ID.
     *
     * @param properties the extra properties for the node that should be stored in this storage
     *        area, keyed by their name; any entry that contains a null Property will define a
     *        property that should be removed
     */
    void updateProperties( String id,
                           Map<Name, Property> properties );

    /**
     * Retrieve the extra properties that were stored for the node with the supplied ID.
     *
     * @return the map of properties keyed by their name; may be empty, but never null
     */
    Map<Name, Property> getProperties( String id );

    /**
     * Remove all of the extra properties that were stored for the node with the supplied ID.
     *
     * @return true if there were properties stored for the node and now removed, or false if
     *         there were no extra properties stored for the node with the supplied key
     */
    boolean removeProperties( String id );
}
Then, to use the extra properties store, simply call the
"setExtraPropertiesStore(ExtraPropertiesStore store)" method in the connector's initialize(...) method
with an instance of your custom store. Then, in your "storeDocument(Document)" and
"updateDocument(DocumentChanges)" methods, record these extra properties. There are multiple ways of
doing this, but here are a few:
ExtraProperties extraProperties = extraPropertiesFor(id, false);

// Add a single property ...
Property p1 = ...
extraProperties.add(p1);
// Or add multiple properties at once ...
Map<Name,Property> properties = ...
extraProperties.addAll(properties).except("jcr:primaryType", "jcr:created");
extraProperties.save();
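The difference between storing and updating extra properties is the part of the ExtraPropertiesStore contract that most deserves an example. The sketch below is a hypothetical in-memory version that follows the Javadoc above; to stay self-contained it uses String names and Object values in place of ModeShape's Name and Property types, so it does not actually implement the real interface.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory sketch of the ExtraPropertiesStore contract, using
// String/Object as stand-ins for ModeShape's Name/Property types.
class InMemoryExtraPropertiesSketch {
    private final Map<String, Map<String, Object>> byNodeId = new HashMap<>();

    /** Overwrites any properties previously stored for the node. */
    void storeProperties(String id, Map<String, Object> properties) {
        byNodeId.put(id, new HashMap<>(properties));
    }

    /** Merges the supplied properties; an entry with a null value removes that property. */
    void updateProperties(String id, Map<String, Object> properties) {
        Map<String, Object> existing = byNodeId.computeIfAbsent(id, key -> new HashMap<>());
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            if (entry.getValue() == null) {
                existing.remove(entry.getKey());
            } else {
                existing.put(entry.getKey(), entry.getValue());
            }
        }
    }

    /** Returns the stored properties; may be empty, but never null. */
    Map<String, Object> getProperties(String id) {
        Map<String, Object> properties = byNodeId.get(id);
        return properties == null
               ? Collections.<String, Object>emptyMap()
               : Collections.unmodifiableMap(properties);
    }

    /** Returns true only if there were properties stored for the node. */
    boolean removeProperties(String id) {
        Map<String, Object> removed = byNodeId.remove(id);
        return removed != null && !removed.isEmpty();
    }
}
```

A production store would also need to consider persistence and concurrent access, both of which this sketch ignores.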
Pageable connectors
A Document that represents a node will contain references to all the children of that node. These references
are relatively small (just the ID and name of the child), and for many connectors this is sufficient and fast
enough. However, when the number of children under a node starts to increase, building the list of child
references for a parent node can become noticeable and even burdensome, especially when few (if any) of
the child references may ultimately be resolved into their node representations.
A pageable connector is one that wants to expose the children of nodes in a "page by page" fashion, where
the parent node only contains the first page of child references and subsequent pages are loaded only if
needed. This turns out to be quite effective, since when clients navigate a specific path (or ask for a specific
child of a parent by its name) ModeShape doesn't need to use the child references in a node's document
and can instead simply have the connector resolve such (relative or absolute external) paths into an
identifier and then ask for the document with that ID.
Therefore, the only time the child references are needed is when clients iterate over the children of a node.
A pageable connector will only be asked for as many pages as needed to handle the client's iteration,
making it very efficient for exposing a node structure that can contain nodes with numerous children.
To make your ReadOnlyConnector or WritableConnector support paging, simply implement the
Pageable interface:
public interface Pageable {
    /**
     * Return a document which represents a page of children. The document for the parent node
     * should include as many children as desired, and then include a reference to the next
     * page of children with the {@link PageWriter#addPage(String, String, long, long)} method.
     * Each page returned by this method should also include a reference to the next page.
     *
     * @param pageKey a non-null {@link PageKey} instance, which offers information about the
     *        page that should be retrieved.
     * @return either a non-null page document or {@code null} indicating that such a page
     *         doesn't exist
     */
    Document getChildren( PageKey pageKey );
}
ModeShape then knows that the document for the parent will contain only some of the children and how to
access each page of children as needed.
For example, here's some code that might be used in a connector's "getDocumentById(...)"
method to include some of the children in the parent node's document and to include a reference to a
second page of children. This uses an imaginary "Book" class that is presumed to represent information
about a book in a library:
String id = "category/Americana";
DocumentWriter writer = newDocument(id);
writer.setPrimaryType("lib:category");
writer.addProperty("lib:description", "Classic American literature");
// Get the books in this category ...
Collection<Book> books = getBooksInCategory("Americana");
// Put just the first 20 in this document ...
int count = 0;
for ( Book book : books ) {
    writer.addChild(book.getId(), book.getTitle());
    if ( ++count == 20 ) break;
}
if ( books.size() > count ) {
    // There are more than 20 books, so add a reference to the next page,
    // which starts at offset 20 (that is, with the 21st book) ...
    writer.addPage(id, 20, 20, books.size());
}
Then, the connector's "getChildren(...)" method (the method declared by the Pageable interface) would
implement getting the child references for a particular page:
public Document getChildren( PageKey pageKey ) {
    String parentId = pageKey.getParentId();
    int offset = pageKey.getOffsetInt();
    String category = parentId.substring(9); // we assume this is "category/{categoryName}"
    DocumentWriter writer = newDocument(parentId);
    // Get the next 20 books in this category (or fewer, if this is the last page) ...
    List<Book> allBooks = getBooksInCategory(category);
    int end = Math.min(offset + 20, allBooks.size()); // no error checking here!
    for ( Book book : allBooks.subList(offset, end) ) {
        writer.addChild(book.getId(), book.getTitle());
    }
    if ( end < allBooks.size() ) {
        // There are more books after this page, so add a reference to the next page,
        // which starts where this page ended ...
        writer.addPage(parentId, end, 20, allBooks.size());
    }
    return writer.document();
}
As you can see, the logic of getChildren(...) is very similar to the logic that adds children in the
getDocumentById(...) method, and your connector might find it useful to abstract this into a single
helper method.
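One possible shape for such a helper is sketched below. It is deliberately independent of the ModeShape API so that it can be run on its own: the ChildPagingSketch class, its Block type, and the blockOf(...) method are hypothetical names introduced only for this illustration. The helper computes which children belong in a block and where the next block starts, if there is one.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a shared paging helper; the names here are hypothetical, not part of ModeShape. */
final class ChildPagingSketch {

    /** One block of children plus the offset of the following block, or -1 if this is the last block. */
    static final class Block<T> {
        final List<T> items;
        final int nextOffset;

        Block(List<T> items, int nextOffset) {
            this.items = items;
            this.nextOffset = nextOffset;
        }

        boolean hasNextPage() {
            return nextOffset >= 0;
        }
    }

    /** Returns the children in [offset, offset + blockSize), clipped to the size of the list. */
    static <T> Block<T> blockOf(List<T> all, int offset, int blockSize) {
        if (offset < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and blockSize must be > 0");
        }
        int from = Math.min(offset, all.size());
        int to = Math.min(from + blockSize, all.size());
        // Only report a next page when children really remain after this block.
        int next = to < all.size() ? to : -1;
        return new Block<T>(new ArrayList<T>(all.subList(from, to)), next);
    }

    public static void main(String[] args) {
        List<String> titles = new ArrayList<>();
        for (int i = 1; i <= 45; i++) {
            titles.add("Book " + i);
        }
        // Walk all blocks of 20, the way successive page requests would.
        int offset = 0;
        while (offset >= 0) {
            Block<String> block = blockOf(titles, offset, 20);
            System.out.println("offset " + offset + ": " + block.items.size() + " children");
            offset = block.nextOffset;
        }
    }
}
```

Under this design, both getDocumentById(...) and the paging method (getChildren(...) in the Pageable interface shown above) would call the helper, add each returned item with writer.addChild(...), and call writer.addPage(...) only when hasNextPage() is true. A category with exactly 20 books then gets no reference to an empty second page.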
9.4.2 Creating a custom connector

Create a ReadOnlyConnector or WritableConnector subclass
Create unit tests
10 Tools for Eclipse

ModeShape also has several useful Eclipse plugins:
A file editor for JSR-283 Compact Node Definition (CND) files
A publishing tool for use with ModeShape servers
These do not come with the normal ModeShape distribution, but are installed as Eclipse features using our
Eclipse update site.
Installation
Compact Node Definition (CND) editor
Header Section
Namespaces Section
Node Types Section
CND Preference Page
ModeShape publishing tool
Configuration
Publishing
Want to help?
10.1 Installation
Each of these features can be installed separately or together by following these steps:
1. Start up Eclipse, then do: Help > Install New Software...
2. Copy the following site's URL
"http://download.jboss.org/jbosstools/updates/stable/kepler/integration-stack/modeshape/" into Eclipse, and
hit Enter.
3. When the site loads, select the features to install, or click the Select All button.
4. Click Next, agree to the license terms, and install.
10.2 Compact Node Definition (CND) editor

The ModeShape Tools Java Content Repository (JCR) Compact Node Type Definition (CND) Editor is a
2-page editor for *.cnd files. The first page is a form-based view of the CND file and the second page is a
read-only source view.
The CND editor can be used to edit CND files for any JCR 2.0 implementation and is not limited to
ModeShape users. It can even be installed separately from the ModeShape-specific features.
Here is what the CND Editor looks like:
The CND Editor's form page consists of the following sections:

a header section, which displays error messages and a link to open the CND preference page,
a namespaces section, which displays and allows editing of the namespace mappings defined in the
CND, and
a node types section, which displays and allows editing of the node type definitions defined in the
CND.
10.2.1 Header Section

The header section contains a hyperlink that, when activated, opens the CND notation preference page.
Also, if the CND being edited has validation errors, the header section will have another hyperlink that
identifies the total number of validation errors found. Clicking the errors hyperlink opens a dialog that lists the
specific validation errors and provides a way to export those validation messages to a file.
Here is what the header section will look like when the CND has one validation error:
10.2.2 Namespaces Section

The namespaces section is a collapsible area used to create and maintain the namespace mappings
declared in the CND file. A namespace mapping consists of a unique prefix, a unique URI, and an optional
comment. Namespace mappings can be copied and pasted within the same CND editor or between different
CND editors. The namespaces section looks like this:
Namespace mappings are edited using the Namespace Editor shown here:
10.2.3 Node Types Section

The node types section is used to create and maintain the node type definitions declared in the CND file.
On its left side, the node types section contains a table of all the declared node type definitions and a
node type name filter box, which lets the user limit which node type definitions are displayed. The node
type table can be used to delete a selected node type. Also, node type definitions
can be copied and pasted within the same CND editor or between different CND editors.
The node type table, with the name filter on top, looks like this:
The right side of the node types section consists of a details area as well as collapsible areas for property
and child node definitions. When a node type definition is selected, its corresponding information is used to
populate the details, properties, and child nodes areas. The details area is used to edit a node type's
namespace, name, supertypes, attributes, and an optional comment.
The node type details area looks like this:
A node type definition can contain zero or more property definitions. When the properties area is expanded,
the following table will show the declared property definitions for the selected node type definition:
The properties table can be used to delete a selected property definition and can optionally show inherited
properties. Property definitions can be copied and pasted within the same CND editor or between different
CND editors. A property definition can be created or edited using the Property Definition Editor shown
here:
A node type definition can contain zero or more child node definitions. When the child nodes area is
expanded, the following table will show the declared child node definitions for the selected node type
definition:
The child nodes table can be used to delete a selected child node definition and can optionally show
inherited child nodes. Child node definitions can be copied and pasted within the same CND editor or
between different CND editors. A child node definition can be created or edited using the Child Node
Definition Editor shown here:
10.2.4 CND Preference Page

The CND Preference Page allows you to save CND files using the various CND notations available. The
notation type determines the size and readability of the output.
Here is what the CND Preference Page looks like:
10.3 ModeShape publishing tool

The publishing feature provides a way to upload files from your Eclipse workspace to a ModeShape
repository. The Publishing Dialog uses the files and directories selected in the workspace as the resources
which will be used in the publishing operation.
10.3.1 Configuration
Before publishing files into a ModeShape repository, you need to make sure there is an appropriate EAP
server running with the ModeShape kit installed.
You also need to make sure that the ModeShape server has at least one user with either the connect and
readwrite roles or the connect and admin roles. See Configuring ModeShape in EAP.
After you know the server location, add it to the Eclipse ModeShape view:
1. Window > Show View > Other > ModeShape
2. Right-click > New Server
3. Make sure the user you configure has the connect role and either the readwrite or the admin role
4. Click "Test" to make sure the connection is OK, and then click Finish
10.3.2 Publishing
Once you have the server correctly configured, you can publish artifacts into the ModeShape repository in
either of two ways:
1. In the ModeShape server view, right-click the server and select "New Publish Area", or
2. From the project view, select the files or folders, then right-click and select ModeShape -> Publish.
Here is what the Publishing Dialog looks like:
When a project or folder is selected, all of the files it contains are published. You can use the Ignored
Resources Preference Page to identify resources that should always be excluded from publishing
operations.
Here is what the Ignored Resources Preference Page looks like:
The publishing dialog can also be used to delete, or unpublish, files from ModeShape repository
workspaces. Just select the ModeShape -> Unpublish context menu item when opening the
publishing dialog.
10.4 Want to help?

Let us know what you think of these plugins. If you have questions, suggestions, or think you've found a bug,
contact us on our discussion forum or on IRC. We also welcome anyone who wants to help by contributing
code.
The ModeShape Eclipse tools have their own JIRA and their own Git repository. Have a look at the code, and
even fork it on GitHub.