3 - Relationships in Data

mongoDB

Lecturer: Từ Thị Xuân Hiền


Contents
▪ Introduction to relationships in MongoDB
▪ Types of MongoDB Relationships
▪ One-to-many Relationship
▪ Many-to-many Relationship
▪ One-to-one Relationship
▪ One-to-Zillions Relationship
Introduction to relationships in MongoDB
▪ Relationships in MongoDB specify how one or more documents
are related to each other. In MongoDB, relationships can be
modelled either by embedding documents or by referencing them.
▪ These relationships can take the following forms:
▪ One to One
▪ One to Many
▪ Many to Many
Introduction to relationships in MongoDB
▪ Embedded Document Model: related documents are embedded
inside one document.
▪ Example:
▪ There are two documents: a student document and an address
document.
▪ Instead of creating two separate documents, we embed the address
document inside the student document.
▪ This lets the user retrieve the data with a single query rather
than several queries.
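The embedded model above can be sketched as follows. This is a minimal illustration, not a definitive schema: the field names and values are assumptions, and a plain in-memory array stands in for a MongoDB collection so the example runs without a server.

```javascript
// Hypothetical student document with the address embedded as a subdocument.
const embeddedStudents = [
  {
    _id: 1,
    name: "Sample Student",
    address: { street: "12 Example Road", city: "Sampletown" },
  },
];

// One lookup returns the student together with the embedded address.
const foundStudent = embeddedStudents.find((doc) => doc._id === 1);

console.log(foundStudent.name, "lives in", foundStudent.address.city);
```

Because the address travels with the student document, no second lookup is needed to display it.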
Introduction to relationships in MongoDB
▪ Reference Model: we maintain the documents separately, but
one document contains a reference to the other.
▪ Example:
▪ There are two documents: a student document (which contains the
basic information of the student) and an address document (which
contains the address of the student).
▪ The student document contains a reference to the address
document via an id field.
▪ Using this reference id, we can query the address collection and get
the address of the student.
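The reference model can be sketched in the same way. The field name `address_id` and the sample values are assumptions for illustration, and in-memory arrays stand in for the two collections.

```javascript
// Two separate "collections": the student only stores the id of its address.
const addressDocs = [{ _id: 100, street: "12 Example Road", city: "Sampletown" }];
const studentDocs = [{ _id: 1, name: "Sample Student", address_id: 100 }];

// First lookup: the student. Second lookup: follow the reference to the address.
function addressOfStudent(studentId) {
  const student = studentDocs.find((doc) => doc._id === studentId);
  if (!student) return null;
  return addressDocs.find((doc) => doc._id === student.address_id) || null;
}

console.log(addressOfStudent(1));
```

Compared with embedding, reading the address costs a second lookup, but the address can be stored and updated independently of the student.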
Types of MongoDB Relationships
▪ Model One-to-One Relationships
▪ With Embedded Documents
▪ With Document References
▪ Model One-to-Many Relationships
▪ With Embedded Documents
▪ With Document References
▪ Model Many-to-Many Relationships
Model One-to-One Relationships
One-to-One: Embedded
▪ A data model that uses embedded documents to describe a
one-to-one relationship between connected data.
▪ Embedding connected data in a single document can reduce
the number of read operations required to obtain data.
▪ In general, you should structure your schema so your application
receives all of its required information in a single read operation.
▪ In a tabular database, a one-to-one relationship is usually represented
within a single table. The same applies to MongoDB, where it is usually
kept within a single document.
One-to-One: Embedded
▪ When we group together information that belongs to two
different entities, we refer to this action as embedding.
▪ We can also use the document model's ability to add
sub-documents to create logical groups of information.
This capability also allows us to embed one document or
entity inside another.
One-to-One: Embedded
▪ Example: a person's name, date of birth, and email address
would be kept together in the same document
One-to-One: Embedded
▪ One-to-One: embed, fields at the same level
▪ Very similar to tabular databases
▪ Use case: a user has only one street address, city, and zip code
for billing, and only one street address, city, and zip code used
as the default shipping address.
One-to-One: Embedded
▪ One-to-One: embed, using subdocuments
▪ Preferred representation:
▪ Preserves simplicity
▪ Documents are clearer
This address information may profit from a little more organization.
Using the document model, regroup each set of address information
into a subgroup.
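The regrouping described above can be sketched like this. The field names (`billing_street`, `shipping_address`, and so on) are hypothetical, chosen only to illustrate moving from same-level fields to subdocuments.

```javascript
// Flat representation: all fields at the same level, as in a table row.
const flatUser = {
  _id: 1,
  name: "Sample User",
  billing_street: "1 Billing Street",
  billing_city: "Faketon",
  billing_zip: "12345",
  shipping_street: "2 Shipping Avenue",
  shipping_city: "Faketon",
  shipping_zip: "12346",
};

// Regroup each set of address fields into its own subdocument.
function groupAddresses(user) {
  return {
    _id: user._id,
    name: user.name,
    billing_address: {
      street: user.billing_street,
      city: user.billing_city,
      zip: user.billing_zip,
    },
    shipping_address: {
      street: user.shipping_street,
      city: user.shipping_city,
      zip: user.shipping_zip,
    },
  };
}

const groupedUser = groupAddresses(flatUser);
console.log(groupedUser.shipping_address);
```

The relationship is still one-to-one; only the organization of the fields inside the document changes.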
One-to-One: Embedded
▪ Example: consider two documents, patron and address. In
this one-to-one relationship between patron and address
data, the address belongs to the patron.
// patron document
{
   _id: "joe",
   name: "Joe Bookreader"
}
// address document
{
   patron_id: "joe", // reference to patron document
   street: "123 Fake Street",
   city: "Faketon",
   state: "MA",
   zip: "12345"
}
One-to-One: Embedded
▪ Example (cont.): The better data model would be to embed
the address data in the patron data
{
   _id: "joe",
   name: "Joe Bookreader",
   address: {
      street: "123 Fake Street",
      city: "Faketon",
      state: "MA",
      zip: "12345"
   }
}
One-to-One: Reference
▪ We can divide the fields among several documents, usually in
separate collections, and reference one document from
another.
▪ Still, the most common way to represent a one-to-one
relationship between pieces of data is to keep the fields and
their values in the same document, as described earlier.
One-to-One: Reference
▪ Example:
▪ The list of staff employees is kept within the store document.
▪ When we create the stores, we don't care about the staff
information, so we could separate out this set of information and
place it in a second collection.
▪ This adds some complexity to the model, so we should only do it
for schema optimization reasons.
One-to-One: Reference
▪ Example:
▪ Once we have retrieved a given store, we would find additional
information like the manager and staff by querying the store details
collection using the link to the corresponding document, as we do
for any relation expressed by a reference.
One-to-One: Reference
▪ Example
▪ In a one-to-one relationship, it is easy to simply use the
same value, here our store ID, in both documents.
▪ We also want to prevent this one-to-one relationship from
becoming a one-to-many relationship between a store and its
details documents.
▪ To do so, we need to ensure that the values in our store ID
field are unique in both collections.
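A minimal sketch of this one-to-one reference follows, assuming hypothetical `stores` and `store_details` collections that share the store ID. In-memory arrays stand in for the collections, and the uniqueness check is done in application code here.

```javascript
const storeDocs = [
  { _id: "s1", name: "Downtown" },
  { _id: "s2", name: "Airport" },
];
const storeDetailDocs = [
  { store_id: "s1", manager: "Alice", staff: ["Bob", "Carol"] },
  { store_id: "s2", manager: "Dan", staff: ["Eve"] },
];

// True when no value of `field` appears more than once in `docs`.
function isUnique(docs, field) {
  const values = docs.map((doc) => doc[field]);
  return new Set(values).size === values.length;
}

// Follow the shared store ID from a store to its single details document.
function detailsForStore(storeId) {
  return storeDetailDocs.find((doc) => doc.store_id === storeId) || null;
}

console.log(isUnique(storeDetailDocs, "store_id"), detailsForStore("s1").manager);
// In MongoDB itself, a unique index is one way to enforce this, for example
// db.store_details.createIndex({ store_id: 1 }, { unique: true })
// (collection and field names are assumptions).
```

If a second details document with the same `store_id` were allowed, the relationship would silently become one-to-many, which is what the uniqueness requirement prevents.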
One-to-One: Reference
▪ One-to-One with Document References:
▪ Possible performance improvements with:
▪ Less disk access
▪ Less RAM needed
One-to-Many Relationship
▪ An object of a given type is associated with n objects of a
second type, while in the opposite direction each object of the
second type can be associated with only one object of the
first type.
One-to-Many Relationship
▪ As for the many-to-many relationships, most of them
can be expressed as two one-to-many relationships.
One-to-Many Relationship
▪ Example:
▪ A person and their credit cards: a person has n credit
cards, but each of these credit cards belongs to one and
only one person.
▪ A blog entry and its comments: a blog entry may have
many comments, but each comment is related to a single
blog entry.
One-to-Many Representations
▪ Embed: usually in the most queried entity
▪ In “one” side
▪ In “many” side
▪ We can embed each document from the many side into the
document on the one side.
▪ Or vice versa, we can embed the document from the one
side into each document on the many side.
One-to-Many Representations
▪ Reference: Usually, referencing in the “many” side
▪ In “one” side
▪ In “Many” side
▪ Keep the documents in two separate collections and
reference documents from one collection in documents of the
other collection.
One-to-Many: embed, in the “one” side
▪ The first representation embeds the documents from the “many”
side as an array of subdocuments.
▪ Example:
▪ In a product catalog, we keep the top reviews of
an item within the item itself because we want to
display these reviews once the item is
retrieved from the database.
▪ Suited to simple applications where the number of
embedded documents is small
▪ For querying on the many side, we use
multikey indexes
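The product catalog example can be sketched as follows. The `top_reviews` field and the sample data are assumptions for illustration, and an in-memory array stands in for the items collection.

```javascript
// Each item embeds its top reviews as an array of subdocuments.
const catalogItems = [
  {
    _id: 1,
    title: "T-shirt",
    top_reviews: [
      { user: "u1", rating: 5, text: "Great" },
      { user: "u2", rating: 3, text: "OK" },
    ],
  },
  { _id: 2, title: "Mug", top_reviews: [{ user: "u3", rating: 4, text: "Nice" }] },
];

// Querying on the "many" side: find items that have a review by a given user.
function itemsReviewedBy(user) {
  return catalogItems.filter((item) =>
    item.top_reviews.some((review) => review.user === user)
  );
}

console.log(itemsReviewedBy("u3").map((item) => item.title));
// In MongoDB, a comparable query would be db.items.find({ "top_reviews.user": "u3" }),
// which can be supported by a multikey index on "top_reviews.user".
```

Retrieving an item returns its reviews in the same read, which is the reason for embedding on the “one” side.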
One-to-Many: embed, in the “one” side
▪ Summary: with One-to-Many embed
▪ The documents from the “many” side are embedded
▪ Most common representation for
▪ Simple applications
▪ Few documents to embed
▪ Need to process main object and the N related documents together
▪ Indexing is done on the array
One-to-many: embed, in the “many” side
▪ The second representation of one-to-many relationships is
to embed the document from the one side in each of the
documents associated with it from the many side.
▪ Example:
▪ An address and the orders delivered to that address.
▪ Multiple orders are shipped to the same address, so there is a
one-to-many relationship between the address and the orders.
One-to-many: embed, in the “many” side
▪ Result:
▪ It makes more sense to store the address, the one
side of the relationship, on every order, the many side of
the relationship, rather than the other way around.
▪ This representation is less often used.
▪ The main disadvantage of this representation is that the
embedded object must be duplicated in many locations.
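A small sketch of embedding on the “many” side, with the duplication it implies. The `shipping_address` field and the order values are hypothetical, and an array stands in for the orders collection.

```javascript
// Every order carries its own copy of the address it was shipped to.
const shippedOrders = [
  { _id: 1, total: 20, shipping_address: { street: "123 Fake Street", city: "Faketon", zip: "12345" } },
  { _id: 2, total: 35, shipping_address: { street: "123 Fake Street", city: "Faketon", zip: "12345" } },
  { _id: 3, total: 12, shipping_address: { street: "9 Other Road", city: "Faketon", zip: "12345" } },
];

// Correcting a street name has to touch every order that holds a copy.
function renameStreet(oldStreet, newStreet) {
  let updated = 0;
  for (const order of shippedOrders) {
    if (order.shipping_address.street === oldStreet) {
      order.shipping_address.street = newStreet;
      updated += 1;
    }
  }
  return updated;
}

const updatedCount = renameStreet("123 Fake Street", "123 Real Street");
console.log(updatedCount); // 2
```

Each order can be read without a second lookup, at the cost of one write per copy when the shared data changes.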
One-to-Many: reference, in the “one” side
▪ The third representation is to have two collections:
▪ From the one side, we reference the many side.
▪ To do so, we need an array of references.
▪ Example
▪ In the zips collection, which contains zip codes, each document holds
an array of stores, where each element is a store ID value that
identifies a document in the stores collection.
One-to-Many: reference, in the “one” side
▪ The third representation is to have two collections:
▪ The referencing representation with an array of references:
▪ Allows for large documents and a high count of these
▪ List of references available when retrieving the main object
▪ Cascade deletes are not supported by MongoDB and must be
managed by the application
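The array-of-references representation could look like the sketch below. Collection and field names are assumptions, arrays stand in for collections, and `deleteStoreEverywhere` is a hypothetical helper showing the kind of cleanup the application has to do itself.

```javascript
const storeCollection = [
  { _id: "s1", name: "Downtown" },
  { _id: "s2", name: "Midtown" },
  { _id: "s3", name: "Uptown" },
];
// Each zip document (the "one" side) holds an array of store IDs.
const zipCollection = [
  { _id: "10001", stores: ["s1", "s2"] },
  { _id: "10002", stores: ["s3"] },
];

// Resolve the references of one zip document into store documents.
function storesInZipCode(zipCode) {
  const zip = zipCollection.find((doc) => doc._id === zipCode);
  if (!zip) return [];
  return storeCollection.filter((store) => zip.stores.includes(store._id));
}

// Application-managed cleanup: remove the store and every reference to it.
function deleteStoreEverywhere(storeId) {
  const index = storeCollection.findIndex((store) => store._id === storeId);
  if (index !== -1) storeCollection.splice(index, 1);
  for (const zip of zipCollection) {
    zip.stores = zip.stores.filter((id) => id !== storeId);
  }
}

console.log(storesInZipCode("10001").map((store) => store.name));
```

Skipping the second step in `deleteStoreEverywhere` would leave a dangling store ID in the zip document, since nothing removes it automatically.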
One-to-Many: reference, in the “many” side
▪ More commonly, references are stored in the documents on
the many side of a one-to-many relationship.
▪ Example:
▪ We have a collection of zip codes, each zip code possibly having
many stores in it.
▪ By adding a single field, zip, to each store document, we can
reference the document in the zips collection.
▪ If we delete a store, there is no additional reference to remove,
because the reference is inside the document we are removing.
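A sketch of referencing from the “many” side, using the `zip` field named above. The sample data and helper names are hypothetical, and arrays stand in for collections.

```javascript
const zipDocs = [{ _id: "10001", city: "Sample City" }];
// Each store (the "many" side) holds a single reference to its zip document.
const storesWithZip = [
  { _id: "s1", name: "Downtown", zip: "10001" },
  { _id: "s2", name: "Midtown", zip: "10001" },
];

// Find all stores that reference a given zip code.
function storesReferencingZip(zipCode) {
  return storesWithZip.filter((store) => store.zip === zipCode);
}

// Deleting a store removes its reference with it; the zips collection is untouched.
function removeStore(storeId) {
  const index = storesWithZip.findIndex((store) => store._id === storeId);
  if (index !== -1) storesWithZip.splice(index, 1);
}

console.log(storesReferencingZip("10001").length);
```

There is no array to keep in sync on the zip document, which is why this placement of the reference is the more common one.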
Recap for the One-to-many Relationships
▪ There are a lot of choices: embed or reference, and which
side to choose between “one” and “many”
▪ Duplication may occur when embedding on the “many” side.
However, it may be OK, or even preferable
▪ Prefer embedding over referencing for simplicity, or when there is
a small number of referenced documents, as all related information
is kept together
▪ Embed on the side of the most queried collection
▪ Prefer referencing when the associated documents are not always
needed with the most often queried documents
Many-to-Many relationship
▪ The many-to-many relationship is identified by documents
on the first side being associated with many documents
on the second side, and documents on the second side
being associated with many documents on the first side.
▪ Example:
▪ Looking at products sold in stores, we can see that a given store
sells many items, and each item is sold in many stores.
▪ This relationship can trick you into thinking it is a one-to-many
relationship.
Many-to-Many ➔ 2 x one-to-Many
▪ In a normalized relational model, we can't link two tables directly as
many-to-many.
▪ An additional relationship table needs to be created to define
this relationship.
▪ It breaks the many-to-many relationship into two one-to-many
relationships linked together by the extra third table.
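The relational decomposition can be sketched with three in-memory tables. The table and column names are hypothetical; the point is only that each row of the link table sits on the “many” end of two one-to-many relationships.

```javascript
const storeTable = [{ store_id: 1 }, { store_id: 2 }];
const itemTable = [{ item_id: 10 }, { item_id: 11 }];

// The extra third table: one row per (store, item) pair.
const storeItemLinks = [
  { store_id: 1, item_id: 10 },
  { store_id: 1, item_id: 11 },
  { store_id: 2, item_id: 10 },
];

// store -> links is one-to-many.
function itemIdsSoldIn(storeId) {
  return storeItemLinks
    .filter((link) => link.store_id === storeId)
    .map((link) => link.item_id);
}

// item -> links is the other one-to-many.
function storeIdsSelling(itemId) {
  return storeItemLinks
    .filter((link) => link.item_id === itemId)
    .map((link) => link.store_id);
}

console.log(itemIdsSoldIn(1), storeIdsSelling(10));
```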
Many-to-Many ➔ 2 x one-to-Many
▪ MongoDB's flexible schema easily allows for this type of
schema modification and is more forgiving, as a scalar field
can easily be transformed into an array field.
▪ Example:
▪ People and phone numbers.
▪ Someone may have a few phone numbers.
▪ Some of these phone numbers are exclusive, and some are not.
▪ A family shares the home phone number. So a person can have
many phone numbers, and a phone number may be owned by
many people, resulting in a many-to-many relationship.
Many-to-Many ➔ 2 x one-to-Many
▪ Example (cont.):
▪ We can treat the phone number for a home as uniquely owned
by each member of the family by making copies of it.
▪ Now we have a one-to-many relationship instead, which
removes complexity.
Many-to-Many ➔ 2 x one-to-Many
▪ Some issues:
▪ If the family moves, we must modify each family member's phone
number separately.
▪ Performing the same update multiple times may not sound like the
right design.
▪ However, if we store only one telephone number value in the database
and someone moves (the child, for example) and updates their
phone number, the update will apply to all the members of the family.
▪ With a single shared value, we don't get the option to choose between
doing multiple updates or one update that applies to all.
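The trade-off between the two designs can be sketched as follows. All names and numbers are made up for illustration: the first structure copies the home number into each person, the second shares a single phone document by reference.

```javascript
// Design 1: each person owns a copy of the home number (one-to-many).
const peopleWithCopies = [
  { _id: 1, name: "Parent", phones: ["555-0100", "555-0199"] },
  { _id: 2, name: "Child", phones: ["555-0100"] },
];

// Design 2: one shared phone document referenced by several people (many-to-many).
const sharedPhones = [{ _id: "home", number: "555-0100" }];
const peopleWithRefs = [
  { _id: 1, name: "Parent", phone_ids: ["home"] },
  { _id: 2, name: "Child", phone_ids: ["home"] },
];

// With copies, an update touches only the person who moved.
function changeCopiedNumber(personId, oldNumber, newNumber) {
  const person = peopleWithCopies.find((doc) => doc._id === personId);
  person.phones = person.phones.map((n) => (n === oldNumber ? newNumber : n));
}

// With a shared document, one update is seen by everyone who references it.
function changeSharedNumber(phoneId, newNumber) {
  sharedPhones.find((doc) => doc._id === phoneId).number = newNumber;
}

// Resolve a person's references into actual phone numbers.
function numbersOf(personId) {
  const person = peopleWithRefs.find((doc) => doc._id === personId);
  return person.phone_ids.map(
    (id) => sharedPhones.find((doc) => doc._id === id).number
  );
}
```

Calling `changeCopiedNumber(2, "555-0100", "555-0123")` leaves the parent's copy unchanged, while `changeSharedNumber("home", "555-0177")` changes what `numbersOf` returns for every family member.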
Many-to-Many Representations
1. Embed
a. Array of subdocuments in one “many” side
b. Array of subdocuments in the other “many” side
Usually, only the most queried side is considered

2. Reference
a. Array of references in one “many” side
b. Array of references in the other “many” side
Many-to-Many: embed, in the main side
▪ Example:
▪ Let's use the carts and items from our product catalog.
▪ The main entity is the cart, in which we want to find the items.
▪ We embed the items in the cart because we always retrieve this
information together.
▪ Having copies of items in the carts is usually fine, because
they represent the state of those items at the time they were added to
the cart.
▪ The same applies to addresses and orders.
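The cart example can be sketched like this. The field names and the price are assumptions, and `addToCart` is a hypothetical helper that copies the item's current state into the cart.

```javascript
// Source collection for items; it exists independently of any cart.
const itemCollection = [{ _id: 10, title: "Green MongoDB T-shirt", price: 20 }];

// Copy the current state of the item into the cart as a subdocument.
function addToCart(cart, itemId, quantity) {
  const item = itemCollection.find((doc) => doc._id === itemId);
  if (!item) throw new Error("unknown item " + itemId);
  cart.items.push({
    item_id: item._id,
    title: item.title,
    price: item.price,
    quantity: quantity,
  });
}

const sampleCart = { _id: 1, items: [] };
addToCart(sampleCart, 10, 2);

// A later price change in the catalog does not alter the copy in the cart.
itemCollection[0].price = 25;
console.log(sampleCart.items[0].price, itemCollection[0].price); // 20 25
```

The embedded copy acts as a snapshot, which is the behaviour wanted for carts and for addresses on orders.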
Many-to-Many: embed, in the main side
▪ Example
▪ The address used for an order at the time of the order's creation
should be duplicated in the order.
Many-to-Many: embed, in the main side
▪ Example
▪ We need to keep a collection for the items.
▪ Several other access patterns in the application use items
without needing information about the orders they have been
added to.
▪ Item documents have a different life cycle than cart documents.
▪ An item may exist without being in any cart.
▪ This requirement of keeping a source for the embedded documents
applies only to this specific representation of the many-to-many
relationship.
Many-to-Many: reference, in the secondary side
▪ The second representation of the many-to-many
relationship that uses references is the one where we keep
the references in the other collection.
▪ Example:
▪ Each store has a field called items sold that carries a list of references
to the items sold in that store.
▪ When we retrieve an item, we still don't know where it is sold.
▪ We need a second query to get this information, which was not the
case in the previous representation.
▪ A query on the items sold field returns the stores in which the item
with an ID of 10, the green MongoDB T-shirt, is sold.
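Such a lookup could be sketched as follows. The field name `items_sold`, the store names, and the second item are assumptions; arrays stand in for the stores and items collections.

```javascript
const soldItems = [
  { _id: 10, title: "Green MongoDB T-shirt" },
  { _id: 11, title: "Mug" },
];
// Each store carries an array of references to the items it sells.
const sellingStores = [
  { _id: "s1", name: "Downtown", items_sold: [10, 11] },
  { _id: "s2", name: "Airport", items_sold: [11] },
  { _id: "s3", name: "Mall", items_sold: [10] },
];

// Second query: after retrieving an item, find the stores that reference it.
function storesSellingItem(itemId) {
  return sellingStores.filter((store) => store.items_sold.includes(itemId));
}

console.log(storesSellingItem(10).map((store) => store.name));
// A comparable MongoDB query would be db.stores.find({ items_sold: 10 }),
// which matches documents whose array contains the value 10.
```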
Recap for the Many-to-Many Relationships
▪ Ensure it is a “many-to-many” relationship that should not be
simplified
▪ A “many-to-many” relationship can be replaced by two “one-to-many”
relationships, but does not have to be with the document model
▪ Prefer embedding on the most queried side
▪ Prefer embedding for information that is primarily static over time
and may profit from duplication
▪ Prefer referencing over embedding to avoid managing duplication
Recap for the One-to-One Relationships:
▪ Prefer embedding over referencing for simplicity
▪ Use subdocuments to organize the fields
▪ Use a reference for optimization purposes
One-to-Zillions Relationship
▪ In previous lessons, we used the word zillions and
introduced a graphical notation for representing one-to-zillions
relationships.
▪ We extended the crow's foot notation by adding fingers to the
foot in order to easily see the zillions side.
▪ We should also use the notation used in this course for
cardinalities.
One-to-Zillions Relationship
▪ It is one thing to identify a relationship as one-to-zillions, but
better still to quantify that relationship:
▪ What is the minimum number of associated documents?
▪ What are the most likely number and the maximum number?
▪ The maximum number is what we care about most in this
relationship.
▪ Zillions means something is humongous, out of proportion, so
watch out for it.
One-to-Zillions: reference, in the “Zillions” side
▪ Given the cardinality of these relationships and the pressure
on computational resources to process them, you need to be
on the lookout for very large arrays of subdocuments or
unbounded arrays of references.
▪ Looking back at the representations for the one-to-many
relationships, we have a single one left: the representation
where we reference the document on the one side of the
relationship from the many, or zillions, side.
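A small sketch of this representation follows, using a hypothetical case that is not from the slides: one user with a very large number of event documents. Each event references its user, and the code processes events in fixed-size batches rather than collecting them into one unbounded array.

```javascript
// The "one" side holds no array of events at all.
const accountUsers = [{ _id: "u1", name: "Sample User" }];

// The "zillions" side: every event references its user (1000 events, half for u1).
const activityEvents = [];
for (let i = 0; i < 1000; i++) {
  activityEvents.push({ _id: i, user_id: i % 2 === 0 ? "u1" : "u2", type: "click" });
}

// Yield the events of one user in batches of at most `batchSize` documents.
function* eventBatches(userId, batchSize) {
  let batch = [];
  for (const event of activityEvents) {
    if (event.user_id !== userId) continue;
    batch.push(event);
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch;
}

let totalEvents = 0;
let batchCount = 0;
for (const batch of eventBatches("u1", 200)) {
  totalEvents += batch.length;
  batchCount += 1;
}
console.log(totalEvents, batchCount); // 500 3
```

The user document stays small no matter how many events exist, and the batching reflects the extra care needed in code that walks the “zillions” side.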
Recap for the One-to-Zillions Relationships
▪ It is a particular case of the one-to-many relationship.
▪ The only available representation is to reference the
document on the “one” side of the relationship from the
“zillions” side.
▪ Pay extra attention to queries and code that handle “zillions”
of documents.
