The NoSQL-Relational Comparison Series Continues- Relational SQL vs MongoDB Methods

Aaron Boissonnault

When we compare relational and NoSQL Systems, one of the critical analyses we have to perform is data access mechanisms. As we’ll learn over the next few articles of this series on data access, the SQL language used by relational database management systems is much more feature rich and powerful than its NoSQL counterpart. This statement isn’t intended to sway readers to relational systems, it is just the author’s evaluation of both systems’ access languages.

The acronym CRUD stands for Create, Read, Update and Delete. They are the operations used to add, retrieve, manipulate and remove data stored in database management systems. Relational databases use an extremely powerful, industry-standard Structured Query Language (SQL) to perform data interaction operations.

Their NoSQL counterparts provide simpler APIs and languages to store and access data. Most of the NoSQL product offerings have unique languages and mechanisms that provide CRUD capabilities. In this installment of our relational to NoSQL comparison series, we’ll compare SQL CRUD operations with MongoDB’s access methods. Our discussion will include data manipulation as well as advanced query constructs that include joins, or lack thereof. In the next article of this series, we’ll take a look at data manipulation and removal.

Relational SQL

Relational’s Structured Query Language, or SQL, is a language specifically designed to interact with data stored in a relational database management system. The language’s foundation is based upon algebra which is a branch of mathematics in which symbols are used to represent numbers or members of a specified set. The symbols represent quantities and general relationships for the set’s members. The set is combined with binary operations that are defined on the set. EF CODD, the father of relational database management systems, first defined relational algebra’s form in 1970. The SQL language can be further decomposed into:

Data Definition Language (DDL) Used to build schemas and related storage objects in the database management system
Data Manipulation Language (DML) Statements that allow applications to interact with database data – SELECT, INSERT, UPDATE, DELETE. Also known as CRUD operations
Data Control Language (DCL) Statements used to control security within the database. Who can access what

Over the years, SQL has become the standard language used to interact with relational databases. It has been adopted as a standard by various organizations that include the American National Standards Institute (ANSI) and the International Standards Organization that is affiliated with the International Electrotechnical Commission (ISO-IEC). The intent of the standardization is to make the language portable between systems. It has become a common language used to interact with all relational database management systems (RDBMS). Since all major RDBMS offerings provide support for the SQL language, developers and DBAs can easily transfer their skills from one relational database to another. We need to use the term “portable” somewhat loosely, in that all database vendors offer product-specific enhancements to the language. Product-specific enhancements are most often not portable between the vendor offerings.

MongoDB Methods

MongoDB provides methods and commands to administer the database environment and interact with stored data. Methods can be accessed from the abstraction layer, and a subset of them are wrappers for a command or set of commands. In this article, we’ll keep the focus on the MongoDB methods that the application layer uses to interact with data stored in a MongoDB database. Before we begin our comparison of MongoDB and SQL query statements, a few definitions and comparisons are in order. Since I’m an Oracle DBA, I’ll use that product as my representative for the relational systems.

Relational (Oracle)	NoSQL (MongoDB)
Database	Database
Table	Collection
Row	Document
Column	Field
Index	Index
Primary Key	_ID field
Join	Reference or Embedded Document
Check Constraint	Validation Rules

Operation	Relational (Oracle)	NoSQL (MongoDB) Method
Query Data	SELECT	db.collection.FIND() – where collection is the name of the collection being searched
Insert Data	INSERT	db.collection.insertOne () – inserts single document db.collection.insertMany() – inserts multiple documents db.collection.insert() – inserts one or more documents
Update Data	UPDATE	db.collection.updateOne() – updates single document db.collection.updateMany() – updates multiple documents db.collection.replaceOne() – replaces a single document db.collection.update() – updates one or more documents
Delete Data	DELETE	db.collection.deleteOne() – deletes single document db.collection.deleteMany() –deletes multiple documents db.collection.remove() – deletes one or more documents

Term	Definition
Projection	Clause that limits the relational columns or MongoDB fields being returned by the query
Selection	Clause that limits the relational rows or MongoDB documents being returned by the query
Cursor	Control structure that allows an application to traverse the results returned by a given query

Let’s take a look at a basic query:

Relational
SELECT empno, ename	Projection
FROM hr. employees	Schema Object
WHERE salary > 35000	Selection

MongoDB
db.employees.find(	Schema Object
{ salary: {$gt: 35000 } }	Selection
{ empno: 1, ename: 1 }	Projection

Both Oracle and MongoDB use query operators to enable simple and complex data selection to occur:

Operator	Oracle	MongoDB
Equal to	=	$eq
Greater than	>	$gt
Greater than or equal to	>=	$gte
Less than	<	$lt
Less than or equal to	<=	$lte
Not equal to	!=	$ne
Matches and values in array	IN	$in
Does not match values in array	NOT IN	$nin
Matches conditions of either clause	OR	$or
Matches conditions of both clauses	AND	$and
Does not match query expression	NOT	$not
Does not match both query expressions	NOT	$nor
String pattern searches	LIKE	$regex
Lack of a value	NULL	NULL*
Matches documents with specified field	N/A	$exists
Matches documents with specified type	N/A	$type

I purposely didn’t include all possible operators as both databases have literally dozens. Oracle, by far, is the clear winner when it comes to the breadth and depth of operators available. This article is intended to be a general comparison of NoSQL and relational systems data access mechanisms, not an in-depth analysis of SQL, NoSQL syntax. I’ll provide a more in-depth comparison in the last article of the Relational vs NoSQL data access series. Here’s an example of sorting the results being returned:

Relational
SELECT empno, ename, address	Projection
FROM hr. employees	Schema Object
WHERE salary > 35000	Selection
ORDER BY empno	Sorting Results (use DESC keyword for descending)

MongoDB
db.employees.find(	Schema Object
{ salary: {$gt: 35000 } }	Selection
{ empno: 1, ename: 1 } )	Projection
.sort ( { empno: 1 } )	Sorting Results (use -1 keyword for descending)

Relational Joins

A SQL join clause combines columns from two or more relational tables. Attributes that are common to both tables are used to relate the objects together. Let’s take a look at an Employee and Department table as an example. Our sample Employee table stores information related to personnel working in our organization. The table stores their employee number, name, job and department identifiers. The DEPTNO column in the employee’s row contains the number of the department that the employee is assigned to. But, there may be other attributes that further describe a given department that we want to store in our database. Departments have a department number, but they also have department name and location attributes.

Employee Table
EMPNO	Employee’s Identification Number
ENAME	Employee’s Name
JOBNO	Employee’s Job Number
MGRNO	Employee Manager’s Number
DEPTNO	Department Employee Works In
…	Additional Information for Employee
Department Table
DEPTNO	Department Number
DEPTNAME	Department Name
DEPTLOC	Location of Department
…	Additional Information for Department

When we want to retrieve information from both tables that is related to a given employee, we would use a SQL join operation that combines, or joins, both tables together. SQL supports various join syntaxes. Here’s some of the more common join clauses:

Inner Join – Join between two tables with an explicit join clause. Returns data based on matching values contained in the columns specified in the join clause
Left Outer Join – Join between two tables with an explicit join clause but also returns unmatched rows from first table
Right Outer Join – Join between two tables with an explicit join clause but also returns unmatched rows from second table

The SQL statement below retrieves the names of our employees and the departments they work in. We are joining the two tables together using the DEPTNO column which is an attribute that is common to both tables. Information from both tables will be returned to our application when the DEPTNO column from our Employee table matches the value from the DEPTNO column in our Department table.

SELECT employee.ename, department.deptname

FROM employee

INNER JOIN department

ON employee.deptno = department.deptno

Normalization and Relational Schema Design

Here is where relational schema normalization becomes important. Normalization is the process that has the goal of storing a fact in its most appropriate place. No schema design is perfect, but the goal of normalization is to store a fact in one place to minimize update, delete and insert anomalies. The more times a fact is duplicated in the schema, the more challenging it is to ensure its validity. The graphic below depicts our sample Employee table. In a denormalized model, when we need to change the location (DEPTLOC column) for Department 10 from “Pittsburgh” to “New York,” we are required to find and update every row that has a DEPTNO=10. The application would be required to change all rows of the employee table that have that DEPTNO value, in this case, FOOT and ALLEN. The second table is our newly created Department table. If we move all of the related attributes for a department to a separate table, we are able to more easily ensure their quality. When our applications need to update information related to a department, the applications don’t have to update multiple rows, and oftentimes multiple tables depending on schema design, to ensure all information related to a department is consistent.

MongoDB Embedded Documents, Manual References and DBRefs

MongoDB, like the majority of NoSQL offerings, does not support join operations. However, MongoDB does provide a few alternatives. I’ll provide examples of the first two, which are the most common implementations:

Embedded documents

The data is denormalized. In our example, the employee’s department information would be stored as an embedded document in our Employee document. If any information related to our departments needs to be changed, all employee documents would have to be searched and updated accordingly.

{   empno: 7388,
ename: ‘Foot’,
jobno: 87,
mgrno: 6599,
hiredate: new Date(‘Dec 12, 2015’),               
            department:
{ deptno: 10,
deptname: ‘sales’,
depltloc: ‘Pittsburgh’
}
}

Manual references Users would save the _ID field of the Department document in the Employee document. The application would execute a second query using the Department’s _ID field to return the related data from the Document collection. The application would have to maintain the _ID fields in the Employee documents programmatically to ensure their validity.

{   empno: 7388,
ename: ‘Foot’,
jobno: 87,
mgrno: 6599,
hiredate: new Date(‘Dec 12, 2015’),
       department_id_locator: <object ID of Department 10 Document>
}

DBRefs

DBRefs are references from one document to another using the value of the first document’s _id field, collection name, and, optionally, its database name. By including these names, DBRefs allow documents located in multiple collections to be more easily linked with documents from a single collection. To resolve DBRefs, our application must perform additional queries to return the referenced documents. Some MongoDB drivers have helper methods that form the query for the DBRef automatically.

$LOOKUP (Aggregation Stage)

MongoDB 3.2 introduces the $lookup operator that can now be included as a stage in an aggregation pipeline. $lookup performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing. The $lookup stage does an equality match between a field from the input documents with a field from the documents of the “joined” collection.

Wrap-up

In the next article of the series, we’ll discuss data manipulation and removal. We’ll finish the data access series with an overview of the strengths and weaknesses of both relational and NoSQL systems. We’ll then move forward with our Relational-NoSQL comparison series by looking at transaction support and locking mechanisms. Thanks for reading!

Cassandra as distributed NoSQL cloud database across multiple data centers

Blog

by Megan Ferringer

Database High Availability in Azure

Blog

by Megan Ferringer

About the Author

Aaron Boissonnault

Aaron is the Chief Information Security Officer at Navisite and is responsible for security and compliance programs and the delivery of our managed security and advisory services to help customers implement and maintain the highest levels of cybersecurity protection and best practices. Aaron has more than 18 years of experience in the industry, holding various leadership and consulting roles in cloud security, infrastructure security and security operations.

The NoSQL-Relational Comparison Series Continues- Relational SQL vs MongoDB Methods

Cassandra as distributed NoSQL cloud database across multiple data centers

Database High Availability in Azure

Categories

About Us

Follow Us & Share

Press Releases