NoSQL database types explained: Document-based databases

NoSQL document-based databases store information in documents with specific keys, similar to a key-value store, but with different benefits and disadvantages.

Scientists are problem solvers. Some technological advancements happen by chance, and others are the result of deliberate research. Either way, each one is a means to an end: a solution to a problem.

NoSQL databases were created to overcome certain limitations of relational databases. Most of these limitations stem from the rigid table schema that relational databases impose.

Major advantages of NoSQL databases are their performance and scalability; they are typically quick to process the data they store. NoSQL databases also lift another limitation: they can store a wider variety of data types.

What are document-based databases?

A document-based database, also known as a document store, stores information in XML, YAML or JSON documents, or in binary documents such as BSON. To organize these documents into a whole, each document is assigned a specific key. This characteristic makes document stores similar to key-value stores.

Even though document stores do not have a unified schema, their data is usually organized so that it is easy to use and, eventually, to analyze. This means they are structured, to an extent. Because each object is commonly stored in a single document, there is usually no need to define relationships between documents.

These documents are unlike the tables of a relational database: They do not have a set number of fields, and there are no empty slots -- missing information is simply omitted rather than stored as an empty value. Data can be added, edited, removed and queried.

The keys assigned to each document are unique identifiers required to access data within the database, usually a path, string or Uniform Resource Identifier. IDs tend to be indexed in the database to speed up data retrieval. 
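The ideas above can be illustrated with a minimal sketch. It is not how any particular product is implemented: the `store` dictionary, the `put` and `get` helpers and the `users/...` keys are all hypothetical names chosen for illustration. It shows two JSON documents with different fields living side by side, each reachable only through its unique key.

```python
import json

# Hypothetical in-memory "document store": unique key -> JSON document.
store = {}

def put(key, document):
    """Serialize a document to JSON and store it under its unique key."""
    store[key] = json.dumps(document)

def get(key):
    """Look a document up by key and parse it back into a dict."""
    return json.loads(store[key])

# Two documents in the same store with different sets of fields.
put("users/1", {"name": "Ada", "email": "ada@example.com"})
put("users/2", {"name": "Grace", "phone": "555-0100", "languages": ["COBOL"]})

print(get("users/2")["languages"])  # ['COBOL']
print("phone" in get("users/1"))    # False: the missing field is simply omitted
```

A Python dictionary gives fast lookups by key, which loosely mirrors the role of an indexed ID in a real document store.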

The content of documents within a document store is classified using metadata. Due to this feature, the database "understands" what class of information it holds -- whether a field contains addresses, phone numbers or social security numbers, for example. For improved efficiency and user experience, document stores provide a query language, which allows you to query documents based on their metadata or their content. For example, you can retrieve all documents that contain the same value in a certain field.
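As a rough illustration of querying by content, the sketch below scans a small set of documents for a matching field value. The `find` function and the sample data are assumptions made for this example; real document stores expose their own query languages and use indexes rather than a full scan.

```python
# Hypothetical sample documents, keyed by unique ID.
documents = {
    "users/1": {"name": "Ada", "city": "London"},
    "users/2": {"name": "Grace", "city": "New York", "phone": "555-0100"},
    "users/3": {"name": "Alan", "city": "London"},
}

def find(docs, field, value):
    """Return the keys of all documents whose `field` equals `value`.

    Documents that lack the field are skipped instead of raising an error.
    """
    return [key for key, doc in docs.items()
            if field in doc and doc[field] == value]

print(find(documents, "city", "London"))     # ['users/1', 'users/3']
print(find(documents, "phone", "555-0100"))  # ['users/2']
```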

Amazon has provided the following terminology comparison between SQL and a document database, MongoDB. The following list helps draw a parallel between the two types of databases:

  • SQL table: MongoDB collection
  • SQL row: MongoDB document
  • SQL column: MongoDB field
  • SQL primary key: MongoDB ObjectId
  • SQL index: MongoDB index
  • SQL view: MongoDB view
  • SQL nested table or object: MongoDB embedded document
  • SQL array: MongoDB array

Advantages

One of the top priorities in any business is making the most of the time and resources given, increasing overall efficiency. Selecting the right database based on its purpose and the type of data collected is a crucial step. The following features of a document store could make it the right choice for you:

  • Flexibility. Document stores share the best-known advantage of all NoSQL databases: a flexible structure. As mentioned previously, the documents in one database do not need a consistent structure. They do not have to be of the same type, nor do they have to contain the same fields.
  • Easy to update. In a relational database, a new piece of information means a new column, and that column then applies to every row in the table to maintain its unified structure. With document stores, you can add new pieces of information to one document without having to add them to all existing documents.
  • Improved performance. Rather than pulling data from multiple related tables, you can find everything you need within one document. With everything kept in a single location, it is often much faster to reach and retrieve the data.
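The contrast drawn in the list above can be sketched with Python's standard library. The relational side uses an in-memory SQLite database with two hypothetical tables, `users` and `addresses`; the document side uses a plain dictionary of JSON strings as a stand-in for a document store. The table names, keys and sample values are assumptions made for illustration only, and the sketch says nothing about real-world timings.

```python
import json
import sqlite3

# Relational side: the same information is split across two related tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (user_id INTEGER, city TEXT);
    INSERT INTO users VALUES (1, 'Ada');
    INSERT INTO addresses VALUES (1, 'London');
""")
row = conn.execute(
    "SELECT u.name, a.city FROM users u "
    "JOIN addresses a ON a.user_id = u.id WHERE u.id = 1"
).fetchone()
relational_result = {"name": row[0], "city": row[1]}

# Document side: everything about the user sits in one document.
doc_store = {
    "users/1": json.dumps({"name": "Ada", "address": {"city": "London"}}),
}
doc = json.loads(doc_store["users/1"])
document_result = {"name": doc["name"], "city": doc["address"]["city"]}

print(relational_result == document_result)  # True

# Easy to update: add a field to this one document only.
doc["nickname"] = "Countess"
doc_store["users/1"] = json.dumps(doc)
```

On the relational side, the equivalent change would be an `ALTER TABLE ... ADD COLUMN` statement, which affects every row in the table.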

Disadvantages

Regardless of their size, document stores are only semistructured, which makes them simple compared to relational databases. If you jeopardize the simplicity of a document store, you also jeopardize the improved performance mentioned previously. You can create references between documents by interlinking them, but doing so can create complex systems and deprive you of fast performance.
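A small sketch can show why interlinked documents cost more to read. Here a hypothetical blog post refers to its author and comments by key, and the `fetch` helper counts how many separate lookups are needed to assemble the full post. All names and data are assumptions for illustration; the lookup count stands in for round trips to a real database.

```python
# Hypothetical documents that reference one another by key.
documents = {
    "posts/1": {"title": "Hello", "author": "users/1",
                "comments": ["comments/1", "comments/2"]},
    "users/1": {"name": "Ada"},
    "users/2": {"name": "Grace"},
    "comments/1": {"text": "Nice post", "author": "users/2"},
    "comments/2": {"text": "Thanks", "author": "users/1"},
}

lookups = 0

def fetch(key):
    """Return one document by key and count the lookup."""
    global lookups
    lookups += 1
    return documents[key]

def load_post(key):
    """Assemble a post by following every reference it contains."""
    post = dict(fetch(key))
    post["author"] = fetch(post["author"])
    resolved = []
    for comment_key in post["comments"]:
        comment = dict(fetch(comment_key))
        comment["author"] = fetch(comment["author"])
        resolved.append(comment)
    post["comments"] = resolved
    return post

post = load_post("posts/1")
print(lookups)  # 6 lookups, versus 1 if everything were embedded in the post
```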

If you have a large-volume database and you would like to create a network of mutually referenced data, you are better off finding a way of mapping it and fitting it into a relational database. 

Use cases of document-based databases

A document store is an easy way to store and retrieve information contained within documents and pertaining to a single object. For this reason, this type of database is convenient for user profiles where you have information about a single user and the user chooses what to provide and how to do it. 

Like certain other NoSQL database types, it is also convenient for content management and user-generated content such as blogs.

Most popular document-based databases

MongoDB, which appears in the terminology comparison earlier in this article, is one of the most popular document stores. Others include MongoDB Atlas, Amazon DynamoDB, Google Cloud Firestore and Couchbase Server.

All NoSQL databases have something in common. They are usually designed to fit specific purposes and accommodate requests that a relational database would have a hard time fulfilling. With a document store, you can make the most of your data without spending time placing it in the tables of a relational database and mapping out the relationships between those tables. Simply store it as it arrives from the existing data source and retrieve it efficiently.
