Browse Definitions :
serverless database NoSQL (Not Only SQL database)
X
Definition

graph database

What is a graph database?

A graph database, also referred to as a semantic database, is a software application designed to store, query and modify network graphs. A network graph is a visual construct that consists of nodes and edges. Each node represents an entity, such as a person, and each edge represents a connection or relationship between two nodes.

Graph databases are a type of database design that has been around in some variation for a long time. As an example, a family tree is a simple graph database.

The concept behind graphing a database is often credited to 18th-century mathematician Leonhard Euler. The concept of using databases to map relationships digitally became popular in business around 2015 when increased compute power, in-memory computing and agreed-upon standards moved the concept from academics to real-world uses in business and enterprise computing.

Graph databases store and represent data in the form of nodes, edges and properties. Where nodes represent entities and edges represent the relationships between entities, properties are the attributes associated with nodes and edges that provide additional context. Nodes are stored as records, with edges and properties represented as pointers between nodes.

Graph databases are well suited for analyzing interconnections, which is why they're being used to mine data from social media. Graph databases are also useful for working with data in business disciplines that involve complex relationships and dynamic schema, such as supply chain management, identifying the source of an Internet Protocol telephony issue and creating "Customers who bought this also looked at ..." recommendation engines.

How do graph databases work?

Graph databases are structured using a graph model -- different from traditional relational databases. Graph databases are unique in their structure, how they manage and store information, and how they're queried.

The structure of a graph database

Graph databases work using a graph model, which consists of nodes, edges and properties that form the graph structure. Nodes represent entities like people, places or things, and edges represent relationships between entities. Properties are the attributes -- stored as additional metadata -- associated with edges and nodes.

Traditionally classified as a type of NoSQL database, graph databases exist in several types, such as triple stores, for example. This type of database uses a special index that stores information about nodes, edges and the relationships between them in groups of three.

A triple, which is also referred to as an assertion, has three main fields: a subject, a predicate and an object. Each subject, predicate or object is represented by a unique resource identifier (URI).

How information is indexed

In a triple store, the first field in the database holds the URI for the subject, the second field holds the URI for the predicate, and the third field holds the URI for the object. While there are several strategies that graph databases can use for storing triples, most use an index that abbreviates the three primary fields to ?s, ?p and ?o.

For example, if the visual construct for a graph is given as follows:

An example of a visual construct for a graph database.
Graph models consist of nodes, edges and properties that form a graph.

Then the index will look like this:

Row ?s ?p ?o
1 :Chris :marriedTo :Sheila
2 :Chris :brotherOf :Tony
3 :Chris :listensTo :RockMusic
4 :Sheila :listensTo :RockMusic
5 :Sheila :sisterInLawTo :Tony
6 :Tom :worksFor :XYZ
7 :Tom :hasPet :Rover
8 :Chris :drives :Audi
9 :Chris :colleagueOf :Tom
10 :Chris :worksFor :XYZ

Nodes might also be stored in clusters. Clustering in graph databases is the process of identifying groups, also called communities, of nodes that show adjacent connections in a graph. These node clusters are closely related to each other, compared with nodes that aren't in the cluster.

How information in a graph database is queried

Graph algorithms are used to analyze the relationships of interconnected graph data. They perform tasks like finding patterns, shortest connected paths and distance between vertices, as well as the vertices' importance and clustering.

Graph query languages are programming languages used to interact with graph databases. They let users retrieve, manipulate and analyze graph data. Graph query languages have features that enable users to add, edit and query data, and can efficiently process complex queries. For example, the ISO/IEC 39075:2024 standard, published in April 2024, outlines data structures and basic operations on property graphs.

Each triple in a graph database only gets stored once in the index. Just like relational databases, it's a simple process to do a straight lookup query in a graph database. If the query is for what information is known about Chris, the indexer programming only needs to search rows 1-3 of the database.

The real power and speed of a graph database come from indexing combinations of triples. The following are a few examples:

  • If the query is for who Chris is married to, the indexer will look for the predicate :marriedTo in rows 1-3 and then retrieve the matching object. Chris is married to Sheila.
  • If the query is to identify everyone who listens to the same kind of music as Chris, the indexer will first ask { :Chris :listensTo ?o } and identify :RockMusic as the object.

In the second query, the results will return :RockMusic in rows 3 and 4. The subject in row 3 is Chris himself, so whoever is the subject in row 4 will be the other person who listens to rock music. It turns out to be Sheila, Chris' wife.

Types of graph databases

There are several usable types of graph databases, including the following:

  • Knowledge graphs. Knowledge graphs like the one above focus on the semantic aspects of data and store information in triples.
  • Property graphs. Property graphs support nodes, edges and model relationships among data points, with detailed information about the subject and how that data interrelates.
  • Resource Description Framework graphs. RDF graphs express data in graphs using object, predicate and subject parts. They're used to manage interlinking resources and metadata, and conform to World Wide Web Consortium standards.
  • Triple store. Triple store graph data models store data in the triple format consisting of a subject-predicate-object data structure. They're used for storage and querying of data, including semantic relationships.
  • Hypergraph databases. Hypergraph models enable hyperedges, which are relationships that connect two or more nodes. They're used with graphs that have more complex many-to-many relationships.

Use cases for graph databases

Current use cases for graph databases include the following:

  • Enabling data analysts to federate data sets without having to create and run complex queries that join combinations of tables together, as in the relational database model.
  • Helping developers create the back end for voice assistants by mapping possible user questions to correct answers.
  • Examining direct connections to identify potential indirect connections for recommendation engines.
  • Detecting cases of fraud by quickly analyzing patterns and connections between entities such as users, accounts or transactions, for example. Anomalous behavior is identified by detecting unusual patterns in the graph.
  • Modeling relationships between interactions on social media platforms, such as users, posts, comments and likes. This enables more efficient querying and analysis of social connections and more personalized social media features.
A chart comparing graph databases to relational databases.
From nodes and edges to their top use cases, graph databases differ from relational databases in several ways.

Future of graph databases

Graph database tools like Amazon Neptune and Neo4j are becoming increasingly available. In addition, they offer benefits such as enabling more flexible data modeling, efficient querying, real-time insights and simplified data integration. As such, it's becoming easier for developers to add graph analytics to existing applications while still maintaining data integrity, scalability and consistency.

Graph databases are expected to play an increased role in areas such as machine learning, Bayesian analysis, data science and artificial intelligence, as well as help to manage enterprise data and data interchange over the next decade.

Another potential impact on this type of database is improvements in data federation. When knowledge graphs can be easily federated, one database will be able to determine that it needs data it doesn't have and automatically retrieve that data from other knowledge graphs. With this ability, it's possible that federation can help developers create blockchains that use relevant metadata to authenticate transactions in banking, finance, voting and smart contracts.

Graph databases excel at creating data and analytics innovations. Learn more about Gartner's predictions for the future of graph databases.

This was last updated in May 2024

Continue Reading About graph database

Networking
  • subnet (subnetwork)

    A subnet, or subnetwork, is a segmented piece of a larger network. More specifically, subnets are a logical partition of an IP ...

  • secure access service edge (SASE)

    Secure access service edge (SASE), pronounced sassy, is a cloud architecture model that bundles together network and cloud-native...

  • Transmission Control Protocol (TCP)

    Transmission Control Protocol (TCP) is a standard protocol on the internet that ensures the reliable transmission of data between...

Security
  • cyber attack

    A cyber attack is any malicious attempt to gain unauthorized access to a computer, computing system or computer network with the ...

  • digital signature

    A digital signature is a mathematical technique used to validate the authenticity and integrity of a digital document, message or...

  • What is security information and event management (SIEM)?

    Security information and event management (SIEM) is an approach to security management that combines security information ...

CIO
  • product development (new product development)

    Product development -- also called new product management -- is a series of steps that includes the conceptualization, design, ...

  • innovation culture

    Innovation culture is the work environment that leaders cultivate to nurture unorthodox thinking and its application.

  • technology addiction

    Technology addiction is an impulse control disorder that involves the obsessive use of mobile devices, the internet or video ...

HRSoftware
  • organizational network analysis (ONA)

    Organizational network analysis (ONA) is a quantitative method for modeling and analyzing how communications, information, ...

  • HireVue

    HireVue is an enterprise video interviewing technology provider of a platform that lets recruiters and hiring managers screen ...

  • Human Resource Certification Institute (HRCI)

    Human Resource Certification Institute (HRCI) is a U.S.-based credentialing organization offering certifications to HR ...

Customer Experience
  • contact center agent (call center agent)

    A contact center agent is a person who handles incoming or outgoing customer communications for an organization.

  • contact center management

    Contact center management is the process of overseeing contact center operations with the goal of providing an outstanding ...

  • digital marketing

    Digital marketing is the promotion and marketing of goods and services to consumers through digital channels and electronic ...

Close