neo4j

Understanding relationships ..

neo4j

Neo4j is a leading graph database management system designed to store, manage, and query highly connected data.

Unlike traditional relational databases that organize data in tables, Neo4j structures data as a graph of nodes, relationships, and properties.

Neo4j stores data in a property graph model where relationships are first-class citizens.

This makes traversing connections between data points efficient, a significant advantage over relational databases, which must reconstruct the same connections with join operations at query time.
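To make the property graph model more concrete, here is a minimal in-memory sketch in Python. It is not how Neo4j stores data internally. The sample people and cities, and the helper names `outgoing` and `residents`, are invented for illustration. Each node carries labels and properties, each relationship has a type and its own properties, and following a connection is a direct lookup from the node rather than a join.

```python
from collections import defaultdict

# Hypothetical sample data, invented for this sketch.
nodes = {
    1: {"labels": {"Person"}, "props": {"name": "Alice"}},
    2: {"labels": {"Person"}, "props": {"name": "Bob"}},
    3: {"labels": {"City"}, "props": {"name": "London"}},
    4: {"labels": {"City"}, "props": {"name": "Paris"}},
}

# Relationships are (start node, type, end node, properties).
relationships = [
    (1, "LIVES_IN", 3, {"since": 2019}),
    (2, "LIVES_IN", 4, {"since": 2021}),
    (1, "KNOWS", 2, {}),
]

# Every node keeps a direct list of its outgoing relationships,
# so following a connection does not require scanning a table.
outgoing = defaultdict(list)
for start, rel_type, end, props in relationships:
    outgoing[start].append((rel_type, end, props))


def residents(city_name):
    """Names of Person nodes with a LIVES_IN relationship to the named City."""
    found = []
    for node_id, node in nodes.items():
        if "Person" not in node["labels"]:
            continue
        for rel_type, end, _props in outgoing[node_id]:
            target = nodes[end]
            if (
                rel_type == "LIVES_IN"
                and "City" in target["labels"]
                and target["props"].get("name") == city_name
            ):
                found.append(node["props"]["name"])
    return sorted(found)


print(residents("London"))
```

The `residents` function plays roughly the role of a Cypher pattern such as `(person:Person)-[:LIVES_IN]->(city:City)`: it filters on labels, relationship type, and a property value.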

Cypher Query Language

Neo4j uses Cypher, a declarative query language specifically designed for working with graph data.

Cypher syntax is visually intuitive, using ASCII art to represent patterns:

MATCH (person:Person)-[:LIVES_IN]->(city:City {name: "London"})
RETURN person.name

neo4j

Create the Neo4j directories.

cd
mkdir -p Neo4j/neo4j_db/{data,logs,import,plugins}

Create a docker-compose.yml file.

cd
cd Neo4j
nano docker-compose.yml

services:
  neo4j:
    container_name: neo4j
    image: neo4j:latest
    ports:
      - 7474:7474
      - 7687:7687
    environment:
      - NEO4J_AUTH=neo4j/${NEO4J_PASSWORD}
      - NEO4J_apoc_export_file_enabled=true
      - NEO4J_apoc_import_file_enabled=true
      - NEO4J_apoc_import_file_use__neo4j__config=true
      - NEO4J_PLUGINS=["apoc", "graph-data-science"]
    volumes:
      - ./neo4j_db/data:/data
      - ./neo4j_db/logs:/logs
      - ./neo4j_db/import:/var/lib/neo4j/import
      - ./neo4j_db/plugins:/plugins

The NEO4J_PLUGINS setting above installs two plugins: APOC (Awesome Procedures on Cypher) and Graph Data Science.

Create a file called docker-compose.override.yml.

services:
  neo4j:
    environment:
      - NEO4J_server_memory_heap_initial__size=6G
      - NEO4J_server_memory_heap_max__size=6G

Docker Compose merges this file into docker-compose.yml automatically. It lets you commit the parameters you would like to change without committing their values, so that memory limits tuned for one machine do not override those of another. On WSL, for example, the VM is assigned 50% of total memory by default, with swap sized at 25% of that allocation.
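The effect of the override file on the environment section can be sketched in Python. This is a simplification of Compose's merge behaviour, not its implementation: entries are merged by variable name, and values from the override file take precedence. The function name `merge_environment` and the 1G base value are hypothetical, added only to show an override taking effect.

```python
def merge_environment(base, override):
    """Merge two lists of KEY=VALUE strings by key; override entries win."""
    merged = {}
    for entry in [*base, *override]:
        key, _, value = entry.partition("=")
        merged[key] = value  # a later entry replaces an earlier one with the same key
    return [f"{key}={value}" for key, value in merged.items()]


# Subset of the base file, plus one hypothetical heap value (not in the file above).
base = [
    "NEO4J_apoc_export_file_enabled=true",
    "NEO4J_server_memory_heap_max__size=1G",  # hypothetical default
]

# Values from docker-compose.override.yml.
override = [
    "NEO4J_server_memory_heap_initial__size=6G",
    "NEO4J_server_memory_heap_max__size=6G",
]

effective = merge_environment(base, override)
for entry in effective:
    print(entry)
```

To see the configuration Compose actually resolves on a given machine, `docker-compose config` prints the merged result.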

Similarly, you can commit a .env.example file and add .env to .gitignore.

cd
cd Neo4j
nano .env

NEO4J_PASSWORD=<your password>

Once everything is in place, start the container:

cd
cd Neo4j
docker-compose up -d

Log into the Neo4j Browser at http://localhost:7474 (the HTTP port mapped above):

Username: neo4j

Password: the value of NEO4J_PASSWORD set in .env

Movie Graph

The database contains information about movies and people (actors). It also contains users' ratings of movies.

To learn the basics of creating a knowledge graph, try the Movie Graph tutorial.

Create the data: the Cypher CREATE clause is used to create data.

Explore the graph by clicking on the various node and relationship perspectives.

Scroll down and run the Cypher query to constrain the dataset to Tom Hanks films and roles.

To improve query performance, add the indexes movie_title and person_name.

Then create two more indexes: person_born and movie_released.
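Why an index helps can be illustrated with a small Python sketch. It is only an analogy and does not reflect how Neo4j implements its indexes: without an index, a lookup by property has to examine every node with the label, while an index maps a property value straight to the matching nodes. The helper names `scan_by_born` and `build_index` are hypothetical, and the three people are a small hand-typed sample.

```python
from collections import defaultdict

# Small hand-typed sample of Person nodes.
people = [
    {"name": "Tom Hanks", "born": 1956},
    {"name": "Meg Ryan", "born": 1961},
    {"name": "Kevin Bacon", "born": 1958},
]


def scan_by_born(nodes, year):
    """No index: check every node, so the cost grows with the number of nodes."""
    return [node["name"] for node in nodes if node["born"] == year]


def build_index(nodes, key):
    """Index: map each property value to the nodes that carry it."""
    index = defaultdict(list)
    for node in nodes:
        index[node[key]].append(node["name"])
    return index


person_born = build_index(people, "born")

print(scan_by_born(people, 1956))
print(person_born[1956])
```

Both lookups return the same result; the indexed one avoids visiting nodes that cannot match, at the cost of maintaining the index on every write.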

Evaluate the size of the database with the following APOC procedure:

CALL apoc.meta.stats YIELD nodeCount, relCount, labels

Run some Cypher queries against the nodes (subjects), constrained to the Movie or Person label.

Turning to relationships (predicates), match the ACTED_IN pattern.

There's an error in the query:

MATCH (p:Person {name: "Tom Hanks"})-[:ACTED_IN]->(movie) RETURN p.movie

The query tries to return movie as a property of the Person node p, when movie is a node in its own right. The corrected query returns the node itself:

MATCH (p:Person {name: "Tom Hanks"})-[:ACTED_IN]->(movie) RETURN movie

Six degrees .. Traversing the Nodes

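Cypher has a built-in function, shortestPath, for finding the shortest path between two nodes. Because each step from one person to the next typically passes through a movie (two relationships), the query halves the path length to get the degree of separation: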

MATCH path=shortestPath( (:Person {name:"Kevin Bacon"})-[*]-(:Person {name:"Meg Ryan"}) ) RETURN path, length(path) / 2 as distance
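The idea behind this query can be sketched in plain Python with a breadth-first search. This is an illustration of the concept, not Neo4j's implementation of shortestPath. The `acted_in` mapping is a small hand-typed subset of ACTED_IN relationships, and the function names are hypothetical. As in the query, the number of hops is halved because each person-to-person step goes through a movie.

```python
from collections import deque

# Hand-typed subset of ACTED_IN relationships, for illustration only.
acted_in = {
    "Kevin Bacon": ["A Few Good Men", "Apollo 13"],
    "Tom Cruise": ["A Few Good Men", "Top Gun"],
    "Meg Ryan": ["Top Gun", "Sleepless in Seattle"],
    "Tom Hanks": ["Apollo 13", "Sleepless in Seattle"],
}


def build_graph(acted_in):
    """Undirected adjacency sets linking people and movies, like -[*]- in Cypher."""
    graph = {}
    for person, movies in acted_in.items():
        for movie in movies:
            graph.setdefault(person, set()).add(movie)
            graph.setdefault(movie, set()).add(person)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        for neighbour in sorted(graph[current]):
            if neighbour not in parents:
                parents[neighbour] = current
                queue.append(neighbour)
    return None


graph = build_graph(acted_in)
path = shortest_path(graph, "Kevin Bacon", "Meg Ryan")
hops = len(path) - 1   # number of relationships, like length(path)
distance = hops // 2   # person -> movie -> person counts as one degree
print(path, distance)
```

Breadth-first search explores nodes in order of increasing distance, so the first time it reaches the goal it has found a path with the fewest hops.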

Run queries that illustrate relationships, using shortestPath, in the dataset.

Recommendations

At the core of knowledge graph recommendations is the concept of graph traversal. When a user interacts with an item (like watching a movie or purchasing a product), the system can navigate through the knowledge graph to find related entities through various relationship pathways. These pathways often reveal non-obvious connections that purely statistical approaches might miss.

For example, in a movie recommendation system, a knowledge graph might connect films not just by genre or director, but through more nuanced relationships like "inspired by," "shares screenplay writer with," or "features similar thematic elements." This allows for recommendations that capture deeper similarities than surface-level attributes.

Run queries that list actors who have worked with Tom Hanks's co-actors. This can then be refined to a list of actors who have worked with both of two chosen actors.
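The two recommendation queries described here amount to a two-hop traversal followed by a set intersection. A minimal Python sketch of that logic, assuming a hypothetical `cast` mapping with invented movie and actor names (none of this is taken from the Movie Graph dataset):

```python
from collections import Counter

# Hypothetical movie -> cast mapping, invented for this sketch.
cast = {
    "Movie A": {"Actor 1", "Actor 2"},
    "Movie B": {"Actor 2", "Actor 3"},
    "Movie C": {"Actor 3", "Actor 4"},
    "Movie D": {"Actor 2", "Actor 4"},
}


def co_actors(cast, actor):
    """Everyone who appeared in at least one movie with the given actor."""
    result = set()
    for actors in cast.values():
        if actor in actors:
            result |= actors - {actor}
    return result


def co_co_actors(cast, actor):
    """Actors who worked with the actor's co-actors but not with the actor.

    The count is the number of shared co-actors, a simple recommendation strength.
    """
    direct = co_actors(cast, actor)
    strength = Counter()
    for co_actor in direct:
        for candidate in co_actors(cast, co_actor) - direct - {actor}:
            strength[candidate] += 1
    return strength


def worked_with_both(cast, first, second):
    """Actors who could introduce `first` to `second`: co-actors of both."""
    return co_actors(cast, first) & co_actors(cast, second)


print(co_co_actors(cast, "Actor 1").most_common())
print(worked_with_both(cast, "Actor 1", "Actor 3"))
```

Ranking candidates by the number of shared co-actors is one simple choice; a real system might weight by other relationships in the graph.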

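To reset the database schema, run a query like the following (the second statement requires APOC). It clears the query caches and drops all existing indexes and constraints; it does not delete nodes or relationships:

CALL db.clearQueryCaches(); CALL apoc.schema.assert({},{},true);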


