Data Science: Getting started with Neo4j and Gephi Tool

Neo4j is a graph database management system. In Neo4j, everything is stored as a node, an edge (called a relationship in Neo4j), or an attribute (called a property). Each node and edge can have any number of attributes. Nodes can carry one or more labels, and every edge has a type; labels and types can be used to narrow searches.
In simple words, Neo4j is the MySQL of graph databases. It provides a graph database management system, a query language called Cypher, and a visual interface, the Neo4j Browser.
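To make the data model described above more concrete, here is a minimal Python sketch of nodes, edges, attributes, and labels. It only illustrates the idea and is not how Neo4j stores data internally; the class names, the `nodes_with_label` helper, and the sample values are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with zero or more labels and arbitrary key/value attributes."""
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed edge between two nodes, with a type and its own attributes."""
    start: Node
    end: Node
    rel_type: str
    properties: dict = field(default_factory=dict)


def nodes_with_label(nodes, label):
    """Narrow a search to the nodes that carry the given label."""
    return [n for n in nodes if label in n.labels]


# Illustrative sample data (values chosen for this sketch)
keanu = Node({"Person"}, {"name": "Keanu Reeves"})
matrix = Node({"Movie"}, {"title": "The Matrix", "released": 1999})
acted_in = Edge(keanu, matrix, "ACTED_IN", {"roles": ["Neo"]})

movies = nodes_with_label([keanu, matrix], "Movie")
print([m.properties["title"] for m in movies])
```

Filtering by label here plays roughly the same role as writing `(m:Movie)` in a Cypher pattern: only nodes carrying that label are considered.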
Let us execute some queries in Neo4j. Neo4j provides a movie dataset, on which we will run our queries.
1. Show movies that were released after the year 2005.
Query:
MATCH (m:Movie) WHERE m.released > 2005 RETURN m

The result displays 8 nodes, i.e. the 8 movies in the database that were released after 2005.
2. Query movies released after 2002 and limit the result to 5 movies.
MATCH (m:Movie) WHERE m.released > 2002 RETURN m LIMIT 5

3. The query below returns up to 5 matches of a person who directed a movie released after 2007: the person node, the DIRECTED relationship, and the movie node. The relationships between the nodes are shown as edges in graphical form.
MATCH (p:Person)-[d:DIRECTED]-(m:Movie) WHERE m.released > 2007 RETURN p, d, m LIMIT 5


The queried data in table form.
4. To list the persons available in the database, use the following query, which returns Person nodes but limits the output to 20 persons.
MATCH (p:Person) RETURN p LIMIT 20

5. The next query gets the list of movie titles and their release years. The data is displayed in tabular form.
MATCH (m:Movie) RETURN m.title, m.released

6. To check whether a movie with a particular title is present, match on the title property. The following query searches for a movie titled A Few Good Men.
MATCH (m:Movie {title: 'A Few Good Men'}) RETURN m

7. We can also list movies whose release year falls within a particular interval. The example below lists movies released between 2010 and 2017, both years included.
MATCH (m:Movie) WHERE m.released >= 2010 AND m.released <= 2017 RETURN m
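The queries above hard-code the years and limits. When running them from a program, it is usually cleaner to pass those values as Cypher parameters (`$name`). Below is a minimal Python sketch of that idea, assuming the official `neo4j` Python driver is installed and a server is reachable; the function names `movies_between_query` and `run_query` are made up for this example, and only the query builder is exercised here.

```python
def movies_between_query(start_year, end_year, limit=None):
    """Return (cypher_text, parameters) for movies released in [start_year, end_year]."""
    query = (
        "MATCH (m:Movie) "
        "WHERE m.released >= $start_year AND m.released <= $end_year "
        "RETURN m.title AS title, m.released AS released"
    )
    params = {"start_year": start_year, "end_year": end_year}
    if limit is not None:
        # LIMIT is only appended when the caller asks for one
        query += " LIMIT $limit"
        params["limit"] = limit
    return query, params


def run_query(uri, user, password, query, params):
    """Run a query with the `neo4j` Python driver (assumed to be installed)."""
    # Imported lazily so the query builder above works without the driver
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            # Each record is converted to a plain dict such as {"title": ..., "released": ...}
            return [record.data() for record in session.run(query, params)]
    finally:
        driver.close()


query, params = movies_between_query(2010, 2017, limit=5)
print(query)
print(params)
```

Against a local installation, a call might look like `run_query("bolt://localhost:7687", "neo4j", "<password>", query, params)`; the URI and credentials are placeholders and depend on your setup.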

Gephi is an open-source network analysis and visualization software package written in Java on the NetBeans platform. It is mainly used for visualizing, manipulating, and exploring networks and graphs from raw edge and node graph data. It is an excellent tool for data analysts and data science enthusiasts to explore and understand graphs.
Import your Dataset
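One common way to get graph data into Gephi is its spreadsheet (CSV) import, which reads a nodes table with `Id` and `Label` columns and an edges table with `Source` and `Target` columns. The Python sketch below writes both tables from a list of (person, movie) pairs. It assumes you have already fetched such pairs, for example from the DIRECTED query shown earlier; the function name `write_gephi_csv` and the sample pairs are illustrative only.

```python
import csv
import io


def write_gephi_csv(pairs, nodes_file, edges_file):
    """Write (person, movie) pairs as a Gephi nodes table and edges table."""
    # Give every distinct name a numeric id, in sorted order for reproducibility
    names = sorted({name for pair in pairs for name in pair})
    ids = {name: i for i, name in enumerate(names)}

    node_writer = csv.writer(nodes_file)
    node_writer.writerow(["Id", "Label"])
    for name in names:
        node_writer.writerow([ids[name], name])

    edge_writer = csv.writer(edges_file)
    edge_writer.writerow(["Source", "Target", "Type"])
    for person, movie in pairs:
        # Each edge points from the person to the movie
        edge_writer.writerow([ids[person], ids[movie], "Directed"])


# Illustrative sample pairs (director, movie)
pairs = [
    ("Rob Reiner", "A Few Good Men"),
    ("Lana Wachowski", "The Matrix"),
]

# In-memory buffers keep the sketch self-contained
nodes_buf, edges_buf = io.StringIO(), io.StringIO()
write_gephi_csv(pairs, nodes_buf, edges_buf)
print(nodes_buf.getvalue())
print(edges_buf.getvalue())
```

To produce real files, pass handles opened with `open("nodes.csv", "w", newline="")` and `open("edges.csv", "w", newline="")` instead of the buffers, then import the two files into Gephi as a nodes table and an edges table.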

Below is how all the nodes and edges are displayed when the data is initially loaded.

Now we can represent the data in various layouts. In the left pane, open the Layout option, choose a layout of your choice, and click Run. In the image below I have chosen the ForceAtlas 2 layout, which displays the data in the following form.

You can change the color of any node and visualize the dataset in a better way.
