Cypher and the outer join
May 15
I have been experimenting, on and off, with Neo4j, a graph database. One way to get data out of the DB is to use Cypher, "a declarative graph query language".
This post is about one Cypher query ...
I want to return the list of people 'known' by a person. In the returned data, however, I also want to indicate who in that list is known by another person as well.
The data looks something like this:
personA -[:KNOWS]-> personX
personA -[:KNOWS]-> personY
personA -[:KNOWS]-> personZ
personB -[:KNOWS]-> personX
personB -[:KNOWS]-> personW
This pseudo-Cypher gives me who personA AND personB know:
MATCH primary-[:KNOWS]->known<-[:KNOWS]-secondary
RETURN known
I'd get 'personX'.
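To make that result concrete, here is a small Python sketch (not Cypher) that models the sample data as a plain dictionary. The names `knows` and `known_by_both` are made up for illustration, and it assumes 'primary' is personA and 'secondary' is personB.

```python
# Sample data from above: each person maps to the set of people they KNOW.
# The dictionary layout is an assumption for illustration, not how Neo4j stores it.
knows = {
    "personA": {"personX", "personY", "personZ"},
    "personB": {"personX", "personW"},
}


def known_by_both(primary, secondary):
    """Inner-join behaviour: keep only people that both nodes know."""
    return knows[primary] & knows[secondary]


print(known_by_both("personA", "personB"))  # {'personX'}
```

personY and personZ drop out because personB has no KNOWS relationship to them, which is exactly what the plain MATCH does.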
But, and it's a big butt, I want to get all the people personA knows, plus some sort of indication that personB also knows personX. In SQL terms that is an outer join, and the docs cover it: http://docs.neo4j.org/chunked/milestone/query-sql-match.html.
So a '?' marks a relationship as optional, which is Cypher's version of an outer join. That means I have to put the '?' on the relationship between the 'secondary' node and the 'known' node. Where there is no such relationship, it comes back as NULL. I use the count() function, which skips nulls and so gives me 0 in that case, since I don't really want to deal with nulls.
This is what I ended up with:
MATCH primary-[:KNOWS]->known<-[r?:KNOWS]-secondary
RETURN known.fullname, count(r) AS incommon
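For the sample data, the rows I would expect back can be sketched in Python (again, not Cypher). This assumes 'primary' is bound to personA and 'secondary' to personB; the `knows` dictionary and the `known_with_incommon` helper are hypothetical names used only for this illustration.

```python
# Sample data from the post: each person maps to the set of people they KNOW.
knows = {
    "personA": {"personX", "personY", "personZ"},
    "personB": {"personX", "personW"},
}


def known_with_incommon(primary, secondary):
    """Outer-join behaviour: every person `primary` knows, plus a 0/1 count."""
    rows = []
    for known in sorted(knows[primary]):
        # The optional relationship is None (null) when `secondary` does not know `known`.
        r = (secondary, known) if known in knows.get(secondary, set()) else None
        # Mimic count(r): nulls are not counted, so a missing relationship gives 0.
        incommon = 0 if r is None else 1
        rows.append((known, incommon))
    return rows


for row in known_with_incommon("personA", "personB"):
    print(row)
# ('personX', 1)
# ('personY', 0)
# ('personZ', 0)
```

Every person personA knows shows up, and the count says whether personB knows them too. personW never appears because personA does not know them.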

