SCC neo4j java homework
I need to compute Strongly Connected Components in a graph. Graph is
pretty big
875 714 nodes 5 105 043 relationships
Because it doesn't fit in RAM I learned about neo4j and tried to use it.
So I've started with DFS implementation
void dfs(GraphDatabaseService g, Node node, long counter) {
Transaction tx = g.beginTx();
node.setProperty("explored", true);
tx.success();
tx.finish();
System.out.println("Exporing node " + node.getProperty("name") + "with
depth " + counter);
Iterator<Relationship> it = node.getRelationships(Direction.OUTGOING,
RelTypes.KNOWS).iterator();
while (it.hasNext()) {
Node end = it.next().getEndNode();
if (!(boolean) end.getProperty("explored"))
dfs(g, end, ++counter);
}
}
But it's getting me stackoverflow exception.
at org.neo4j.kernel.TopLevelTransaction.finish(TopLevelTransaction.java:127)
So... How can I implements SCC algorithm (with double DFS loop) with
neo4j? Why is exception were thrown (except obvious reason, which is I
have not enough memory, because depth of recursion is very big)? I mean,
maybe there is something not right with my code, in the way that I'm using
transactions... I don't know... And maybe there is some other library of
tool for doing this kind of think on graph that big?
PS
I've tried JGraphT and my own implementation with adjacency list and were
getting outOfMemory exceptions when I was creating graph. Source of a
graph is coursera programming question from algorithms course.
The file contains the edges of a directed graph. Vertices are labeled as
positive integers from 1 to 875714. Every row indicates an edge, the
vertex label in first column is the tail and the vertex label in second
column is the head (recall the graph is directed, and the edges are
directed from the first column vertex to the second column vertex). So for
example, the 11th row looks liks : "2 47646". This just means that the
vertex with label 2 has an outgoing edge to the vertex with label 47646
No comments:
Post a Comment