From Spreadsheets to Graph vs Relational for School Use-Case


Richard Conrardy

Apr 12, 2016, 7:04:29 AM
to Neo4j
Hi,
I am a mathematics teacher and I'd like to store my data in a more convenient way than Spreadsheets.
I'm still not sure what database form to use, graph or relational.
While I suppose that these forums are biased, I hope to get some good ideas. I don't have experience in databases or any query language, but I'm willing to invest.

The main part of the database would be the marks per student per exercise. It would scale up to around 5000 students and 2000 exercises (not every student has done every exercise, thus a sparse matrix).
While this seems to be excellent for spreadsheets, it seems subpar for SQL since crosstabs would produce 5000*2000 rows. In Neo4j I would have two kinds of nodes (students and exercises) and link them by a Grade relationship with a numeric property.

The students themselves should contain information like email, but also be linked to classes.
The exercises should be linked to class papers and maybe topics, with properties such as max marks. As far as I've read, I shouldn't store binary files in my DB (a shame).

I really like that Neo4j is easy to understand and intuitive. Both (Neo4j and SQL) seem easy to get information into and out of (via CSV).

I still have some worries about portability. I've got a hosting space with an SQL database (via phpMyAdmin), and Neo4j seems to be made mainly for local use; it's not as easy to install as Joomla (a CMS), for example.

So, what do you think about the situation? Is one sort of database clearly better than the other? Should I lean more towards SQL since it has more documentation (and is more "standard") or is Neo4j better suited for complete beginners?

Thanks in advance
Richard

Michael Hunger

Apr 12, 2016, 7:14:15 AM
to ne...@googlegroups.com
Richard,

I think you understand the graph data model well enough to get started. I recommend taking the online intro course to get up to speed with the query language.

It should be easy to get your data imported into Neo4j with LOAD CSV + MERGE.
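As a rough sketch of what such an import could look like, here is a Cypher statement kept in a Python string so it can be pasted into the Neo4j browser or sent through whatever driver you use. The file name `marks.csv`, its columns (`email`, `exercise`, `marks`), and the labels `Student`, `Exercise` and `GRADE` are assumptions made up for illustration, not something from Richard's data.

```python
# Cypher import statement held as plain text; nothing here talks to a database.
# Assumed (hypothetical) CSV layout: marks.csv with columns email, exercise, marks.
IMPORT_MARKS = """
LOAD CSV WITH HEADERS FROM 'file:///marks.csv' AS row
MERGE (s:Student {email: row.email})
MERGE (e:Exercise {name: row.exercise})
MERGE (s)-[g:GRADE]->(e)
SET g.marks = toFloat(row.marks)
"""

# Print it so it can be copied into the Neo4j browser.
print(IMPORT_MARKS.strip())
```

MERGE only creates a student, exercise or relationship if it is not there yet, so re-running the import on the same file should not duplicate anything. LOAD CSV reads every value as a string, hence the `toFloat`.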

For hosting, there are cloud hosting offerings for Neo4j; your database could even fit on one of their free plans. See: http://neo4j.com/developer/guide-cloud-deployment

I'm not sure how you would connect to the database: from an application, or do you just want to use the plain database for your own or department/school use?

Michael



Aran Mulholland

Apr 12, 2016, 7:59:46 AM
to ne...@googlegroups.com
Hi Richard,

Your assumption that this is not a good data structure for SQL because it is a sparse matrix is flawed. It all depends on how you store your data. SQL databases usually end up caching a lot of their data in memory, and there may be no reason why you would ever need a join that pairs every student with every exercise.

I'm not telling you not to use Neo4j. Neo4j is great, but perhaps you are not getting what you want from SQL because of bad schema design.
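
To make the schema point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (`student`, `exercise`, `mark`) and the sample rows are invented for illustration. The idea is one row per mark actually awarded, so a student/exercise pair with no mark takes up no space at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student  (id INTEGER PRIMARY KEY, email TEXT UNIQUE);
CREATE TABLE exercise (id INTEGER PRIMARY KEY, name TEXT, max_marks REAL);
-- One row per mark actually awarded; missing pairs are simply absent.
CREATE TABLE mark (
    student_id  INTEGER NOT NULL REFERENCES student(id),
    exercise_id INTEGER NOT NULL REFERENCES exercise(id),
    marks       REAL NOT NULL,
    PRIMARY KEY (student_id, exercise_id)
);
""")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "ann@example.org"), (2, "bob@example.org"), (3, "cy@example.org")])
conn.executemany("INSERT INTO exercise VALUES (?, ?, ?)",
                 [(1, "fractions-1", 10), (2, "fractions-2", 10), (3, "algebra-1", 20)])
conn.executemany("INSERT INTO mark VALUES (?, ?, ?)",
                 [(1, 1, 8), (1, 3, 15), (2, 1, 4), (3, 2, 9)])

# Size of the full student x exercise grid versus what is actually stored.
possible_pairs = conn.execute(
    "SELECT (SELECT COUNT(*) FROM student) * (SELECT COUNT(*) FROM exercise)"
).fetchone()[0]
stored_rows = conn.execute("SELECT COUNT(*) FROM mark").fetchone()[0]
print(possible_pairs, stored_rows)
```

With 3 students and 3 exercises the full grid has 9 cells, but only the 4 recorded marks are stored. The same layout at 5000 students and 2000 exercises would hold as many rows as there are marks, not 5000*2000.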

jo...@experoinc.com

Apr 13, 2016, 3:07:02 AM
to Neo4j
Richard, 

It sounds like you've got a good handle on your data. Before you choose an engine, I recommend you figure out how you will query it. What are the questions you are asking of the data? 

If you have particularly graphy questions, such as ones where you focus on the network of students & exercises (measures of centrality or influencers) or on a variable number of hops, then by all means go with a graph db; Neo4j is a great engine.

However, if your questions are fairly simple (which students struggled with exercise 37, how many more exercises does Johnny have to do?) then I don't think that the type of db really matters. More important may be which language you are more comfortable using (SQL vs Cypher). Or you may want to use it as an excuse to learn a new technology, like graph.
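
For a feel of how the two simple questions above might look on the SQL side, here is a small sketch with Python's built-in sqlite3 module. The schema, the sample data, and the definition of "struggled" (less than half of the maximum marks) are all assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE exercise (id INTEGER PRIMARY KEY, max_marks REAL);
CREATE TABLE mark (student_id INTEGER, exercise_id INTEGER, marks REAL,
                   PRIMARY KEY (student_id, exercise_id));
INSERT INTO student  VALUES (1, 'Johnny'), (2, 'Maria'), (3, 'Li');
INSERT INTO exercise VALUES (36, 10), (37, 10), (38, 20);
INSERT INTO mark     VALUES (1, 37, 3), (2, 37, 9), (3, 37, 4), (2, 36, 7);
""")

# Who struggled with exercise 37? "Struggled" is assumed to mean < 50% of max marks.
struggled = [row[0] for row in conn.execute("""
    SELECT s.name
    FROM mark m
    JOIN student s  ON s.id = m.student_id
    JOIN exercise e ON e.id = m.exercise_id
    WHERE m.exercise_id = 37 AND m.marks < 0.5 * e.max_marks
    ORDER BY s.name
""")]

# How many exercises has Johnny not done yet (no mark recorded for him)?
remaining = conn.execute("""
    SELECT COUNT(*)
    FROM exercise e
    WHERE NOT EXISTS (
        SELECT 1 FROM mark m
        JOIN student s ON s.id = m.student_id
        WHERE s.name = 'Johnny' AND m.exercise_id = e.id
    )
""").fetchone()[0]

print(struggled, remaining)
```

Both questions come down to a filter and a count over the marks, which is the kind of query either a relational or a graph engine should handle comfortably.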

I find that figuring out how the data is to be queried is more critical when picking an engine than the data itself. Also, knowing the questions you need to answer helps in designing your logical & physical data model.

Cheers, 
Josh  

Michael Hunger

Apr 13, 2016, 3:07:57 AM
to ne...@googlegroups.com
Great answer, Josh!
