How to access cohort metadata from forum dump (Mongo)


Stian Håklev

Jun 15, 2016, 3:25:47 PM
to General Open edX discussion
I'm currently analyzing a past MOOC that we ran on edX.org. In this MOOC we had manually specified cohorts (we uploaded CSV files with student IDs), and a number of forum discussions (linked to lecture content) were segmented by cohort. We are now looking at comparing the participation and discussion characteristics of the different cohorts.

I have been looking at the data in the forum Mongo dump, which has DiscussionThreads and Comments, but I cannot find any metadata about which cohort a given comment belongs to. I guess I could extract all the comments, look up which cohort each student was in, and segment manually, but it seems strange to me that this data wouldn't be captured somewhere. Surely the system is not going through a long thread and looking up the cohort status of every single member who has posted there to determine what to show every time a user pulls up a thread?

I even tried browsing through the code, and there seemed to be something about cohorts and group_id. However, there is no group_id in the Mongo data, and in any case I got quite lost in all the code.

Thank you for any insights!
Stian

Andy Armstrong

Jun 15, 2016, 4:19:25 PM
to edx-...@googlegroups.com
Hi,

The cohorts are Django models that live in the LMS because they are used in a variety of ways beyond just the forums. You can find this data in the LMS SQL database in the course_groups_courseusergroup table. The Django model class is called CourseUserGroup which lives here in the code:


In the Mongo model, the group_id field holds the primary key of the matching row in this table.
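As a rough sketch of how that lookup could be done offline, the snippet below maps group_id values from a forum dump to cohort names. It rests on several assumptions that are not confirmed in this thread: that the forum dump is one JSON document per line with a `_type` field, that the course_groups_courseusergroup table is available as a tab-separated export with `id` and `name` columns, and that group_id is stored as a plain integer. The function names are made up for illustration.

```python
import csv
import io
import json
from collections import Counter


def load_cohort_names(sql_export_text):
    """Map cohort primary key -> cohort name.

    Assumes a tab-separated export of course_groups_courseusergroup
    with a header row containing at least `id` and `name` (assumption).
    """
    reader = csv.DictReader(io.StringIO(sql_export_text), delimiter="\t")
    return {int(row["id"]): row["name"] for row in reader}


def threads_per_cohort(forum_dump_lines, cohort_names):
    """Count threads per cohort name.

    Assumes one JSON document per line and that threads carry
    `_type == "CommentThread"` (assumption about the dump layout).
    """
    counts = Counter()
    for line in forum_dump_lines:
        line = line.strip()
        if not line:
            continue
        doc = json.loads(line)
        if doc.get("_type") != "CommentThread":
            continue
        group_id = doc.get("group_id")
        if group_id is None:
            # Threads that are not cohorted have no group_id.
            label = "(not cohorted)"
        else:
            label = cohort_names.get(
                int(group_id), "(unknown group_id %s)" % group_id
            )
        counts[label] += 1
    return counts
```

To use it, read the export file into a string for `load_cohort_names` and pass the open dump file (or any iterable of lines) to `threads_per_cohort`.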

I hope this helps. Let me know if you need more details.

Thanks,

 - Andy

--
You received this message because you are subscribed to the Google Groups "General Open edX discussion" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/edx-code/0ab38fbc-4532-4062-b04b-ecd1296f518e%40googlegroups.com.



--

Andy Armstrong

edX | UI Architect  | an...@edx.org  

141 Portland Street, 9th floor

Cambridge, MA 02139

http://www.edx.org


Stian Håklev

Jun 17, 2016, 11:35:08 AM
to edx-...@googlegroups.com
Hi Andy,

Thank you very much for your quick response! The problem was that I was looking at the initial forum posts/comments, which I guess were from threads that were not marked as cohorted and thus had no group_id.

My second quest is to connect these group_ids with the actual cohort names. I'm not so familiar with Django; what does this model actually correspond to in terms of an SQL export? I tried grepping for group_id among the SQL exports and the XML documents, and could not find anything.
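On the Django side of that question: by default, Django names a model's table after the app label and the lowercased class name, joined by an underscore. That is consistent with the CourseUserGroup model mapping to the course_groups_courseusergroup table mentioned earlier in the thread. A small sketch of the convention follows; the helper name is made up for illustration, and a model can override the default through `Meta.db_table`, so treat the result as a guess to verify against the actual export.

```python
def default_table_name(app_label, model_name):
    """Return Django's default table name for a model.

    Django's default is "<app_label>_<lowercased model class name>".
    A model may override this with Meta.db_table, which this helper
    cannot see.
    """
    return "%s_%s" % (app_label, model_name.lower())


# The app label "course_groups" is inferred from the table name given
# in the reply above.
print(default_table_name("course_groups", "CourseUserGroup"))
```

Since the primary key column of a Django model is named `id` by default, the value stored as group_id in Mongo would be looked up against that column rather than a column literally called group_id.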

Thanks!
Stian

--
http://reganmian.net/blog -- Random Stuff that Matters