edX analytics pipeline and dashboard are now open source


Victor Shnayder

Sep 19, 2014, 12:53:06 PM
to edx-...@googlegroups.com, openedx-...@googlegroups.com
Hi everyone,

We're happy to announce that we just open sourced our edX analytics pipeline, including our brand new course analytics dashboard! The course analytics dashboard is debuting on edX.org in the next few weeks.

The analytics pipeline consists of several repos, the primary ones being:
  • edx-analytics-pipeline, which computes various aggregates and reports from the event logs and database data.
  • edx-analytics-data-api, which exposes the results via a REST API. 
  • edx-analytics-dashboard, which is a web app that uses this API to show course teams what's happening in their courses.
As is always the case, there is a long list of things that aren't at the level we wish they were, ranging from documentation to installation scripts to cleaner interfaces, but we decided to open source now anyway. If you try this out, we would love pull requests which focus on improving the docs or the installation process.

As usual, please contact us before making large code changes that you want merged in, so that we can coordinate on an approach. As we work hard on finishing up our projects this quarter, please bear with us if we are not able to provide a quick turnaround on our first open source pull requests.

Cheers,
-The edX Analytics Team

Ali Hasan

Oct 22, 2014, 7:07:09 PM
to edx-...@googlegroups.com, openedx-...@googlegroups.com
I managed to run the data-api and dashboard, but I can't find a way to set up the pipeline. Any pointers or a starting point would be great.

Thanks.

Brian Wilson

Oct 23, 2014, 2:40:47 PM
to edx-...@googlegroups.com, openedx-...@googlegroups.com
Hi Ali,

Thanks for your interest! We know we don't have very much documentation for standing up the pipeline. However, Jason at Stanford has made a great initial pass at writing up how an outside group might do so, based on his experience. You can find his documentation on the GitHub wiki (https://github.com/edx/edx-analytics-pipeline/wiki); start with https://github.com/edx/edx-analytics-pipeline/wiki/How-Stanford-Online-runs-the-analytics-stack. I'm sure there will be many more questions, and we're happy to help, but this would be a good start.

= Brian =

Ali Hasan

Oct 25, 2014, 9:35:02 AM
to edx-...@googlegroups.com, openedx-...@googlegroups.com
Awesome, I am going to try setting it up next week, and if all goes well I'll try to compile some docs about the installation process.

Thanks, Brian

- Ali
Software engineer, Edraak.org

Cristóbal Acosta

Oct 28, 2014, 3:35:48 PM
to edx-...@googlegroups.com, openedx-...@googlegroups.com
Are there instructions for installing these tools on a fresh Ubuntu server? For a small-scale setup, something like the Vagrant instance with the fullstack. I see there is a Vagrant instance with insights (https://github.com/edx/datajam), but it is not for production.

Sorry for my bad English...

Brian Wilson

Oct 28, 2014, 6:08:22 PM
to openedx-...@googlegroups.com, edx-...@googlegroups.com
Hi Cristóbal,

Unfortunately we don't yet have any support for vagrant instances.  The "insights" that you found in the "datajam" repository is from an experimental prototype that is not under active development here.  (It was built for a "datajam" we hosted here almost a year ago.)

It should be possible to set up local development of the edx-analytics-data-api and edx-analytics-dashboard repositories.  The edx-analytics-pipeline repository is not as amenable to local development, due to the reliance on Hadoop and Hive.  We currently use Amazon's EMR service to provide those for us, as does Stanford.  But it would be wonderful if someone developed the scripts for installing and running these locally!

= Brian =   

Cristóbal Acosta

Oct 28, 2014, 8:35:36 PM
to edx-...@googlegroups.com, openedx-...@googlegroups.com
Thanks for the answer. I will try to do something and post if I succeed...

Sarina Canelake

Oct 30, 2014, 10:43:13 AM
to edx-code, openedx-...@googlegroups.com
Cristóbal,

Brian is correct - the datajam repository is very old and unmaintained. We just released our first named release, Aspen: https://groups.google.com/forum/#!topic/edx-code/sH6jUbEyl2o - you can try using this as a more up-to-date Vagrant image.

Tushar Sharma

Jan 7, 2015, 11:48:32 AM
to edx-...@googlegroups.com, openedx-...@googlegroups.com
1) Can anyone tell me where I have to clone edx insight, dashboard, data-api and pipeline (e.g. /edx/app/edxapp/edxinsight/)?
2) Do I have to install them in the edxapp virtualenv?
3) How do I use the edX LMS tracking log in the dashboard?
4) On the analytics dashboard, clicking the login button gives this error:



Page not found (404)

Request Method: GET
Request URL: http://xxxxxxxxx/login/edx-oidc/None/authorize/?nonce=kCZ96tUZYGlqnESQPDY9sgdD5lKbVtiUYmkzmdxFAcgVufg88TVFXiwk1IMwSRHV&state=YOGr30eHJ0XSLHdy46RAeW2tKQ13ZYm7&redirect_uri=http://xxxxxxxxx/complete/edx-oidc/&response_type=code&client_id=None&scope=openid+profile+email+course_staff

Using the URLconf defined in analytics_dashboard.urls, Django tried these URL patterns, in this order:

  1. ^$ [name='home']
  2. ^jsi18n/$
  3. ^status/$ [name='status']
  4. ^health/$ [name='health']
  5. ^courses/
  6. ^admin/
  7. ^login/(?P<backend>[^/]+)/$ [name='begin']
  8. ^complete/(?P<backend>[^/]+)/$ [name='complete']
  9. ^disconnect/(?P<backend>[^/]+)/$ [name='disconnect']
  10. ^disconnect/(?P<backend>[^/]+)/(?P<association_id>[^/]+)/$ [name='disconnect_individual']
  11. ^accounts/login/$ [name='login']
  12. ^accounts/logout/$ [name='logout']
  13. ^accounts/logout_then_login/$ [name='logout_then_login']
  14. ^test/auto_auth/$ [name='auto_auth']
  15. ^auth/error/$ [name='auth_error']
  16. ^__debug__/
  17. ^403/$
  18. ^404/$
  19. ^500/$

The current URL, login/edx-oidc/None/authorize/, didn't match any of these.
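
A possible reading of this 404, offered as a guess rather than a confirmed diagnosis: both the `None` path segment and `client_id=None` look like Python's `None` rendered into a string, which would happen if the OIDC provider URL root and client ID were never set in the dashboard's settings. The sketch below only reproduces that mechanism. `build_authorize_url` is a hypothetical stand-in, not the dashboard's actual code, and `dashboard.example` is a placeholder host; the regex is pattern 7 copied from the page above.

```python
import re
from urllib.parse import urljoin

# Pattern 7 from the 404 page, copied as-is.
LOGIN_PATTERN = re.compile(r"^login/(?P<backend>[^/]+)/$")


def build_authorize_url(url_root):
    """Hypothetical stand-in for how an auth backend might build its
    authorize URL from a configured provider root."""
    return "{}/authorize/".format(url_root)


# Assumption: the provider root setting was never set, so it is None.
location = build_authorize_url(None)

# A redirect to a relative location is resolved against the page that
# issued it, here the dashboard's own login URL (placeholder host).
resolved = urljoin("http://dashboard.example/login/edx-oidc/", location)
path = resolved.split("dashboard.example/", 1)[1]

print(location)  # None/authorize/
print(path)      # login/edx-oidc/None/authorize/

# The login URL itself matches pattern 7, but the resolved path does not,
# because [^/]+ cannot span the extra "None/authorize" segments.
print(LOGIN_PATTERN.match("login/edx-oidc/") is not None)  # True
print(LOGIN_PATTERN.match(path) is not None)               # False
```

If that reading is right, the thing to check would be whether the dashboard's OIDC settings (provider URL root, client ID and secret) are filled in and point at the LMS.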

