These are just a few notes for prospective or new CS graduate students who are considering working with me on their MS research.
I'm happy to work with students on a Plan A (thesis) or Plan B (project). The difference between them is a matter of depth. Plan A assumes 2 years (4 semesters) of fairly concentrated and intensive work, leading to a Master's thesis. This is a formal document describing your research that is usually between 70 and 200 pages long. A Plan B project expects at least 120 hours of work, and should result in a short conference or workshop paper, plus software that supports whatever is in the paper and that has been tested, documented, and released. Plan B also requires you to take 1 or 2 more classes than Plan A.
How to decide between Plan A and Plan B? I think at some level it's a matter of personal choice. In both cases our goal will be to write and publish at least one short paper. Plan A extends that by going deeper and also writing a thesis.
We will work together to identify a suitable research topic, but in general it will be in the area of Natural Language Processing and/or Computational Linguistics. You can get a very good idea of the kind of research we do by looking at previous MS students' work and my publications in general:
http://www.d.umn.edu/~tpederse/masters.html
http://www.d.umn.edu/~tpederse/pubs.html
In addition, a variety of Plan A and Plan B work has centered around the annual SemEval tasks. I'd encourage you to look over the tasks from some recent years to get an idea of other problems that are potentially good sources for Plan A and Plan B work.
https://en.wikipedia.org/wiki/SemEval
If you are working with me on a Plan A or Plan B, I will expect you to take CS 5242, Introduction to Natural Language Processing. Ideally you should take this in the Fall semester of your first year. This will help you get oriented to NLP, and will in the end save you a lot of time as you get up to speed preparing to do your research. This class will count toward the course hours required by the MS degree. I will also, from time to time, be offering a second semester Advanced NLP class (CS 5642). If this is being offered I will expect students working with me on Plan A or Plan B to take this as well.
If you've read this far and feel at least somewhat interested, you should go ahead and join the Duluth NLP mailing list. This is a place where we make announcements and point out issues in the news that might be of interest. Please sign up here:
https://groups.google.com/forum/#!forum/duluthnlp
For both the Plan A and Plan B options, I would like to start working with you in your first semester. Plan A will take four semesters of hard work to finish. For Plan B we can time things so that you finish well before your graduation date (especially if we start in the first year). I have found that finishing a Plan A thesis by May of your second year is challenging, and really does require concentrated effort over a four-semester period. By concentrated effort I mean spending about as much time on your thesis during a semester as you would on any of the classes you are taking.
Your research work will most likely include programming. Our default choices are Python and Linux. Any code that is used to produce results that appear in a paper or thesis *must* be released as open-source. The motivations behind this policy are described in a short piece that appeared in Computational Linguistics in 2008:
http://www.d.umn.edu/~tpederse/Pubs/pedersen-last-word-2008.pdf
This philosophy is central to much of what we do, so please make sure you go over the above very carefully.
All of your writing for papers or your thesis must be original. While I will review and comment on what you do, in the end the writing is yours. It's very important that the writing be done in a proper academic or scholarly fashion, which means among other things that there must be no plagiarism in the work. Please make sure you read and understand the following, and always ask questions if you are ever in doubt about what is appropriate in your writing:
http://www.d.umn.edu/~tpederse/Docs/A-Plagiarism-Case-Study.pdf
Also, please read the following, which includes some general thoughts and tips on writing about research:
http://www.d.umn.edu/~tpederse/Docs/The-Art-of-WAR.pdf
Finally, it's important to say I really enjoy working with students on research in NLP. I tend to spend a lot of time working with students and try to develop a genuine collaboration that is both productive and fun. Please don't be intimidated by all of the above. If you have read this far, please get in touch and we can discuss further.
Good luck,
Ted