![](https://lh4.googleusercontent.com/-y4dZhh66hPY/Uu5dTnKfy3I/AAAAAAAACEQ/Fr28zh-j684/s320/copy_all.png)
This is a poor man's MOSS. It can detect most kinds of cheating done in assignments. It has some success in detecting copying in PDF files, provided that the Python library `pdfminer` is able to convert the PDF to a text file.
PyPI: https://pypi.python.org/pypi/code-sniffer
GitHub: www.github.com/dilawar/sniffer
You need to download the zip file from Moodle and unzip it in a directory. Below that directory, each student has their own directory. Student directories can be nested. For example, if the path of my assignment is /path/to/A and I have three students X, Y, and Z, then the layout is:
/path/to/A
|- X
| |- file1.vhd
| |- test
| |- testbench.vhd
|- Y
| |- file1.vhd
|- Z
| |- file1.vhd
|- testbench.vhd
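A layout like the one above can be walked with a few lines of Python. This is only a minimal sketch of the idea, not code from the tool: the function name `collect_submissions` is hypothetical, and it assumes that every direct subdirectory of the assignment path belongs to exactly one student.

```python
from pathlib import Path


def collect_submissions(root):
    """Map each student directory name to the sorted files found below it.

    Hypothetical helper: assumes each direct subdirectory of `root`
    is one student's submission, possibly with nested directories.
    """
    root = Path(root)
    submissions = {}
    for student_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # rglob("*") descends into nested directories such as X/test/
        files = sorted(
            f.relative_to(student_dir).as_posix()
            for f in student_dir.rglob("*")
            if f.is_file()
        )
        submissions[student_dir.name] = files
    return submissions
```

Calling `collect_submissions("/path/to/A")` would return a dictionary keyed by X, Y and Z, with file paths relative to each student's directory, so files at the same relative path can be compared across students.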
My experience is that copying in coding assignments runs as high as 30-45%, and once caught, if you cry enough in front of the instructor, they let you go. (Who knows, the student might be thinking of committing suicide; it is not good to be too strict. That was the reason I was told.) Moreover, unlike many other universities, they don't like to make a big deal out of it, no matter how loudly they speak against it in public. Perhaps they have breathed the same air that their students are breathing now.
I developed this tool during my TAship for the VLSI Lab course. Over time, some empirical parameters have been tuned, and the algorithm gives good enough results. A Cython fork is under development to speed up the matching. Its default configuration file is `~/.config/sniffer/config`. If you put the config file somewhere else, you can use the `--config config-file` option with this tool. In the end, it generates text files for various levels of match severity and a Graphviz file that provides an overview of how much cheating has taken place in the class. One example, from Assignment 04 of the VLSI Lab class in 2011, is shown in the figure. Each node is a student. Each edge is a copy case. The thicker the edge, the larger the copy.
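To give a feel for the graph output, here is a rough sketch of how pairwise match scores could be turned into a Graphviz file with thicker edges for larger copies. It is not the tool's own code: the function `matches_to_dot`, the score range of 0 to 1, and the `max_width` scaling are all assumptions made for illustration.

```python
def matches_to_dot(matches, max_width=8.0):
    """Render {(student_a, student_b): score} as an undirected DOT graph.

    Hypothetical helper: scores are assumed to lie in [0, 1], and the
    edge thickness (penwidth) is scaled linearly with the score.
    """
    lines = ["graph copies {"]
    # One node per student that appears in at least one match
    students = sorted({s for pair in matches for s in pair})
    for s in students:
        lines.append(f'    "{s}";')
    # One edge per copy case; thicker edge means a larger copy
    for (a, b), score in sorted(matches.items()):
        width = max(1.0, score * max_width)
        lines.append(
            f'    "{a}" -- "{b}" [penwidth={width:.1f}, label="{score:.2f}"];'
        )
    lines.append("}")
    return "\n".join(lines)
```

If the returned text is saved as, say, `copies.dot`, it can be rendered with the standard Graphviz command `dot -Tpng copies.dot -o copies.png`.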
The application does not do anything terribly smart, for most of the copying is not terribly smart. It uses the fact that programming languages impose a structure, and copying without preserving that structure is not possible. So far, this application has not reported any false positives, although it is capable of under-reporting, which can be checked manually by looking at the `severity-moderate.csv` file.
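The idea that structure survives copying can be illustrated with a small sketch. This is not the algorithm the tool uses. It is a simplified stand-in that strips VHDL line comments, replaces every name with a placeholder, and compares the remaining token streams with `difflib.SequenceMatcher` from the standard library. The keyword list is a small, incomplete subset chosen for the example, and `structure_tokens` and `similarity` are hypothetical names.

```python
import re
from difflib import SequenceMatcher

# Small, incomplete subset of VHDL keywords, for illustration only.
KEYWORDS = {
    "entity", "is", "end", "port", "in", "out", "architecture",
    "of", "begin", "signal", "process", "if", "then", "else",
}


def structure_tokens(source):
    """Reduce source text to tokens that ignore names, comments and layout."""
    tokens = []
    for line in source.splitlines():
        # Drop VHDL line comments (naive: ignores "--" inside string literals)
        line = line.split("--", 1)[0]
        for tok in re.findall(r"[A-Za-z_]\w*|\d+|<=|:=|=>|\S", line):
            if re.match(r"[A-Za-z_]", tok):
                low = tok.lower()
                # Keep keywords, replace every other name with a placeholder
                tokens.append(low if low in KEYWORDS else "ID")
            elif tok.isdigit():
                tokens.append("NUM")
            else:
                tokens.append(tok)
    return tokens


def similarity(a, b):
    """Return a ratio in [0, 1]; 1.0 means identical structure."""
    matcher = SequenceMatcher(
        None, structure_tokens(a), structure_tokens(b), autojunk=False
    )
    return matcher.ratio()
```

Under these assumptions, renaming signals, changing whitespace, or adding comments leaves the score untouched, because none of those edits change the token structure. That is the kind of not-terribly-smart copying described above.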
See the github page for more detail on how to use the application and for reporting bugs.
- Dilawar