This post should be helpful for teaching assistants like me who are trying to run plagiarism detection on the assignments they receive.
We will use MOSS (Measure of Software Similarity), thanks to Professor Aiken.
I have developed a simple script that prepares the data for MOSS. MOSS reads only one level of directories, so my script moves all the source files up into each student's top-level folder.
Prerequisites: a Linux system with Perl installed.
1. Register for a new account through the MOSS home page. An email will be sent to you containing a script; each copy of the script carries its own user id.
2. Copy the script from your email and paste it into "moss.pl".
3. Here is my preparation script:
#!/bin/bash
# Usage: ./prepare.sh <assignment-folder> "<ext1|ext2|...>"
echo "Starting the script"
echo "[IMP] Is that the correct folder '$1'? (y/n) "
read ans
if [ "$ans" != "y" ]; then
    exit
fi

# Move source code to each student's root directory
IFS=$'\n'
for student in "$1"/*; do
    if [ -d "$student" ]; then
        echo "Processing student: $student"
        for files in "$student"/*; do
            if [ -d "$files" ]; then
                # Count the files in this subfolder whose extension matches $2
                count=$(find "$files"/ -type f -regextype posix-extended -regex ".*\.($2)" | wc -l)
                if [ "$count" -ne 0 ]; then
                    # Move the matching files up to the student's folder
                    find "$files"/ -type f -regextype posix-extended -regex ".*\.($2)" -print0 | xargs -0 mv -t "$student"/
                    if [ $? -eq 0 ]; then
                        echo "Removing ... $files"
                        rm -rf "$files"
                    else
                        echo "run the script one more time"
                    fi
                fi
            fi
        done
        echo ""
    fi
done
4. Copy the script and paste it to "prepare.sh"
5. Give both scripts execute permissions
sudo chmod ug+x moss.pl
sudo chmod ug+x prepare.sh
6. Run the "prepare.sh" script
./prepare.sh Minesweeper "c|h"
Where "Minesweeper" is the folder that contains a set of subfolders, one per student.
Where "c|h" lists the file extensions, separated by "|", of the files you want to move up to each student's top-level folder.
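To see what the extension argument selects before touching real submissions, here is a minimal sketch that builds a throwaway folder tree and runs the same kind of "find" pattern the preparation script uses. The folder and file names (student1, src, main.c, board.h, notes.txt) are made up for illustration, and it assumes GNU find, which is what provides "-regextype".

```shell
#!/bin/bash
# Build a throwaway tree that mimics one student's submission.
# All names below are hypothetical; nothing here touches real data.
demo=$(mktemp -d)
mkdir -p "$demo/Minesweeper/student1/src/util"
touch "$demo/Minesweeper/student1/src/main.c" \
      "$demo/Minesweeper/student1/src/util/board.h" \
      "$demo/Minesweeper/student1/src/notes.txt"

ext="c|h"   # same form as the second argument to prepare.sh

# List the files the pattern selects, at any depth below the student folder.
# The regex must match the whole path, so only paths ending in .c or .h qualify.
matched=$(find "$demo/Minesweeper/student1" -type f \
    -regextype posix-extended -regex ".*\.($ext)" | sort)
echo "$matched"
```

Only main.c and board.h are listed; notes.txt is left out because its extension is not in the list.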
7. Run "moss.pl" script
perl moss.pl -l c -m 20 -d Minesweeper/*/*.c Minesweeper/*/*.h
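Where "-l" sets the language of the submissions (c, java, ...).
Where "-m 20" sets the maximum number of submissions a passage may appear in before MOSS ignores it; code shared by more than 20 students is most likely instructor-provided or boilerplate, so it is not counted as cheating.
Where "-d" tells MOSS that submissions are organized by directory, so all the files in one student's folder are treated as a single submission. The files to be checked follow it.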
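Since the file patterns in the command above only look one level deep, it can be worth checking that every student folder really has source files at its top level before submitting. The following is a rough sketch of such a check, not part of MOSS itself; the function name check_students is my own, and the .c/.h extensions are assumed to match the example above.

```shell
#!/bin/bash
# Report how many .c/.h files sit directly inside each student folder.
# $1 is the assignment folder, e.g. Minesweeper.
check_students() {
    local root=$1 student n
    for student in "$root"/*/; do
        [ -d "$student" ] || continue
        # -maxdepth 1 mirrors the Minesweeper/*/*.c pattern: only top-level files count.
        n=$(find "$student" -maxdepth 1 -type f \( -name '*.c' -o -name '*.h' \) | wc -l)
        if [ "$n" -eq 0 ]; then
            echo "WARNING: no .c/.h files in $student"
        else
            echo "$n file(s) in $student"
        fi
    done
}

# Example: check_students Minesweeper
```

A folder that triggers the warning probably still has its sources nested in a subfolder, in which case running prepare.sh again is the first thing to try.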
8. Wait until it finishes; a link will then be printed in the terminal, like this:
http://moss.stanford.edu/results/xxxxxxxx
The above link contains the results of the plagiarism detection.
9. You can also save the results for offline use:
wget -r -np http://moss.stanford.edu/results/xxxxxxxx
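If you run this for many assignments, steps 8 and 9 can be chained by pulling the results link out of the saved terminal output. Below is a small sketch under the assumption that the link appears somewhere in that output in the http://moss.stanford.edu/results/... form shown above; the function name extract_results_url and the log file name moss.log are made up for illustration.

```shell
#!/bin/bash
# Print the last MOSS results URL found on standard input.
extract_results_url() {
    grep -Eo 'https?://moss\.stanford\.edu/results/[^[:space:]]+' | tail -n 1
}

# Example (assumed usage):
#   perl moss.pl -l c -m 20 -d Minesweeper/*/*.c Minesweeper/*/*.h | tee moss.log
#   url=$(extract_results_url < moss.log)
#   wget -r -np "$url"
```

Taking the last match is a deliberate choice: if the log contains several runs, the most recent link is the one you usually want.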
I hope this helps you :)
Ahmed Hamdy, M.Sc.
Teaching Assistant at Computer and Systems Engineering Dept.
Faculty of Engineering, Alexandria University, Egypt.