Re: Computerised authorship attribution



John Burrows' "Delta" technique is simple, easy to understand, and
works very well. The same goes for Naive Bayes. Support Vector Machines
also work well; if your assignment allows it, you could adapt the
LIBSVM code available for download from National Taiwan University. In
my experience, simple nearest-neighbour approaches do not work well,
although "Delta" is itself a nearest-neighbour approach, so that is not
a blanket rule.

CUSUM is contentious, to say the least. In any case it is a technique
better suited to literary experts, since it requires you to edit the
language of the original documents. That is the cause of much of the
contention: it is possible to keep editing the original text until it
gives the result you expect. See the papers by David Holmes criticising
CUSUM, or the book by Farringdon if you want to see the case for it and
a lot of detail about it.

More generally, see papers in the journals "Literary and Linguistic
Computing", "Computers and the Humanities", and similar journals.
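
To give an idea of how little code "Delta" needs, here is a rough
sketch in Python. It follows the usual description of the method: take
the most frequent words, turn each text's relative frequencies into
z-scores, and attribute the disputed text to the author whose z-scores
differ least on average. The function names and the n_words default are
my own choices for illustration, and the means and standard deviations
are computed from the candidate texts only, whereas a larger reference
corpus would normally be used. Treat it as a starting point rather than
a faithful reimplementation.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    # Lowercase word tokens; a deliberately crude tokenizer.
    return re.findall(r"[a-z']+", text.lower())


def relative_freqs(tokens, vocab):
    # Relative frequency of each vocabulary word in one text.
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[w] / total for w in vocab]


def delta_attribution(known, disputed, n_words=30):
    """known maps author -> sample text; disputed is the unknown text.

    Returns (best_author, {author: delta}); lower delta means closer.
    """
    tokenized = {a: tokenize(t) for a, t in known.items()}

    # Vocabulary: the n most frequent words over all known texts.
    corpus_counts = Counter()
    for toks in tokenized.values():
        corpus_counts.update(toks)
    vocab = [w for w, _ in corpus_counts.most_common(n_words)]

    profiles = {a: relative_freqs(t, vocab) for a, t in tokenized.items()}

    # Per-word mean and standard deviation across the known texts.
    cols = [[p[i] for p in profiles.values()] for i in range(len(vocab))]
    means = [mean(c) for c in cols]
    stds = [pstdev(c) for c in cols]

    # Words with no variation carry no information and cannot be z-scored.
    keep = [i for i, s in enumerate(stds) if s > 1e-12]
    if not keep:
        raise ValueError("no word varies between the known texts")

    def zscores(profile):
        return [(profile[i] - means[i]) / stds[i] for i in keep]

    z_disputed = zscores(relative_freqs(tokenize(disputed), vocab))

    # Delta: mean absolute difference between z-score profiles.
    scores = {
        a: mean(abs(x - y) for x, y in zip(zscores(p), z_disputed))
        for a, p in profiles.items()
    }
    return min(scores, key=scores.get), scores
```

You would call it as delta_attribution({"Author A": text_a, "Author B":
text_b}, disputed_text) and pick the author with the lowest score. With
real data you want long samples and more than two candidates, otherwise
the z-scores mean very little.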

Cheers,

Ross-c
