Henry Cowles on "How to find traces of a text (e.g. article) in a more recent...
I want to do two different sorts of text comparison, though I suspect the answer for one will apply to the other. Ideally, I'll find a web-based or freeware solution to one or both of the following: 1....
Arno Bosse on "How to find traces of a text (e.g. article) in a more recent...
You may find this article on sequence alignment in large text corpora by the folks at ARTFL (http://artfl-project.uchicago.edu/) useful...
Jim Ridolfo on "How to find traces of a text (e.g. article) in a more recent...
Hmm. For 1a) you should be able to use the Unix/Linux/OS X terminal tool diff to compare the files. For 1b) you might be able to import the texts into SQL and then do something more complex with...
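For anyone without a Unix terminal handy, the diff suggestion for 1a) can be approximated in Python with the standard-library difflib module. This is only a minimal sketch: the function name line_diff and the two sample strings are made up for illustration, and it compares line by line, the same way diff does by default.

```python
import difflib


def line_diff(old_text: str, new_text: str) -> list[str]:
    """Return a unified diff of two texts, compared line by line.

    Hypothetical helper for illustration; not part of any tool named in the thread.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="older",
            tofile="newer",
            lineterm="",  # the lines carry no trailing newline, so add none
        )
    )


# Made-up sample texts standing in for two versions of an article.
older = "The quick brown fox\njumps over the lazy dog."
newer = "The quick brown fox\nleaps over the lazy dog."

for line in line_diff(older, newer):
    print(line)
```

Lines starting with "-" appear only in the older text and lines starting with "+" only in the newer one. A line-based diff like this works best when the two files are versions of the same document; it will be much less helpful for finding short borrowed passages inside an otherwise unrelated text.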
elotroalex on "How to find traces of a text (e.g. article) in a more recent...
This is the subject of a lot of my current research. The suggestions above are good places to start. I would add the Juxta tool to the mix. Like Ridolfo's, this is a DIFF solution. With texts where the...
Arno Bosse on "How to find traces of a text (e.g. article) in a more recent...
Browsing the Mathematica online help today, I found that it offers several functions for aligning & comparing string sequences. A web search for #dh projects using sequence alignment brought me...
Patrick Murray-John on "How to find traces of a text (e.g. article) in a more...
Have you pushed the texts through whatever plagiarism detection apps are available to you (if any)? Seems like those algorithms are close to the ones you are looking for, and so might make a good start.
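If no plagiarism detection app is available, one common building block of that kind of matching is overlapping word n-grams ("shingles"). The sketch below is an assumption about what such a comparison might look like, not a description of how any particular app works; the function names, the choice of n = 3, and the sample sentences are all hypothetical.

```python
import re


def shingles(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of word n-grams in a text, lowercased, punctuation ignored."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_shingles(source: str, later: str, n: int = 3) -> set[tuple[str, ...]]:
    """Word n-grams that occur in both texts: candidate traces of reuse."""
    return shingles(source, n) & shingles(later, n)


def jaccard(source: str, later: str, n: int = 3) -> float:
    """Jaccard similarity of the two shingle sets (0.0 if either set is empty)."""
    a, b = shingles(source, n), shingles(later, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up sample texts: an "original" and a later text that quotes part of it.
source = "It was the best of times, it was the worst of times."
later = "He wrote that it was the best of times indeed."

for gram in sorted(shared_shingles(source, later)):
    print(" ".join(gram))
print(round(jaccard(source, later), 3))
```

The shared n-grams point to where the later text may be reusing the earlier one, and the Jaccard score gives a rough overall measure of overlap. A larger n cuts down on chance matches of common phrases but will miss reuse that has been lightly reworded.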
lmullen on "How to find traces of a text (e.g. article) in a more recent one...
I've been working on a very similar problem with a colleague who wants to identify text reuse among nineteenth-century codes of civil procedure. Here are two sample analyses with code in R, one that...
Patrick Murray-John on "How to find traces of a text (e.g. article) in a more...
Hmm... sorry if this is a repost; my first attempt failed. If you have access to a plagiarism-detection app, have you tried seeing what that might return? It seems like the same or similar algorithms might be appropriate.