Over the last five years, IT Group has seen a significant increase in the number of instructions relating to intellectual property theft.
Increasingly we are deploying techniques over and above the basic test of whether, in our opinion, two pieces of code are substantially the same, whether one is derived from the other, or whether they contain shareware available in the public domain.
Those techniques include the analysis of “white space”, the investigation of the potential theft process (email, FTP, USB, Dropbox, P2P etc.), the analysis of circumvention techniques and forensic dating – which came first?
This short article discusses these techniques and how they apply to the available legislation – the Copyright, Designs and Patents Act 1988 (“the Act”) and the amendments arising from the EU Copyright Directive.
While the ultimate test is the application of one or more measures offered by the Act, the uncovering of the tracks to show that the original work was stolen, removed, or copied often provides very persuasive supporting evidence.
“White Space” Analysis
White space is the term used to describe the regions of a document portrayed on a computer screen as empty. These include margins, tabs, spaces and everything after a carriage return. All have the same look – nothing there, hence the expression white space. Most human software creators, in common with anyone who types a document, will exhibit some degree of inconsistency when typing.
Modern accepted typing standards require a single space after a full stop. Previous standards usually called for two spaces. There is no requirement for a space (or two) before a carriage return, but coders and typists will often inadvertently include one – crucially, not consistently. The effect is to give a piece of typed script (a software program perhaps) a hidden fingerprint.
To the casual observer, two pieces of typed script may look identical.
Using simple techniques, such as turning on “display paragraph marks” in Word [Ctrl + *], it is easy to compare two seemingly identical pieces of typed script at a different level.
This technique comes into its own when software has been disguised – re-written after copying to look different but actually to do the same thing and based on the same code. The pseudo-random white space pattern can often quickly establish strong evidence of direct copying, undermining any claim that the code was a complete rewrite using the software developer’s “tool-kit”.
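The comparison described above can also be automated. The following is a minimal sketch only, not the tooling used in practice: it assumes the disguised copy keeps the same line structure as the original, and the function names and the two sample snippets are invented for illustration. It records the spaces and tabs left before each line break and reports the lines where both files carry exactly the same stray white space.

```python
def trailing_whitespace(text):
    """Map 1-based line numbers to the spaces/tabs left before the line break."""
    marks = {}
    for number, line in enumerate(text.splitlines(), start=1):
        tail = line[len(line.rstrip(" \t")):]
        if tail:  # only lines that actually carry stray white space
            marks[number] = tail
    return marks


def shared_marks(text_a, text_b):
    """Line numbers where both texts end in the same stray white space."""
    marks_a = trailing_whitespace(text_a)
    marks_b = trailing_whitespace(text_b)
    return sorted(n for n in marks_a if marks_b.get(n) == marks_a[n])


# Hypothetical samples: identifiers were renamed, white space was not.
original = "int total = 0; \nfor (i = 0; i < n; i++)\n    total += v[i];\t\nreturn total;\n"
suspect = "int sum = 0; \nfor (k = 0; k < n; k++)\n    sum += data[k];\t\nreturn sum;\n"

print(shared_marks(original, suspect))  # [1, 3]
```

A handful of matching lines proves little on its own; the evidential weight comes from many matches spread irregularly through a large file, which is hard to explain by independent authorship.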
Forensic IT Analysis
At IT Group we have developed tools and techniques that are able to analyse millions of emails very rapidly looking for patterns and identifying when attachments were first included and to whom they were subsequently forwarded.
These are combined with other simple forensic techniques such as timeline analysis (software is often emailed out of a company outside the core business day) and content analysis: emails with stolen software attached rarely have any substantial body text. One case was solved by searching for “smiley face” emoticons.
The offending email with the software attached simply said “Here you go ☺”.
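The two checks described above, time of sending and thinness of content, lend themselves to a simple first-pass filter. The sketch below is illustrative only: the message records, the core-hours window (09:00 to 17:30, Monday to Friday) and the 40-character body threshold are all assumptions chosen for the example, not values taken from any real case or tool.

```python
from datetime import datetime, time

CORE_START = time(9, 0)    # assumed start of the core business day
CORE_END = time(17, 30)    # assumed end of the core business day
MAX_BODY_CHARS = 40        # assumed cut-off for "almost no content"


def is_out_of_hours(sent):
    """True for weekends or times outside the assumed core business day."""
    if sent.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (CORE_START <= sent.time() <= CORE_END)


def flag_suspicious(messages):
    """Messages with attachments, a very short body, and sent out of hours."""
    flagged = []
    for msg in messages:
        if not msg["attachments"]:
            continue
        short_body = len(msg["body"].strip()) <= MAX_BODY_CHARS
        if short_body and is_out_of_hours(msg["sent"]):
            flagged.append(msg)
    return flagged


# Hypothetical mailbox extract, invented for illustration.
messages = [
    {"sent": datetime(2024, 3, 5, 10, 15),
     "body": "Please find attached the minutes of this morning's project meeting.",
     "attachments": ["minutes.docx"]},
    {"sent": datetime(2024, 3, 5, 22, 41),
     "body": "Here you go :)",
     "attachments": ["source.zip"]},
    {"sent": datetime(2024, 3, 6, 23, 5),
     "body": "Running late tomorrow.",
     "attachments": []},
]

for msg in flag_suspicious(messages):
    print(msg["sent"], msg["attachments"])
```

A filter like this only narrows the field for human review; a flagged message is a lead to follow up, not evidence of theft in itself.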
Tools exist that establish when a USB stick was connected to a PC or laptop; timeline analysis of file activity can then quickly align the copying of files with the use of the USB stick.
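Once the connection times and the file activity have been extracted, lining them up is straightforward. The sketch below assumes both have already been recovered by a forensic tool and reduced to plain timestamps; the function name, the data layout and every sample value are hypothetical.

```python
from datetime import datetime


def files_touched_while_connected(usb_sessions, file_events):
    """Return file events whose timestamp falls inside any USB session.

    usb_sessions: list of (connected, removed) datetime pairs
    file_events:  list of (timestamp, path) pairs
    """
    hits = []
    for when, path in file_events:
        for connected, removed in usb_sessions:
            if connected <= when <= removed:
                hits.append((when, path))
                break  # one matching session is enough
    return hits


# Hypothetical data, invented for illustration.
usb_sessions = [
    (datetime(2024, 3, 5, 19, 2), datetime(2024, 3, 5, 19, 20)),
]
file_events = [
    (datetime(2024, 3, 5, 14, 30), "C:/projects/engine/build.log"),
    (datetime(2024, 3, 5, 19, 5), "C:/projects/engine/src/core.c"),
    (datetime(2024, 3, 5, 19, 6), "C:/projects/engine/src/licence.c"),
]

for when, path in files_touched_while_connected(usb_sessions, file_events):
    print(when, path)
```

In this sample the two source files accessed a few minutes after the stick was connected are returned, while the afternoon build log is not.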
Many larger corporations routinely include copy protection measures in their software. This is particularly common (and necessary) in sectors such as video games. Products frequently find their way on to the market that purport to offer enhancements to games or to offer the means to use a gaming console for other purposes but often are really marketed to provide circumvention methods enabling people to copy games illegally.
Analysis of the functions of such products can be challenging and often requires a detailed knowledge of the underlying software and electronics which typically is not available from the manufacturer and therefore has to be derived by reverse engineering.
Which piece of software came first is a very powerful piece of evidence if it can be reliably established. In a real-life case in which I was instructed some time ago, a final-year student was threatened with being discharged from his university because anti-plagiarism software had detected copying. He claimed that it was his work that had been copied, even though he had handed it in after the other student, who he said had asked to see his work because that student was struggling with the assignment.
A full forensic analysis of the student’s laptop recovered several deleted drafts of his work from the preceding weeks, proving he had created the original. The other student was discharged from the university.