WO2003064688A2

WO2003064688A2 - Method of generation and management of unique sequences in DNA production

Info

Publication number: WO2003064688A2
Application number: PCT/GB2003/000369
Authority: WO
Inventors: Michael Cleary
Original assignee: Smartwater Limited
Priority date: 2002-01-29
Filing date: 2003-01-29
Publication date: 2003-08-07
Also published as: HK1054572A1; AU2003207001A1; GB0301971D0; HK1054572B; WO2003064688A3; GB2385853A; GB0201966D0; GB2385853B

Abstract

A method of modeling and synthesising nucleic acid chains (DNA) to store a unique identifying marker. Each identifying marker being assigned to a particular origin. The DNA produced by the method can be used in a security tagging product. Such product possibly having a second level of security provided by a key code held within a specific formulation of trace materials. The security product being such that it can be generally applied to the surfaces of a property so as to provide a unique identifying marker which is particular to the origin or rightful owner of said property.

Description

METHOD OF GENERATION AND MANAGEMENT OF UNIQUE SEQUENCES IN DNA PRODUCTION

Field of the Invention

The present invention concerns improvements in or relating to synthetic DNA production and particularly in relation to security or tracking devices.

Background of the Invention

In the current climate of mass production and global commercialisation, there is a need for methods to uniquely mark or tag products so that they can be traced or linked back to their origin. The facility of traceability is of importance and can be used by companies in product tracking. Product tracking can be particularly useful in addressing the problem of grey-market imports. Although product tagging has a number of potential applications, one of the most important is arguably in crime reduction.

Crimes such as theft and burglary are common social problems. The loss of personal property as a result of crime can be a distressing, not to mention expensive, experience. Generally speaking, it is the responsibility of the Police to investigate such crimes and hopefully recover any stolen property.

The recovery of stolen property is only half the problem; the identification of the rightful owners of such property can be an equally difficult task.

There are a variety of ways of marking items of property so as to uniquely identify them and thus, in the event they are stolen and subsequently recovered, they stand a better chance of being returned to their rightful owners.

Synthetic DNA has been used as a tracking and security device, as discussed in International PCT applications PCT/GB91/00719 and PCT/GB93/01822. However the coding used to describe the base sequence does not lend itself to easy computer handling. The sequences are represented usually by a series of letters and digits separated by commas. Although these do provide all the necessary information to fully describe the sequence, this type of representation does not lend itself to automated production of sequences or easy management of those sequences produced. Synthetic DNA has to date been exclusively used as a tracking/security product due to the unique nature of the sequences and the small amount of material required to perform an identification.

Summary of the Invention

The present invention discloses a way of modeling and producing synthetic nucleic acid chains (DNA) so as to contain a unique identifying marker that can be assigned to a unique origin. In one embodiment, the present invention provides a method of generating and managing unique sequences of synthetic nucleic acid, comprising: applying a secure interpretation system to a known unique decimal number; and synthesising a nucleic acid chain based on the sequence provided by the interpretation of said decimal number.

In another embodiment, the present invention provides a method of tracing and/or identifying goods comprising: modeling and synthesising at least one nucleic acid chain with a base sequence contained therein; applying a secure interpretation system to obtain a unique identifying marker from the base sequence; establishing a database in which the unique identifying marker is assigned to a unique source; and determining to which items the synthesised nucleic acid chain has been applied and identifying the base sequence therein and obtaining the unique identifying marker from said sequence so as to determine the unique source from the database.

Preferably, an indicator is also applied to any items to which the nucleic acid chain is applied, thus facilitating identification of the tagged items. The invention further provides for a security composition for tracing or identifying goods, comprising an indicator material and at least one nucleic acid chain, which has been synthesised to store a unique identifying marker. Preferably the above composition also comprises a solvent system for the indicator material, said solvent system containing a solvent which is volatile under conditions of application.

Preferably, the present invention involves the use of a multilevel security product. At least one additional level of security is provided by the composition further comprising a plurality of separately identifiable trace materials that can be varied in such a manner as to produce unique formulations, the combination of trace materials being varied by modeling each composition on a binary string to produce a unique code. Preferably, the unique chemical code may provide the information required to determine the primers necessary to breed the nucleic acid to a level suitable for analysis of the unique identifying marker stored therein. The primer specification can be obtained via the mathematical processing of the unique code. Such a security product provides a concealed extra level that would not be apparent to any would-be counterfeiter. Furthermore, without knowledge of the mathematical process involved, the chemical code from the first level cannot be converted into the information required to identify the primers needed to access the unique identifying code held within the nucleic acid sequence.

Alternatively, the unique chemical code may indicate the start location and/or size of a sequence of bases within the nucleic acid chain, such sequence providing the unique identifying marker.

Preferably the indicator, which shows where the composition has been applied, is covert. A suitable covert indicator could be visible under ultraviolet light only, but alternative types will be appreciated by the skilled man.

It may be of further advantage if the composition is adapted for aerosol spraying.

The multilevel security product in accordance with the present invention can suitably be utilised in connection with the compositions disclosed in our UK Patent Nos. 2286044 and 2319337.

Detailed Description of the Invention

The synthesis of modeled sequences of nucleic acid can be achieved by methods and procedures currently known in the technology field of nucleic acid synthesis, such as Polymerase Chain Reaction (PCR). The present invention utilises this technology to provide nucleic acid chains with specific base sequences which have been modeled according to the secure interpretation systems discussed below, such base sequences being modeled to provide unique identifying markers following their interpretation using a secure interpretation system. The present invention provides a number of these secure interpretation systems that utilise the unique codes held within the modeled base sequences to store an identifying marker. Such markers are in turn assigned to a unique source i.e. the owner or maker of the item to which the tag has been applied. As discussed above, the present invention provides mechanisms for representing the base sequence within a nucleic acid chain (i.e. DNA) with a simple numerical code. Each sequence can therefore be represented by a numerical code that is unique to that sequence.

Under a first preferred system each of the four main bases: Adenine (A), Cytosine (C), Guanine (G) and Thymine (T), are assigned values of 1, 2, 3 and 4 respectively. This value may be represented via a three-digit binary string. By replacing the bases in a particular sequence with their binary equivalents the resulting strings can then be combined to form a single composite string. Such a composite string may be used as a unique identifying marker in itself. However, it is more preferable that the decimal equivalent of the string is used to express the unique numerical value of that sequence.

Example 1

The following theoretical examples show how this would work with different sequences:

Sequence: T A A A A A T G A C
Binary: 100 001 001 001 001 001 100 011 001 010
Composite string: 100001001001001001100011001010
Decimal code = 556046538

Sequence: T T T T T T T T T T
Composite string: 100100100100100100100100100100
Decimal code = 613566756
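By way of illustration only, the first preferred system might be sketched in Python as follows. The function names are hypothetical and form no part of the disclosure; the sketch simply assumes the assignment A=1, C=2, G=3, T=4 with each value written as a three-digit binary string. The sequence length is passed to the inverse function because a leading base such as A (001) contributes leading zeros that a bare decimal number does not preserve.

```python
# Illustrative sketch of the first preferred system (names are hypothetical).
# Assumed mapping, as stated in the description: A=1, C=2, G=3, T=4.
BASE_VALUES = {"A": 1, "C": 2, "G": 3, "T": 4}
VALUE_BASES = {value: base for base, value in BASE_VALUES.items()}


def sequence_to_composite(sequence: str) -> str:
    """Replace each base with its three-digit binary string and join them."""
    return "".join(format(BASE_VALUES[base], "03b") for base in sequence.upper())


def sequence_to_decimal(sequence: str) -> int:
    """Decimal equivalent of the composite binary string."""
    return int(sequence_to_composite(sequence), 2)


def decimal_to_sequence(code: int, length: int) -> str:
    """Recover a base sequence of the given length from its decimal code."""
    composite = format(code, "0{}b".format(3 * length))
    if len(composite) != 3 * length:
        raise ValueError("code is too large for the stated sequence length")
    triplets = [composite[i:i + 3] for i in range(0, len(composite), 3)]
    try:
        return "".join(VALUE_BASES[int(triplet, 2)] for triplet in triplets)
    except KeyError:
        # Triplets 000, 101, 110 and 111 do not correspond to any base.
        raise ValueError("code does not decode to a valid base sequence")


print(sequence_to_composite("TAAAAATGAC"))
print(sequence_to_decimal("TAAAAATGAC"))
print(sequence_to_decimal("TTTTTTTTTT"))
```

Under this sketch not every decimal number decodes to a valid sequence, since only four of the eight possible three-digit strings are assigned to a base.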

This coding can be used as a model to produce unique nucleic acid strands in an automated and computer controlled manner. It provides a mathematical block on the duplication of nucleic acid strands and is more easily managed than the accepted alphabetic labeling of the base sequences of the oligonucleotide. Such a system can be applied in a single level security product in which the unique base sequence in its entirety encodes the unique identifying marker.

Alternatively, a higher level of security may be created by using a two level marker system, wherein the first level of information is provided by a unique chemical formulation of separately identifiable trace materials, being represented by its own unique code and serving as the first level of information within the product. The second level being contained in the nucleic acid.

The nucleic acid strands can then be manufactured based on a mathematical relationship between them and the first level device. The mathematical relationship between the two, for security purposes, can be varied and be part of the information stored with the first level unique code. The information stored in the first level unique code may be used to indicate the appropriate primer necessary to synthesize an effective amount of the nucleic acid chain, thus enabling the analysis of the unique identifying marker stored therein. It is appreciated that the first level unique code can be used to store other information relevant to the interpretation of the unique identifying marker stored within the nucleic acid chain.
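The first level code is described only as a formulation of trace materials modeled on a binary string. One possible reading, offered here as an assumption rather than as the disclosed method, is that the presence or absence of each trace material in a fixed ordered list contributes one binary digit. The material names, their number and their ordering below are hypothetical placeholders.

```python
# Illustrative sketch only. The trace material names, their number and
# their order are hypothetical; the description does not specify them.
TRACE_MATERIALS = [
    "tracer_1", "tracer_2", "tracer_3", "tracer_4",
    "tracer_5", "tracer_6", "tracer_7", "tracer_8",
]


def formulation_code(present):
    """Return (binary string, decimal code) for a set of materials present."""
    present = set(present)
    unknown = present - set(TRACE_MATERIALS)
    if unknown:
        raise ValueError("unknown trace materials: {}".format(sorted(unknown)))
    # Assumption: one bit per material, 1 = present, 0 = absent.
    bits = "".join("1" if m in present else "0" for m in TRACE_MATERIALS)
    return bits, int(bits, 2)


def formulation_from_code(code):
    """List the materials whose bit is set in the decimal code."""
    if not 0 <= code < 2 ** len(TRACE_MATERIALS):
        raise ValueError("code out of range for this list of materials")
    bits = format(code, "0{}b".format(len(TRACE_MATERIALS)))
    return [m for m, bit in zip(TRACE_MATERIALS, bits) if bit == "1"]


bits, code = formulation_code({"tracer_1", "tracer_3", "tracer_8"})
print(bits, code)
print(formulation_from_code(code))
```

With n separately identifiable materials this reading gives 2^n possible formulations. The mathematical relationship between this code and the primers or coding region is not specified in the description and is not modeled here.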

It is also appreciated that the first preferred system of the present invention is more suitable for relatively short oligonucleotides, e.g. less than 20 bases.

An alternative approach for a larger sequence would be to use just one base to carry the code. In a second preferred system, the positions occupied by a particular base within the coding section could be used to provide the code, using a binary approach. Therefore, within this system, the presence or absence of the chosen base can be represented by a 1 or 0 respectively. Any other bases can be used to make up the sequence and these would simply add a 0 to the string.

Furthermore, it will be appreciated that the information stored in the first level unique code may be used to indicate the chosen base for the interpretation of the code.

Either of the above preferred systems could be applied to a nucleic acid chain wherein all of the bases therein contribute to the unique identifying marker.

Alternatively, the above systems could be applied to a specific region of bases on a nucleic acid strand. In such cases, the information stored in the first level unique code may be used to indicate the start/end location and/or size of a sequence of bases within the nucleic acid chain, such sequence providing the unique identifying marker.

The start and end location points of the base sequence may alternatively be marked by a specific base sequence (usually four bases long).

Such an alternative may be appropriate in compositions of the present invention that have only one level of security, i.e. compositions that contain only the nucleic acid strand.

Example 2

The start of the coding sequence could be given by the four part sequence AGCT, which sequence will only appear again at the end of the coding section. This also indicates that the coding will be obtained from the position of base A within the sequence.

AAAACCAAAC AGCT AAACCCGGTGC AGCT GCTTTTTAAAA
          start{           }end

The coding sequence reads AAACCCGGTGC which produces the binary code 11100000000. To conform with normal binary code usage this should be reversed and read right to left i.e. 00000000111 or decimal value 7. Alternatively the coding sequence within the code area could be assembled to run right to left to match this form of usage.

Alternatively the sequence:

AAACCTTTGGAAGCTTTTTGGAAATGTTGGAAAAAAAAAAAGCTTTGGGGGAAAA

Code 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1, read right to left as the binary code 1111111111000000111000000, which in decimal equals 33522112.

A thirty base coding sequence used in this manner would provide a basis for the generation and management of over 1 billion unique sequences (2^30 = 1,073,741,824).
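The reading of Example 2 might be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, the marker AGCT and the chosen base A are taken from the example, and the coding section is taken to run from the first occurrence of the marker to its next occurrence. The two sequences are assembled from their parts so that the marker and coding section are easy to see.

```python
# Illustrative sketch of the second preferred system (names are hypothetical).

def extract_coding_region(sequence: str, marker: str = "AGCT") -> str:
    """Return the bases between the first marker and the next marker."""
    start = sequence.find(marker)
    if start == -1:
        raise ValueError("start marker not found")
    begin = start + len(marker)
    end = sequence.find(marker, begin)
    if end == -1:
        raise ValueError("end marker not found")
    return sequence[begin:end]


def positional_code(coding: str, base: str = "A") -> int:
    """1 where the chosen base is present, 0 otherwise, read right to left."""
    if not coding:
        raise ValueError("empty coding section")
    bits = "".join("1" if b == base else "0" for b in coding)
    return int(bits[::-1], 2)


# First sequence of Example 2: coding section AAACCCGGTGC.
first = "AAAACCAAAC" + "AGCT" + "AAACCCGGTGC" + "AGCT" + "GCTTTTTAAAA"
# Second sequence of Example 2: 25-base coding section.
second = ("AAACCTTTGGA" + "AGCT" + "TTTTGG" + "AAA" + "TGTTGG" + "A" * 10
          + "AGCT" + "TTGGGGGAAAA")

print(positional_code(extract_coding_region(first)))   # 7
print(positional_code(extract_coding_region(second)))  # 33522112
```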

It is appreciated that the present invention provides a mechanism for representing a nucleic acid (DNA) sequence as a simple decimal number, which could be proved mathematically (using a secure interpretation system) to correspond to a specific sequence.

For example, where x is a decimal number, x = ACGTACGT

Simple software would then produce the number x+1 :-

hence x+1 = ACGTACTG.

The software would then move on to x+2 and calculate that :-

x+2 = ACGTATCG etc...

Obviously, processor speeds would allow the production of thousands of such representations in seconds, which could then be used as a basis for synthesising the specific nucleic acid chain sequences by PCR.

The above disclosed secure interpretation systems will permit automation of the synthesis of nucleic acid chains that encode a unique identifying marker.

A computer programmed with the appropriate secure interpretation system could apply a simple process to produce a large number of base sequences corresponding to a known sequence of unique decimal numbers.

Such a facility would make the use of (DNA) nucleic acid chains in the storage of unique identifying markers a much more economical option.
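As a hedged illustration of such automation, the following Python sketch produces marked base sequences for a run of consecutive decimal numbers under the second preferred system and reads each one back. All names are hypothetical. The choice of C as the filler base for the 0 positions is an assumption made here so that the marker AGCT cannot arise inside the coding section; the description leaves the choice of filler bases open.

```python
# Illustrative sketch only (names and filler choice are assumptions).
MARKER = "AGCT"


def number_to_coding(number: int, length: int, base: str = "A",
                     filler: str = "C") -> str:
    """Place the chosen base at each 1 position, least significant bit first."""
    if not 0 <= number < 2 ** length:
        raise ValueError("number does not fit in the coding section")
    bits = format(number, "0{}b".format(length))
    # The coding section is read right to left, so the bits are reversed.
    return "".join(base if bit == "1" else filler for bit in reversed(bits))


def tagged_sequence(number: int, length: int) -> str:
    """Coding section enclosed by the start and end markers."""
    return MARKER + number_to_coding(number, length) + MARKER


def read_back(sequence: str, base: str = "A") -> int:
    """Recover the decimal number from a sequence made by tagged_sequence."""
    begin = sequence.find(MARKER) + len(MARKER)
    end = sequence.find(MARKER, begin)
    bits = "".join("1" if b == base else "0" for b in sequence[begin:end])
    return int(bits[::-1], 2)


def batch(first: int, count: int, length: int) -> list:
    """Sequences for the consecutive numbers first, first + 1, ..."""
    return [tagged_sequence(n, length) for n in range(first, first + count)]


print(tagged_sequence(7, 11))
for offset, seq in enumerate(batch(1000, 3, 30)):
    print(1000 + offset, seq, read_back(seq))
```

Because each number maps to exactly one coding section of fixed length, distinct numbers give distinct sequences, which is the property relied upon for managing the sequences by their decimal codes.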

Claims

1. A method of generating and managing unique sequences of synthetic nucleic acid, comprising: applying a secure interpretation system to a selected number; and synthesising a nucleic acid chain based on the sequence provided by the interpretation of said selected number.

2. A method according to claim 1, wherein the selected number is computer generated.

3. A method according to claim 1, wherein the selected number is taken from a known database list.

4. A method according to any one of claims 1, 2 or 3, wherein the secure interpretation system involves assigning the values of 1, 2, 3 and 4 to the bases Adenine, Cytosine, Guanine and Thymine respectively.

5. A method according to claim 4, wherein the sequence of assigned values corresponds to the selected number.

6. A method according to claim 4, wherein each of the assigned values is expressed in a three digit binary code, and the sequence of such assigned values corresponds to the selected number.

7. A method according to any one of claims 1, 2 or 3, wherein the secure interpretation system involves assigning the presence of a particular base at any one point in the sequence a "1" result and its absence a "0" result, thus producing a binary numerical code which represents the positions of a specific base in the sequence and which also corresponds to the selected number.

8. A method according to any of the preceding claims, wherein the entire synthesised nucleic acid chain is interpreted using the secure interpretation system.

9. A method according to any one of claims 1 to 7, wherein only a predetermined section of the synthesised nucleic acid chain is interpreted using the secure interpretation system.

10. A method according to claim 9, wherein a specific order of bases in the nucleic acid chain identifies the start location and/or end location of the predetermined section of the synthesised nucleic acid chain.

11. A composition for tracing or identifying goods comprising an indicator material and at least one nucleic acid chain, which has been synthesised to store a unique identifying marker.

12. A composition according to claim 11, wherein a higher level of security is provided by using a two level marker system, whereby the first level of information is provided by a unique chemical formulation of separately identifiable trace materials, being represented by its own unique code and serving as the first level of information within the composition; and the second level being contained in the nucleic acid.

13. A composition according to claim 12, wherein the unique code held by the plurality of trace materials corresponds to the appropriate primer necessary to synthesize an effective amount of the nucleic acid chain, thus enabling the analysis of the unique identifying marker stored therein.

14. A composition according to claim 12 or 13, wherein the unique code held by the plurality of trace materials indicates the start location and size of a sequence of bases within the nucleic acid chain, such sequence providing the unique identifying marker.

15. A composition according to any one of claims 11 to 14, wherein the indicator is covert.

16. A composition according to any one of claims 11 to 15, further comprising a solvent system for the indicator material, said solvent system containing a solvent which is volatile under conditions of application.

17. A composition according to any one of claims 11 to 16, being adapted for aerosol spraying.

18. A system for tracing and/or identifying goods comprising the method claimed in any of claims 1 to 10 and further comprising: applying the synthetic nucleic acid chain to an item; establishing a database in which the unique identifying marker is assigned to a unique source; and determining to which items the synthesised nucleic acid chain has been applied and identifying the base sequence therein and obtaining the unique identifying marker from said sequence so as to determine the unique source from the database.

19. The system of claim 18, wherein a two level security system is provided by the presence of a unique chemical formulation of separately identifiable trace materials, each formulation being varied by modeling on a binary string so as to produce a unique code, and whereby the first level of security is provided by the chemical formulation and the second level is provided by the nucleic acid chain.

20. The system of claim 18 or 19, wherein the unique code provides information which determines how the nucleic acid chain is interpreted.

21. The system of claim 20, wherein a unique code held by the plurality of trace materials determines the base to which the binary code corresponds.

22. A system according to any one of claims 18 to 21, wherein an indicator is also applied to any items to which the nucleic acid chain is applied, thus facilitating identification of the tagged items.