Laboratory-based surveillance in the molecular era: the TYPENED model, a joint data-sharing platform for clinical and public health laboratories

Assuming that clinical and public health laboratories may be able to use the same data for their own purposes when sequence-based testing and typing are used, we explored ways to develop a collaborative approach and a jointly owned database (TYPENED) in the Netherlands.

One problem in the use of laboratory-based surveillance systems is that they require information that typically is collected at the clinical level and therefore is not focused on surveillance.

Multiplex real-time PCR and sequence-based detection and typing techniques may be used for clinical diagnosis, to guide treatment (by, for example, resistance profiling, strain characterisation and typing), and for hospital infection control and quality management (for cluster detection). Because of these competing priorities, we anticipate increasing resistance from clinical laboratories to data requests for surveillance purposes.

Assuming that clinical and public health laboratories may be able to use the same data for their own purposes when sequence-based testing and typing are used, we explored ways to develop a collaborative approach and a jointly owned database in the Netherlands. Within this initiative, called TYPENED (TYPeer netwerk NEDerland [Typing network Netherlands]), two pilots were started in 2009: one for bacterial typing and one for viruses. These laboratories have all been long-term suppliers of surveillance information, sending isolates or clinical specimens, as well as clinical information, to RIVM for a number of viruses such as influenza A virus, norovirus, enterovirus, rotavirus and hepatitis A, B and E viruses.

Selection of pilot pathogens

An inventory was made of the currently used typing methods in the six clinical laboratories and the public health laboratory participating in VIRO-TYPENED using a structured online questionnaire. The options provided were: (i) the exchange of protocols, control reagents and quality-control panels; (ii) a centralised reference data collection; (iii) a common database; and (iv) no collaboration considered necessary.

Molecular platform database

In order to achieve efficiency and continuity, a generic database infrastructure for sharing of molecular typing data and metadata was developed at RIVM between 2008 and 2011.

The database can be configured for a specific pathogen, at the request of a laboratory network, which also appoints a coordinator or curator. A minimal dataset can be defined by the network, based on the questions addressed, coupled with a feasibility assessment. This dataset minimally comprises time and place, but can be complemented with additional epidemiological or clinical metadata specific to the targeted organism. Besides online data entry forms, the platform provides a bulk upload option using Microsoft Excel and FASTA formats.
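As a rough illustration of what a bulk upload has to check, the sketch below joins sequences from a FASTA file to their metadata rows and rejects entries that lack the minimal time and place fields. It is not the TYPENED implementation: the record layout, the field names (`sampling_date`, `place`) and the function names are assumptions made for this example, and the metadata is passed in as a plain dictionary rather than read from an Excel sheet.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TypingRecord:
    """Hypothetical record: one sequence plus the minimal metadata."""
    sample_id: str
    sampling_date: date  # the "time" element of the minimal dataset
    place: str           # the "place" element of the minimal dataset
    sequence: str


def parse_fasta(text):
    """Parse FASTA text into a dict mapping sequence id to sequence."""
    chunks = {}
    current_id = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("FASTA header without an identifier")
            current_id = header[0]  # id is the first token of the header
            if current_id in chunks:
                raise ValueError(f"duplicate sequence id: {current_id}")
            chunks[current_id] = []
        elif current_id is None:
            raise ValueError("sequence data before the first FASTA header")
        else:
            chunks[current_id].append(line.upper())
    return {seq_id: "".join(parts) for seq_id, parts in chunks.items()}


def build_records(fasta_text, metadata):
    """Join sequences to metadata; reject entries lacking time or place."""
    accepted, rejected = [], []
    for sample_id, sequence in parse_fasta(fasta_text).items():
        row = metadata.get(sample_id, {})
        if row.get("sampling_date") and row.get("place"):
            accepted.append(TypingRecord(
                sample_id=sample_id,
                sampling_date=date.fromisoformat(row["sampling_date"]),
                place=row["place"],
                sequence=sequence,
            ))
        else:
            rejected.append(sample_id)
    return accepted, rejected
```

Returning the rejected identifiers, rather than raising on the first incomplete entry, lets a submitting laboratory correct all missing metadata in one pass.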

The added value of a database like this – compared with the database of GenBank [9], in which laboratories all over the world share their sequences – is threefold. Firstly, the data are more comparable because of the agreed typing region and the standardised typing results and secondly, the data are shared before laboratories have decided to make them publicly available, for example, through GenBank. The third important advantage is the linked, standardised set of epidemiological and clinical data with each sequence, which allows in-depth analysis.

A minimum dataset was agreed, including age and sex of the patient, type of sample from which the virus was detected, whether the patient was hospitalised, travel history (by country visited) and clinical symptoms in broad categories (skin, neurological, respiratory, enteric). For each patient, at least one sequence of the major capsid protein VP1 gene covering the agreed genomic region (nucleotides 2,604–2,909 of NC_001612, CVA16) has to be provided. In addition, samples that could not be typed as an enterovirus but were typed as poliovirus-like, were sent to the enterovirus section of the Center for Infectious Disease Control at RIVM, as part of the enterovirus surveillance programme in place, to document the absence of wild-type poliovirus circulation.

Enterovirus diagnostics and sequencing

Each laboratory used a laboratory-developed test, adapted from the protocol described by Nix et al. (2006) [10], for the detection of enteroviruses.

One laboratory used an additional protocol described by McWilliam Leitch et al. All laboratories participated in an external proficiency testing programme organised through Quality Control for Molecular Diagnostics (QCMD), Glasgow, United Kingdom, an International Organization for Standardization (ISO) 17043-accredited organisation. Amplification of the 5′ non-coding region of enterovirus was performed at the individual participating laboratory.

Genotype assignment using a standardised sequence-based typing tool

Upon entry of sequences into the database, an automated algorithm was run to assign the genotype. This tool has been validated against most currently known picornaviruses and has been shown to correlate highly with the serotype assignment [8].
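The method used by the typing tool is not detailed here, so the following is only a generic sketch of sequence-based genotype assignment: a query is compared with a panel of reference sequences and receives the genotype of the closest reference, provided the match is good enough. The function names, the toy reference panel and the `min_identity` cut-off are hypothetical, and the sequences are assumed to be already aligned to the same typing region.

```python
def percent_identity(a, b):
    """Percentage of identical positions in two pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must share the same non-zero length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def assign_genotype(query, references, min_identity=75.0):
    """Return (genotype, identity) for the closest reference sequence.

    The genotype is None when even the best match falls below
    min_identity, an illustrative cut-off and not a validated one.
    """
    best_type, best_score = None, -1.0
    for genotype, ref_seq in references.items():
        score = percent_identity(query, ref_seq)
        if score > best_score:
            best_type, best_score = genotype, score
    if best_score < min_identity:
        return None, best_score
    return best_type, best_score


# Toy reference panel; real references would span the agreed VP1 region.
toy_references = {
    "TYPE-X": "ACGTACGTAC",
    "TYPE-Y": "TTTTACGTAC",
}
print(assign_genotype("ACGTACGTAA", toy_references))
```

Returning the identity score alongside the genotype makes it possible to flag borderline assignments for manual review by the curator.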

Some laboratories also typed parechoviruses (n=5), rhinoviruses (n=3), hepatitis E virus (n=3), norovirus (n=2), hepatitis A virus (n=2), cytomegalovirus (n=2), herpes simplex virus (HSV) (n=2), adenovirus (n=2), human immunodeficiency virus (HIV) (n=3), as well as hepatitis B virus and hepatitis C virus for specific research or clinical study-related questions. A need for a more structured collaboration between the laboratories, possibly including the operation of a joint reference database, was indicated by the majority of respondents regarding influenza virus, parechovirus, rhinovirus and hepatitis B virus. For the less commonly used typing approaches, a need for collaboration was expressed for hepatitis viruses A, C and E. Given the consensus that a type of collaborative network would meet a need, a pilot TYPENED database was set up for enteroviruses.

Most of the sequences belonged to HEV-A (n=168; 25.8%) and B (n=466; 71.6%), whereas only a few belonged to HEV-C (n=6; 0.9%) and D (n=6; 0.9%). Following automatic typing of the sequences submitted to the TYPENED database, some of the viruses that were enterovirus positive in the molecular diagnostic assay turned out to be rhinovirus A (n=5; 0.8%), most probably owing to cross-reactivity of the primers used for detection.
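The percentages reported for each group can be checked with a few lines of arithmetic. The denominator used below, the sum of the five groups (651 sequences), is an inference: the total is not stated in this passage, but this value reproduces every reported percentage. The function name is hypothetical.

```python
def group_shares(counts):
    """Return each group's share of the total, as a percentage to 1 decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts as reported in the text; the total is inferred, not reported.
reported_counts = {
    "HEV-A": 168,
    "HEV-B": 466,
    "HEV-C": 6,
    "HEV-D": 6,
    "rhinovirus A": 5,
}

total_sequences = sum(reported_counts.values())
shares = group_shares(reported_counts)
for name, share in shares.items():
    print(f"{name}: n={reported_counts[name]} ({share}%)")
```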

In addition, three poliovirus sequences were identified within the HEV-C set: all three isolates were obtained from children from the former Netherlands Antilles (Curaçao and Sint Maarten), where oral polio vaccines were used. The laboratories that submitted the sequences received samples from laboratories all over the Netherlands. For example, of the 48 CV-A9 sequences submitted, 43 were found in samples collected from May to August 2010 with a clear peak (n=38) in June and July. In addition, five of the six EV-D68 sequences were found in samples collected from August to November 2010; 46 of the 65 E-7 sequences were found in samples collected from May to August 2011 and 51 of the 69 E-25 sequences were found in samples collected from August to December 2011.

We have described a data-sharing concept that combines the capacities of clinical and public health laboratories in the Netherlands in a database to which all laboratories have equal and full access. After initial discussions to align expectations and develop a code of conduct, all laboratories were able to share a first set of historical data within two months. We managed to get consensus on the typing protocol and a data sharing agreement between the central public health laboratory (RIVM), large university laboratories and some large general hospitals that are geographically dispersed, thus potentially enabling broad coverage of surveillance of viruses of common interest. Within the enterovirus pilot, all sequences generated in two years by six of the seven collaborating laboratories were shared.

One pitfall of a consensus typing method may be that some viruses will be missed if they are not detected in the particular molecular test. This is of concern, given that the previously common practice of viral culture, which could serve as a safety net, is diminishing very rapidly. Since RNA viruses diverge rapidly, there is a need to get updated full-length sequences, not only for epidemiological reasons but also to keep diagnostic assays based on molecular testing up to date.

At present, the availability of whole genome sequences is limited, but with next generation sequencing techniques rapidly coming within reach of academic and even clinical laboratories, this situation will change quickly. The same system is currently being set up for a number of other viruses for which collaboration was valued according to the questionnaire – with parechovirus, norovirus and hepatitis E virus on the priority list [13-15]. Sequence-based characterisation is becoming more common within the larger diagnostic centres: the availability of sequence-based information will assist both the clinicians and diagnostic laboratories as well as the public health laboratories. Furthermore, by using sequencing technologies, a more in-depth analysis of circulating strains can be carried out, as individual sequences can be analysed, instead of serotypes. Sequences have a much higher discriminatory power, as most sequences within one serotype will be different from each other, thus facilitating, for example, the tracing of transmission patterns. Sequence techniques are particularly valuable for viruses that are difficult to grow.

In an economic climate with shrinking budgets, it may prove difficult for facilities to perform sequencing for diagnostic and epidemiological purposes, although it is expected that large centres will continue to perform routine sequencing. The harmonisation of typing protocols and sharing of data with a more extensive group of laboratories, or even cross-border centres, will be a next step.

