You know how people refer to DNA as the genetic code? And they explain how every piece of DNA can be represented as a string of four characters, namely G, T, C and A? In my naivete, I thought genomics researchers had the greatest – and most natural – data standard in the world. Turns out life isn't that simple.
The Workgroup for Electronic Data Interchange (WEDI) in genomic medicine issued a report listing nine big problems with genetic data, and one of these is data standards. Others concern the sheer size of records, the rapid ageing of data, security, and ownership. In fact, another of the issues cited is data exchange, which is also a standards matter.
Here is the workgroup's key point about data standards: “Genetic tests are typically presented in a narrative text report that must be transmitted via scan, fax or PDF attachment prior to being uploaded into an EHR [electronic health record] or repository.” The report goes on to say: “Until genomic data is provided in standardized, structured and discrete formats that are both human- and machine-readable, value will be limited.”
You could take that last sentence, drop the word “genomic”, and you'd have a neat description of the problem for any information-intensive domain that doesn't use industry standards.
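To make "structured and discrete" a bit more concrete, here is a rough sketch of the difference between a narrative result and a structured one. Everything in it is made up for illustration: the gene name `GENE1`, the variant string, and the field names are my own assumptions, not taken from the WEDI report or from any real genomics standard.

```python
import json
from dataclasses import dataclass, asdict

# A narrative result, as it might appear in a scanned or PDF report.
# Hypothetical wording; software can store it but cannot reliably query it.
narrative = "Patient tested positive for a pathogenic variant in GENE1."


@dataclass
class GeneticTestResult:
    """Illustrative structure only; field names are assumptions, not a standard."""
    gene: str
    variant: str
    classification: str
    zygosity: str


# The same finding as discrete fields (all values are invented placeholders).
result = GeneticTestResult(
    gene="GENE1",
    variant="c.100A>G",
    classification="pathogenic",
    zygosity="heterozygous",
)

# Machine-readable: the record serialises to JSON for exchange between systems.
payload = json.dumps(asdict(result), sort_keys=True)


def summarize(r: GeneticTestResult) -> str:
    """Human-readable: a sentence is generated from the fields, not typed by hand."""
    return f"{r.classification.capitalize()} {r.zygosity} variant {r.variant} in {r.gene}"


def is_flagged(r: GeneticTestResult) -> bool:
    """Discrete: a rule can test one field without parsing free text."""
    return r.classification in {"pathogenic", "likely pathogenic"}


print(payload)
print(summarize(result))
print(is_flagged(result))
```

The point is not these particular fields, which professionals would have to agree among themselves, but that once the fields exist, both the readable sentence and the machine rule come from the same record.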
I don't claim to know much about genomics, but I know that if experts are writing free text which is supposed to be used by other professionals, there's a data standards gap. This is a symptom telling us that professionals need to get together – right across the healthcare chain – and agree the core concepts they want to share with each other. Genomic information is too important to waste.