It's a little unsettling to find there isn't a common data standard for genomics. Isn't DNA already a standard – the great ancestor of every data standard since? I guess “the book of life” doesn't advertise the language it's written in.
Professor Brenton Graveley says: “In genomics in general, most of the data formats are fairly standardised already (e.g., sam, bam, bigwig, etc.), but certainly if you were to have any sort of large summary table of multiple experiments and gene expression values, those are just flat files for which there’s no standard format. And even though there are standards for some file formats, there still can be some differences in the way that people generate them as well as how they use optional fields. There’s still variability; you can’t just download a file and run with it, you still need to take a particular file and manipulate it into a slightly different format.”
The treatment of optional fields is a common problem in business data too, not just in genomics. These attributes need careful commenting, and examples often help.
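To make the optional-field point concrete, here is a minimal Python sketch of reading one SAM alignment line. It relies on the SAM convention that the first 11 tab-separated columns are mandatory and anything after them is an optional `TAG:TYPE:VALUE` field. The function name `parse_sam_line` and the sample record are made up for illustration, and only the integer and float type codes are converted; this is not how any particular tool does it.

```python
# Hypothetical helper: split one SAM alignment line into its 11 mandatory
# columns and a dict of optional TAG:TYPE:VALUE fields.
def parse_sam_line(line):
    cols = line.rstrip("\n").split("\t")
    mandatory, optional = cols[:11], {}
    for field in cols[11:]:
        # maxsplit=2 keeps any colons inside the value intact
        tag, type_code, value = field.split(":", 2)
        if type_code == "i":      # integer
            value = int(value)
        elif type_code == "f":    # float
            value = float(value)
        optional[tag] = value     # other types are left as strings
    return mandatory, optional


# Illustrative record (invented data): carries NM and RG tags, but no AS tag.
line = "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:1\tRG:Z:sample_A"
mandatory, optional = parse_sam_line(line)

edit_distance = optional.get("NM")    # present in this record
alignment_score = optional.get("AS")  # absent here, so None
```

Using `dict.get` rather than indexing is the design choice that matters: code that assumes every file carries the same optional tags breaks on the first file produced by a different pipeline.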