Phew. After considerably more pain and tears than usual, miRBase 20 is finally available on the website and for download on the FTP site (see also the README file). The gap between releases has also been longer than usual, which means that the increase in data is greater than usual (probably explaining the increase in pain). In all, we have 3355 new hairpin sequences and 5393 new mature microRNAs from around 40 new publications, increasing the totals to 24521 hairpin sequences and 30424 mature sequences. As always, the full list of additions, deletions and name changes is available in the miRNA.diff file on the FTP site, along with all other miRBase data in various file formats. There are minor changes to the structure of the MySQL database underlying the website, and therefore to the database dumps. As we still don’t have sensible documentation for those dumps, you should ask if you care about this.
Ana has also spent a fair bit of time adding datasets to the deep sequencing section of the site: we have now mapped reads from 306 small RNA deep sequencing experiments to miRBase hairpins, increasing the coverage to 37 species. In all, approximately 25% of all mature microRNAs have at least 10 reads mapping to them across all datasets. As we’ve said before, these data can be used for expression analysis, and for judging the validity of microRNA annotations. We’ve been working on a system to use these aggregated data to assess the confidence in a given microRNA annotation, and allow users to filter the data by this confidence measure. We aim to have something to show on that in the next release or two. Feel free to point us in the direction of publicly available datasets that we don’t already capture, preferably in the form of a GEO or SRA accession.
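For anyone wanting to compute a "fraction of mature microRNAs with at least 10 reads" figure from their own read counts, here is a minimal Python sketch. The `(dataset, mature_id, reads)` tuple layout, the function name `fraction_supported` and the toy identifiers are all invented for illustration; this is not the miRBase pipeline, and it assumes reads have already been mapped and counted per mature sequence.

```python
from collections import defaultdict


def fraction_supported(read_counts, all_matures, min_reads=10):
    """Return (supported_ids, fraction) for mature sequences whose reads,
    summed across all datasets, reach min_reads.

    read_counts: iterable of (dataset_id, mature_id, reads) tuples
                 (hypothetical layout, assumed for this sketch).
    all_matures: every mature sequence ID, including those with no reads.
    """
    totals = defaultdict(int)
    for _dataset_id, mature_id, reads in read_counts:
        totals[mature_id] += reads  # aggregate across datasets

    supported = {m for m in all_matures if totals.get(m, 0) >= min_reads}
    if not all_matures:
        return supported, 0.0
    return supported, len(supported) / len(all_matures)


# Toy data: made-up IDs and counts, not real miRBase entries.
toy_counts = [
    ("exp1", "miR-a", 7),
    ("exp2", "miR-a", 5),   # miR-a totals 12 across two datasets
    ("exp1", "miR-b", 3),
    ("exp2", "miR-c", 10),
]
toy_matures = ["miR-a", "miR-b", "miR-c", "miR-d"]

supported, fraction = fraction_supported(toy_counts, toy_matures)
print(sorted(supported), fraction)
```

Note that the threshold applies to the total summed over every dataset, so a sequence with a handful of reads in each of several experiments can still pass.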
Comments, criticism, suggestions, abuse to the usual address.