After a little more pain than usual, miRBase 18 is finally released. The database contains 18226 entries representing hairpin precursor miRNAs, expressing 21643 mature miRNA products, in 168 species. That represents 1488 new hairpin sequences and 1929 novel mature products. The full README file is available on the FTP site.
As previously discussed, we have continued to rename mature sequences, phasing out the miR/miR* nomenclature in favour of the -5p/-3p nomenclature. That affects approximately 1400 mature sequences this time, from human, mouse and C. elegans. (We had planned to do rat as well, but decided to hold off until we had incorporated more rat deep sequencing data.)
There are also significant changes to the zebrafish miRNA complement, rationalising the entries with respect to the (now not so new) Zv9 genome assembly. That has led to the deletion of 26 zebrafish entries, and the creation of 12 entries that represent duplicate loci. The full list of changes is itemized in the miRNA.diff file on the FTP site.
The website also shares new deep sequencing data, now approaching 250 datasets from NCBI GEO. In addition to raw read counts, we also show normalized read counts, currently calculated as reads per thousand reads that map to miRNAs (designated RPT on the website). We have also implemented a new feature to allow the comparison of normalized read counts from multiple experiments. For example, from the list of all D. melanogaster datasets (accessible from the “by tissue expression” box on the search page), you can tick up to 5 different experiments to compare read counts. This is getting dangerously close to allowing some really complex and powerful analyses through the website! You can also download the read counts from the results page, for offline processing. This is all Ana Kozomara’s work. As with all new features, it is wise to consider this to be in beta. We’ll be very happy to get your comments, bug reports, praise, and abuse, as usual, here or by email.
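If you download read counts for offline processing, the RPT normalization described above can be reproduced with a few lines of Python. This is only a rough sketch of the calculation as stated (raw count divided by the total miRNA-mapped reads in the experiment, times one thousand); the function name, the dictionary layout and the example counts are made up for illustration and do not reflect the website's actual code or download format.

```python
def normalize_counts(raw_counts, scale=1000):
    """Scale raw read counts to reads per `scale` miRNA-mapped reads.

    `raw_counts` maps a miRNA name to its raw read count in one experiment.
    Only reads that map to miRNAs are assumed to be present, so the
    denominator is simply the sum of all counts.
    """
    total = sum(raw_counts.values())
    if total == 0:
        # No miRNA-mapped reads at all: avoid dividing by zero.
        return {name: 0.0 for name in raw_counts}
    return {name: count * scale / total for name, count in raw_counts.items()}


# Hypothetical raw counts for one experiment (not real miRBase data).
experiment = {"dme-mir-1": 1500, "dme-mir-8": 300, "dme-bantam": 200}

rpt = normalize_counts(experiment)
for name, value in sorted(rpt.items()):
    print(f"{name}\t{value:.1f}")
```

Because every experiment is scaled to the same total, values computed this way can be compared across experiments with very different sequencing depths, which is the point of the comparison feature.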