Long non-coding RNAs (lncRNAs) have emerged as important regulators of many biological processes, although their regulatory roles remain poorly characterized in woody plants, especially in gymnosperms. A major challenge in working with lncRNAs is assigning functional annotations, since they have low coding potential and low cross-species conservation. We utilised an existing RNA-Sequencing resource and performed short RNA sequencing of somatic embryogenesis developmental stages in Norway spruce (Picea abies (L.) Karst.). We implemented a pipeline to identify lncRNAs located within the intergenic space (lincRNAs) and generated a co-expression network including protein-coding, lincRNA and miRNA genes. To assign putative functional annotation, we employed a guilt-by-association approach using the co-expression network and integrated these results with annotation assigned using semantic similarity and co-expression. Moreover, we evaluated the relationship between lincRNAs and miRNAs, and identified which lincRNAs are conserved in other species. We identified lincRNAs with clear evidence of differential expression during somatic embryogenesis and used network connectivity to identify those with the greatest regulatory potential. This work provides the most comprehensive view of lincRNAs in Norway spruce and is the first study to perform global identification of lincRNAs during somatic embryogenesis in conifers. The data have been integrated into the expression visualisation tools at the PlantGenIE.org web resource to give the community easy access. This will facilitate use of the data to address novel questions about the role of lincRNAs in the regulation of embryogenesis and support future comparative genomics studies.
© 2024 The Author(s). Physiologia Plantarum published by John Wiley & Sons Ltd on behalf of Scandinavian Plant Physiology Society.