Search | arXiv e-print repository

arXiv:2309.09816 [pdf]

Application of Novel PACS-based Informatics Platform to Identify Imaging Based Predictors of CDKN2A Allelic Status in Glioblastomas

Authors: Niklas Tillmanns, Jan Lost, Joanna Tabor, Sagar Vasandani, Shaurey Vetsa, Neelan Marianayagam, Kanat Yalcin, E. Zeynep Erson-Omay, Marc von Reppert, Leon Jekel, Sara Merkaj, Divya Ramakrishnan, Arman Avesta, Irene Dixe de Oliveira Santo, Lan Jin, Anita Huttner, Khaled Bousabarah, Ichiro Ikuta, MingDe Lin, Sanjay Aneja, Bernd Turowski, Mariam Aboian, Jennifer Moliterno

Abstract: Gliomas with CDKN2A mutations are known to have a worse prognosis, but the imaging features of these gliomas are unknown. Our goal is to identify CDKN2A-specific qualitative imaging biomarkers in glioblastomas using a new informatics workflow that enables rapid analysis of qualitative imaging features with Visually AcceSAble Rembrandt Images (VASARI) for large datasets in PACS. Sixty-nine patients undergoing GBM resection with CDKN2A status determined by whole-exome sequencing were included. GBMs on magnetic resonance images were automatically 3D segmented using deep learning algorithms incorporated within PACS. VASARI features were assessed using FHIR forms integrated within PACS. GBMs without CDKN2A alterations were significantly larger (64% vs. 30%, p=0.007) compared to tumors with homozygous deletion (HOMDEL) and heterozygous loss (HETLOSS). Lesions larger than 8 cm were four times more likely to have no CDKN2A alteration (OR: 4.3; 95% CI: 1.5-12.1; p<0.001). We developed a novel integrated PACS informatics platform for the assessment of GBM molecular subtypes and show that tumors with HOMDEL are more likely to have radiographic evidence of pial invasion and less likely to have deep white matter invasion or subependymal invasion. These imaging features may allow noninvasive identification of CDKN2A allele status.

Submitted 18 September, 2023; originally announced September 2023.

Comments: 23 pages, 5 figures

arXiv:2202.01104 [pdf, other]

Optimal fenestration of the Fontan circulation

Authors: Zan Ahmad, Lynn H. Jin, Daniel J. Penny, Craig G. Rusin, Charles S. Peskin, Charles Puelz

Abstract: In this paper, we develop a pulsatile compartmental model of the Fontan circulation and use it to explore the effects of a fenestration added to this physiology. A fenestration is a shunt between the systemic and pulmonary veins that is added either at the time of Fontan conversion or at a later time for the treatment of complications. This shunt increases cardiac output and decreases systemic venous pressure. However, these hemodynamic benefits are achieved at the expense of a decrease in the arterial oxygen saturation. The model developed in this paper incorporates fenestration size as a parameter and describes both blood flow and oxygen transport. It is calibrated to clinical data from Fontan patients, and we use it to study the impact of a fenestration on several hemodynamic variables. In certain scenarios corresponding to high-risk Fontan physiology, we demonstrate the existence of an optimal fenestration size that maximizes oxygen delivery to the systemic tissues.

Submitted 21 June, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

arXiv:2003.09800 [pdf]

Forecasting and evaluating intervention of Covid-19 in the World

Authors: Zixin Hu, Qiyang Ge, Shudi Li, Eric Boerwincle, Li Jin, Momiao Xiong

Abstract: As the Covid-19 pandemic enters a dangerous new phase, a key question is whether and when to take aggressive public health interventions to slow the spread of Covid-19. We aim to develop artificial intelligence (AI)-inspired methods for real-time forecasting and for evaluating intervention strategies to curb the spread of Covid-19 in the world. A modified auto-encoder for modeling the transmission dynamics of the epidemic is developed and applied to the surveillance data of cumulative and new Covid-19 cases and deaths from WHO, as of March 16, 2020. The average error of 5-step forecasting was 2.5%. The total peak number of cumulative cases and new cases, and the maximum number of cumulative cases in the world with later intervention (comprehensive public health intervention implemented 4 weeks later) could reach 75,249,909, 10,086,085, and 255,392,154, respectively. The case ending time was January 10, 2021. However, the total peak number of cumulative cases and new cases and the maximum number of cumulative cases in the world with intervention one week later were reduced to 951,799, 108,853 and 1,530,276, respectively. The duration of the Covid-19 spread would be reduced from 356 days to 232 days. The case ending time was September 8, 2020. We observed that delaying intervention for one month caused the maximum number of cumulative cases to increase 166.89-fold, and the number of deaths to increase from 53,560 to 8,938,725. We will face disastrous consequences if immediate action to intervene is not taken.

Submitted 21 March, 2020; originally announced March 2020.

Comments: 28 pages, 5 figures and 5 tables. arXiv admin note: text overlap with arXiv:2002.07112

arXiv:2003.05776 [pdf]

A deep belief network-based method to identify proteomic risk markers for Alzheimer disease

Authors: Ning An, Liuqi Jin, Huitong Ding, Jiaoyun Yang, Jing Yuan

Abstract: While a large body of research has formally identified apolipoprotein E (APOE) as a major genetic risk marker for Alzheimer disease, accumulating evidence supports the notion that other risk markers may exist. The traditional Alzheimer-specific signature analysis methods, however, have not been able to make full use of rich protein expression data, especially the interaction between attributes. This paper develops a novel feature selection method to identify pathogenic factors of Alzheimer disease using proteomic and clinical data. This approach takes the weights of network nodes as the importance order of signaling protein expression values. After generating and evaluating the candidate subset, the method helps to select an optimal subset of proteins that achieved an accuracy greater than 90%, which is superior to traditional machine learning methods for clinical Alzheimer disease diagnosis. Besides identifying a proteomic risk marker and further reinforcing the link between metabolic risk factors and Alzheimer disease, this paper also suggests that adiponectin-linked pathways are a possible therapeutic drug target.

Submitted 11 March, 2020; originally announced March 2020.

arXiv:2002.07112 [pdf]

Artificial Intelligence Forecasting of Covid-19 in China

Authors: Zixin Hu, Qiyang Ge, Shudi Li, Li Jin, Momiao Xiong

Abstract: BACKGROUND As an alternative to epidemiological models for transmission dynamics of Covid-19 in China, we propose artificial intelligence (AI)-inspired methods for real-time forecasting of Covid-19 to estimate the size, length and ending time of Covid-19 across China. METHODS We developed a modified stacked auto-encoder for modeling the transmission dynamics of the epidemic. We applied this model to real-time forecasting of the confirmed cases of Covid-19 across China. The data were collected from January 11 to February 27, 2020 by WHO. We used the latent variables in the auto-encoder and clustering algorithms to group the provinces/cities for investigating the transmission structure. RESULTS We forecasted curves of cumulative confirmed cases of Covid-19 across China from Jan 20, 2020 to April 20, 2020. Using multiple-step forecasting, the estimated average errors of 6-step, 7-step, 8-step, 9-step and 10-step forecasting were 1.64%, 2.27%, 2.14%, 2.08%, 0.73%, respectively. We predicted that the time points of the provinces/cities entering the plateau of the forecasted transmission dynamic curves varied, ranging from Jan 21 to April 19, 2020. The 34 provinces/cities were grouped into 9 clusters. CONCLUSIONS The accuracy of the AI-based methods for forecasting the trajectory of Covid-19 was high. We predicted that the epidemic of Covid-19 will be over by the middle of April. If the data are reliable and there are no second transmissions, we can accurately forecast the transmission dynamics of Covid-19 across the provinces/cities in China. The AI-inspired methods are a powerful tool for helping public health planning and policymaking.

Submitted 1 March, 2020; v1 submitted 17 February, 2020; originally announced February 2020.

Comments: 14 pages, 5 figures, 1 table

arXiv:1911.03839 [pdf, ps, other]

In Vitro Fertilization (IVF) Cumulative Pregnancy Rate Prediction from Basic Patient Characteristics

Authors: Bo Zhang, Yuqi Cui, Meng Wang, Jingjing Li, Lei Jin, Dongrui Wu

Abstract: Tens of millions of women suffer from infertility worldwide each year. In vitro fertilization (IVF) is the best choice for many such patients. However, IVF is expensive, time-consuming, and both physically and emotionally demanding. The first question that a patient usually asks before IVF is how likely she is to conceive, given her basic medical examination information. This paper proposes three approaches to predict the cumulative pregnancy rate after multiple oocyte pickup cycles. Experiments on 11,190 patients showed that first clustering the patients into different groups and then building a support vector machine model for each group can achieve the best overall performance. Our model could be a quick and economical approach for reliably estimating the cumulative pregnancy rate for a patient, given only her basic medical examination information, well before starting the actual IVF procedure. The predictions can help the patient make optimal decisions on whether to use her own oocyte or donor oocyte, how many oocyte pickup cycles she may need, whether to use embryo freezing, etc. They will also reduce the patient's cost and time to pregnancy, and improve her quality of life.

Submitted 9 November, 2019; originally announced November 2019.

arXiv:1905.07680 [pdf]

doi 10.1371/journal.pcbi.1006222

Predicting 3D structure and stability of RNA pseudoknots in monovalent and divalent ion solutions

Authors: Ya-Zhou Shi, Lei Jin, Chen-Jie Feng, Ya-Lan Tan, Zhi-Jie Tan

Abstract: RNA pseudoknots are a kind of minimal RNA tertiary structural motif, and their three-dimensional (3D) structures and stability play essential roles in a variety of biological functions. Therefore, predicting the 3D structures and stability of RNA pseudoknots is essential for understanding their functions. In this work, we employed our previously developed coarse-grained model with implicit salt to make extensive predictions and comprehensive analyses of the 3D structures and stability of RNA pseudoknots in monovalent/divalent ion solutions. The comparisons with available experimental data show that our model can successfully predict the 3D structures of RNA pseudoknots from their sequences, and can also make reliable predictions for the stability of RNA pseudoknots with different lengths and sequences over a wide range of monovalent/divalent ion concentrations. Furthermore, we made comprehensive analyses of the unfolding pathway for various RNA pseudoknots in ion solutions. Our analyses for extensive pseudoknots and the wide range of monovalent/divalent ion concentrations verify that the unfolding pathway of RNA pseudoknots is mainly dependent on the relative stability of unfolded intermediate states, and show that the unfolding pathway of RNA pseudoknots can be significantly modulated by their sequences and solution ion conditions.

Submitted 18 May, 2019; originally announced May 2019.

Comments: 23 pages, 8 figures

Journal ref: PLOS Computational Biology, 14(6): e1006222, 2018

arXiv:1901.05537 [pdf]

Shared Causal Paths underlying Alzheimer's dementia and Type 2 Diabetes

Authors: Zixin Hu, Rong Jiao, Jiucun Wang, Panpan Wang, Yun Zhu, Jinying Zhao, Phil De Jager, David A Bennett, Li Jin, Momiao Xiong

Abstract: Background: Although Alzheimer's disease (AD) is a central nervous system disease and type 2 diabetes mellitus (T2DM) is a metabolic disorder, an increasing number of genetic epidemiological studies show a clear link between AD and T2DM. The current approach to uncovering the shared pathways between AD and T2DM involves association analysis; however, such analyses lack power to discover the mechanisms of the diseases. Methods: We develop novel statistical methods to shift the current paradigm of genetic analysis from association analysis to deep causal inference for uncovering the shared mechanisms between AD and T2DM, and develop pipelines to infer multilevel omics causal networks, which shift the current paradigm from genetic analysis alone to integrated causal genomic, epigenomic, transcriptional and phenotypic data analysis. To discover common causal paths from genetic variants to AD and T2DM, we also develop algorithms that can automatically search the causal paths from genetic variants to diseases. Results: The proposed methods and algorithms are applied to the ROSMAP dataset with 432 individuals who simultaneously had genotype, RNA-seq, DNA methylation and some phenotypes. We construct multi-omics causal networks and identify 13 shared causal genes, 16 shared causal pathways between AD and T2DM, and 754 gene expression and 101 gene methylation nodes that were connected to both AD and T2DM in multi-omics causal networks. Conclusions: The application of the proposed pipelines for identifying causal paths to real data analysis of AD and T2DM provided strong evidence to support the link between AD and T2DM and unraveled a causal mechanism to explain this link.

Submitted 16 January, 2019; originally announced January 2019.

Comments: 53 pages

arXiv:1805.04164 [pdf]

Bivariate Causal Discovery and its Applications to Gene Expression and Imaging Data Analysis

Authors: Rong Jiao, Nan Lin, Zixin Hu, David A Bennett, Li Jin, Momiao Xiong

Abstract: The mainstream of research in genetics, epigenetics and imaging data analysis focuses on statistical association or exploring statistical dependence between variables. Despite significant progress in genetic research, understanding the etiology and mechanism of complex phenotypes remains elusive. Using association analysis as a major analytical platform for complex data analysis is a key issue that hampers the theoretical development of genomic science and its application in practice. Causal inference is an essential component for the discovery of mechanistic relationships among complex phenotypes. Many researchers suggest making the transition from association to causation. Despite its fundamental role in science, engineering and biomedicine, the traditional methods for causal inference require at least three variables. However, quantitative genetic analysis such as QTL, eQTL, mQTL, and genomic-imaging data analysis requires exploring the causal relationships between two variables. This paper will focus on bivariate causal discovery. We will introduce independence of cause and mechanism (ICM) as a basic principle for causal inference, and algorithmic information theory and the additive noise model (ANM) as major tools for bivariate causal discovery. Large-scale simulations will be performed to evaluate the feasibility of the ANM for bivariate causal discovery. To further evaluate its performance for causal inference, the ANM will be applied to the construction of gene regulatory networks. Also, the ANM will be applied to trait-imaging data analysis to illustrate three scenarios: presence of both causation and association, presence of association with absence of causation, and presence of causation with lack of association between two variables.

Submitted 10 May, 2018; originally announced May 2018.

arXiv:1802.05820 [pdf]

Phonemic evidence reveals interwoven evolution of Chinese dialects

Authors: Meng-Han Zhang, Wu-Yun Pan, Shi Yan, Li Jin

Abstract: Han Chinese experienced substantial population migrations and admixture in history, yet little is known about the evolutionary process of Chinese dialects. Here, we used phylogenetic approaches and admixture inference to explicitly decompose the underlying structure of the diversity of Chinese dialects, based on the total phoneme inventories of 140 dialect samples from seven traditional dialect groups: Mandarin, Wu, Xiang, Gan, Hakka, Min and Yue. We found a north-south gradient of phonemic differences in Chinese dialects induced by historical population migrations. We also quantified extensive horizontal language transfers among these dialects, corresponding to the complicated socio-genetic history in China. We finally identified that the middle-latitude dialects of Xiang, Gan and Hakka were formed by admixture with the other four dialects. Accordingly, the middle-latitude areas in China were a linguistic melting pot of northern and southern Han populations. Our study provides a detailed phylogenetic and historical context against the family-tree model in China.

Submitted 15 February, 2018; originally announced February 2018.

arXiv:1512.00947 [pdf]

A New Statistical Framework for Genetic Pleiotropic Analysis of High Dimensional Phenotype Data

Authors: Panpan Wang, Mohammad Rahman, Li Jin, Momiao Xiong

Abstract: The widely used genetic pleiotropic analyses of multiple phenotypes are often designed for examining the relationship between common variants and a few phenotypes. They are not suited for both high dimensional phenotypes and high dimensional genotype (next-generation sequencing) data. To overcome these limitations, we develop sparse structural equation models (SEMs) as a general framework for a new paradigm of genetic analysis of multiple phenotypes. To incorporate both common and rare variants into the analysis, we extend the traditional multivariate SEMs to sparse functional SEMs. To deal with high dimensional phenotype and genotype data, we employ functional data analysis and the alternating direction method of multipliers (ADMM) to reduce data dimension and improve computational efficiency. Using large scale simulations we showed that the proposed methods have higher power to detect true causal genetic pleiotropic structure than other existing methods. Simulations also demonstrate that the gene-based pleiotropic analysis has higher power than the single variant-based pleiotropic analysis. The proposed method is applied to exome sequence data from the NHLBI Exome Sequencing Project (ESP) with 11 phenotypes, which identifies a network with 137 genes connected to 11 phenotypes and 341 edges. Among them, 114 genes showed pleiotropic genetic effects and 45 genes were reported to be associated with phenotypes in the analysis or other cardiovascular disease (CVD) related phenotypes in the literature.

Submitted 2 December, 2015; originally announced December 2015.

arXiv:1504.06463 [pdf]

The dichotomy structure of Y chromosome Haplogroup N

Authors: Kang Hu, Shi Yan, Kai Liu, Chao Ning, Lan-Hai Wei, Shi-Lin Li, Bing Song, Ge Yu, Feng Chen, Li-Jun Liu, Zhi-Peng Zhao, Chuan-Chao Wang, Ya-Jun Yang, Zhen-Dong Qin, Jing-Ze Tan, Fu-Zhong Xue, Hui Li, Long-Li Kang, Li Jin

Abstract: Haplogroup N-M231 of the human Y chromosome is a common clade from Eastern Asia to Northern Europe, being one of the most frequent haplogroups in Altaic and Uralic-speaking populations. Using newly discovered bi-allelic markers from high-throughput DNA sequencing, we largely improved the phylogeny of Haplogroup N, in which 16 subclades could be identified by 33 SNPs. More than 400 males belonging to Haplogroup N in 34 populations in China were successfully genotyped, and populations in Northern Asia and Eastern Europe were also compared. We found that all the N samples were typed as inside either clade N1-F1206 (including former N1a-M128, N1b-P43 and N1c-M46 clades), most of which were found in Altaic, Uralic, Russian and Chinese-speaking populations, or N2-F2930, common in Tibeto-Burman and Chinese-speaking populations. Our detailed results suggest that Haplogroup N developed in the region of China since the final stage of the late Paleolithic Era.

Submitted 24 April, 2015; originally announced April 2015.

Comments: main text 14 pages, 3 figures, 1 table, 3 SI tables

arXiv:1503.01880 [pdf]

Genetic structure of Sino-Tibetan populations revealed by forensic STR loci

Authors: Hong-Bing Yao, Chuan-Chao Wang, Jiang Wang, Xiaolan Tao, Shao-Qing Wen, Qiajun Du, Qiongying Deng, Bingying Xu, Ying Huang, Hong-Dan Wang, Shujin Li, Bin Cong, Liying Ma, Li Jin, Johannes Krause, Hui Li

Abstract: The origin and diversification of Sino-Tibetan populations have been a long-standing hot debate. However, the limited genetic information of Tibetan populations keeps this topic far from clear. In the present study, we genotyped 15 forensic autosomal STRs from 803 unrelated Tibetan individuals from Gansu Province (635 from Gannan and 168 from Tianzhu). We combined these data with published datasets to infer detailed population affinities and admixture of Sino-Tibetan populations. Our results revealed that the genetic structure of Sino-Tibetan populations was strongly correlated with linguistic affiliations. Although the among-population variances are relatively small, the genetic components for Tibetan, Lolo-Burmese, and Han Chinese were quite distinctive, especially for the Deng, Nu, and Derung of Lolo-Burmese. Southern indigenous populations, such as Tai-Kadai and Hmong-Mien populations, might have made a substantial genetic contribution to Han Chinese and Altaic populations, but not to Tibetans. Likewise, Han Chinese but not Tibetans shared very similar genetic makeups with Altaic populations, which did not support the North Asian origin of Tibetan populations. The datasets generated here are also valuable for forensic identification.

Submitted 6 March, 2015; originally announced March 2015.

Comments: 11 pages, 2 figures

arXiv:1406.1975 [pdf]

doi 10.1002/hbm.22560

A brain-wide association study of DISC1 genetic variants reveals a relationship with the structure and functional connectivity of the precuneus in schizophrenia

Authors: Xiaohong Gong, Wenlian Lu, Keith M. Kendrick, Weidan Pu, Chu Wang, Li Jin, Guangmin Lu, Zhening Liu, Haihong Liu, Jianfeng Feng

Abstract: The Disrupted in Schizophrenia Gene 1 (DISC1) plays a role in both neural signalling and development and is associated with schizophrenia, although its links to altered brain structure and function in this disorder are not fully established. Here we have used structural and functional MRI to investigate links with six DISC1 single nucleotide polymorphisms (SNPs). We employed a brain-wide association analysis (BWAS) together with a Jackknife internal validation approach in 46 schizophrenia patients and 24 matched healthy control subjects. Results from structural MRI showed significant associations between all six DISC1 variants and gray matter volume in the precuneus, post-central gyrus and middle cingulate gyrus. Associations with specific SNPs were found for rs2738880 in the left precuneus and right post-central gyrus, and rs1535530 in the right precuneus and middle cingulate gyrus. Using regions showing structural associations as seeds, a resting-state functional connectivity analysis revealed significant associations between all six SNPs and connectivity between the right precuneus and inferior frontal gyrus. The connection between the right precuneus and inferior frontal gyrus was also specifically associated with rs821617. Importantly, schizophrenia patients showed positive correlations between the six DISC1 SNPs associated gray matter volume in the left precuneus and right post-central gyrus and negative symptom severity. No correlations with illness duration were found. Our results provide the first evidence suggesting a key role for structural and functional connectivity associations between DISC1 polymorphisms and the precuneus in schizophrenia.

Submitted 8 June, 2014; originally announced June 2014.

Comments: 43 pages, 8 figures, 3 tables

arXiv:1404.7766 [pdf]

Genome-wide Scan of Archaic Hominin Introgressions in Eurasians Reveals Complex Admixture History

Authors: Ya Hu, Yi Wang, Qiliang Ding, Yungang He, Minxian Wang, Jiucun Wang, Shuhua Xu, Li Jin

Abstract: Introgressions from Neanderthals and Denisovans were detected in modern humans. Introgressions from other archaic hominins were also implicated; however, identifying them poses a great technical challenge. Here, we introduced an approach for identifying introgressions from all possible archaic hominins in Eurasian genomes, without referring to archaic hominin sequences. We focused on mutations that emerged in archaic hominins after their divergence from modern humans (denoted as archaic-specific mutations), and identified introgressive segments which showed significant enrichment of archaic-specific mutations over the rest of the genome. Furthermore, boundaries of introgressions were identified using a dynamic programming approach to partition the whole genome into segments which contained different levels of archaic-specific mutations. We found that detected introgressions shared more archaic-specific mutations with Altai Neanderthal than they shared with Denisovan, and 60.3% of archaic hominin introgressions were from Neanderthals. Furthermore, we detected more introgressions from two unknown archaic hominins who diverged from modern humans approximately 859 and 3,464 thousand years ago. The latter unknown archaic hominin contributed to the genomes of the common ancestors of modern humans and Neanderthals. In total, archaic hominin introgressions comprised 2.4% of Eurasian genomes. These results suggest a complex admixture history among hominins. The proposed approach could also facilitate admixture research across species.

Submitted 30 April, 2014; originally announced April 2014.

Comments: 42 Pages, 1 Table, 4 Figures, 1 Supplementary Table, and 10 Supplementary Figures

arXiv:1311.6857 [pdf]

Agriculture driving male expansion in Neolithic Time

Authors: Chuan-Chao Wang, Yunzhi Huang, Shao-Qing Wen, Chun Chen, Li Jin, Hui Li

Abstract: The emergence of agriculture is suggested to have driven extensive human population growths. However, genetic evidence from maternal mitochondrial genomes suggests major population expansions began before the emergence of agriculture. Therefore, role of agriculture that played in initial population expansions still remains controversial. Here, we analyzed a set of globally distributed whole Y chromosome and mitochondrial genomes of 526 male samples from 1000 Genome Project. We found that most major paternal lineage expansions coalesced in Neolithic Time. The estimated effective population sizes through time revealed strong evidence for 10- to 100-fold increase in population growth of males with the advent of agriculture. This sex-biased Neolithic expansion might result from the reduction in hunting-related mortality of males.

Submitted 26 November, 2013; originally announced November 2013.

Comments: 9 pages, 2 figures

arXiv:1310.7883 [pdf]

Global patterns of sex-biased migrations in humans

Authors: Chuan-Chao Wang, Li Jin, Hui Li

Abstract: A series of studies have revealed the among-population components of genetic variation are higher for the paternal Y chromosome than for the maternal mitochondrial DNA (mtDNA), which indicates sex-biased migrations in human populations. However, this phenomenon might be also an ascertainment bias due to nonrandom sampling of SNPs. To eliminate the possible bias, we used the whole Y chromosome and mtDNA sequence data of 491 individuals from the 1000 Genomes Project Phase I to address the sex-biased migration dispute. We found that genetic differentiation between populations was higher for Y chromosome than for the mtDNA at global scales. The migration rate of female might be three times higher than that of male, assuming the effective population size is the same for male and female.

Submitted 29 October, 2013; originally announced October 2013.

Comments: 5 pages, 2 tables. arXiv admin note: text overlap with arXiv:1310.5935

arXiv:1310.5935 [pdf]

Natural selection on human Y chromosomes

Authors: Chuan-Chao Wang, Li Jin, Hui Li

Abstract: The paternally inherited Y chromosome has been widely used in population genetic studies to understand relationships among human populations. Our interpretation of Y chromosomal evidence about population history and genetics has rested on the assumption that all the Y chromosomal markers in the male-specific region (MSY) are selectively neutral. However, the very low diversity of Y chromosome has drawn a long debate about whether natural selection has affected this chromosome or not. In recent several years, the progress in Y chromosome sequencing has helped to address this dispute. Purifying selection has been detected in the X-degenerate genes of human Y chromosomes and positive selection might also have an influence in the evolution of testis-related genes in the ampliconic regions. Those new findings remind us to take the effect of natural selection into account when we use Y chromosome in population genetic studies.

Submitted 22 October, 2013; originally announced October 2013.

Comments: 12 pages

Journal ref: Journal of Genetics and Genomics,2014,41(2):47-52

arXiv:1310.5466 [pdf]

Present Y chromosomes support the Persian ancestry of Sayyid Ajjal Shams al-Din Omar and Eminent Navigator Zheng He

Authors: Chuan-Chao Wang, Ling-Xiang Wang, Manfei Zhang, Dali Yao, Li Jin, Hui Li

Abstract: Sayyid Ajjal is the ancestor of many Muslims in areas all across China. And one of his descendants is the famous Navigator of Ming Dynasty, Zheng He, who led the largest armada in the world of 15th century. The origin of Sayyid Ajjal's family remains unclear although many studies have been done on this topic of Muslim history. In this paper, we studied the Y chromosomes of his present descendants, and found they all have haplogroup L1a-M76, proving a southern Persian origin.

Submitted 21 October, 2013; originally announced October 2013.

Comments: 5 pages, 1 figure

arXiv:1310.5413 [pdf]

Convergence of Y chromosome STR haplotypes from different SNP haplogroups compromises accuracy of haplogroup prediction

Authors: Chuan-Chao Wang, Ling-Xiang Wang, Rukesh Shrestha, Shaoqing Wen, Manfei Zhang, Xinzhu Tong, Li Jin, Hui Li

Abstract: Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) are two kinds of commonly used markers in Y chromosome studies of forensic and population genetics. There has been increasing interest in the cost saving strategy by using the STR haplotypes to predict SNP haplogroups. However, the convergence of Y chromosome STR haplotypes from different haplogroups might compromise the accuracy of haplogroup prediction. Here, we compared the worldwide Y chromosome lineages at both haplogroup level and haplotype level to search for the possible haplotype similarities among haplogroups. The similar haplotypes between haplogroups B and I2, C1 and E1b1b1, C2 and E1b1a1, H1 and J, L and O3a2c1, O1a and N, O3a1c and O3a2b, and M1 and O3a2 have been found, and those similarities reduce the accuracy of prediction.

Submitted 20 October, 2013; originally announced October 2013.

Comments: 13 pages, 2 figures

arXiv:1310.3897 [pdf]

doi 10.1371/journal.pone.0105691

Y Chromosomes of 40% Chinese Are Descendants of Three Neolithic Super-grandfathers

Authors: Shi Yan, Chuan-Chao Wang, Hong-Xiang Zheng, Wei Wang, Zhen-Dong Qin, Lan-Hai Wei, Yi Wang, Xue-Dong Pan, Wen-Qing Fu, Yun-Gang He, Li-Jun Xiong, Wen-Fei Jin, Shi-Lin Li, Yu An, Hui Li, Li Jin

Abstract: Demographic change of human populations is one of the central questions for delving into the past of human beings. To identify major population expansions related to male lineages, we sequenced 78 East Asian Y chromosomes at 3.9 Mbp of the non-recombining region (NRY), discovered >4,000 new SNPs, and identified many new clades. The relative divergence dates can be estimated much more precisely using molecular clock. We found that all the Paleolithic divergences were binary; however, three strong star-like Neolithic expansions at ~6 kya (thousand years ago) (assuming a constant substitution rate of 1e-9/bp/year) indicates that ~40% of modern Chinese are patrilineal descendants of only three super-grandfathers at that time. This observation suggests that the main patrilineal expansion in China occurred in the Neolithic Era and might be related to the development of agriculture.

Submitted 14 October, 2013; originally announced October 2013.

Comments: 29 pages of article text including 1 article figure, 9 pages of SI text, and 2 SI figures. 5 SI tables are in a separate ancillary file

Journal ref: Plos ONE 9(8): e105691 (2014)

arXiv:1212.4116 [pdf]

The GenoChip: A New Tool for Genetic Anthropology

Authors: Eran Elhaik, Elliott Greenspan, Sean Staats, Thomas Krahn, Chris Tyler-Smith, Yali Xue, Sergio Tofanelli, Paolo Francalacci, Francesco Cucca, Luca Pagani, Li Jin, Hui Li, Theodore G. Schurr, Bennett Greenspan, R. Spencer Wells, the Genographic Consortium

Abstract: The Genographic Project is an international effort using genetic data to chart human migratory history. The project is non-profit and non-medical, and through its Legacy Fund supports locally led efforts to preserve indigenous and traditional cultures. In its second phase, the project is focusing on markers from across the entire genome to obtain a more complete understanding of human genetic variation. Although many commercial arrays exist for genome-wide SNP genotyping, they were designed for medical genetic studies and contain medically related markers that are not appropriate for global population genetic studies. GenoChip, the Genographic Project's new genotyping array, was designed to resolve these issues and enable higher-resolution research into outstanding questions in genetic anthropology. We developed novel methods to identify AIMs and genomic regions that may be enriched with alleles shared with ancestral hominins. Overall, we collected and ascertained AIMs from over 450 populations. Containing an unprecedented number of Y-chromosomal and mtDNA SNPs and over 130,000 SNPs from the autosomes and X-chromosome, the chip was carefully vetted to avoid inclusion of medically relevant markers. The GenoChip results were successfully validated. To demonstrate its capabilities, we compared the FST distributions of GenoChip SNPs to those of two commercial arrays for three continental populations. While all arrays yielded similarly shaped (inverse J) FST distributions, the GenoChip autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. The GenoChip is a dedicated genotyping platform for genetic anthropology and promises to be the most powerful tool available for assessing population structure and migration history.

Submitted 17 December, 2012; originally announced December 2012.

Comments: 11 pages, 5 Figures, 2 supplementary notes

Showing 1–22 of 22 results for author: Jin, L