2020, Vol. 11, Issue (3): 151-155.

• Original Article

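Machine learning on craniofacial-specific enhancer sequences from human oral epithelial cells predicts pathogenic cleft lip and palate variants at the IRF6 locus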

Hanshu Zhang1, Huan Liu2

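  1. State Key Laboratory Breeding Base of Basic Science of Stomatology and Key Laboratory of Oral Biomedicine of the Ministry of Education, School of Stomatology, Wuhan University
  2. State Key Laboratory Breeding Base of Basic Science of Stomatology and Key Laboratory of Oral Biomedicine of the Ministry of Education, School of Stomatology, Wuhan University; Department of Periodontology, Hospital of Stomatology, Wuhan University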
  • Received: 2020-07-19  Revised: 2020-08-26  Online: 2020-09-25  Published: 2020-09-30
  • Corresponding author: Huan Liu, E-mail: liu.huan@whu.edu.cn
  • Funding:
    National Natural Science Foundation of China

Machine learning based on active enhancers of human oral epithelium predicts a pathogenic variant associated with orofacial cleft near IRF6

  • Received: 2020-07-19  Revised: 2020-08-26  Online: 2020-09-25  Published: 2020-09-30
  • Contact: Huan Liu, E-mail: liu.huan@whu.edu.cn

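Summary: Objective: To use machine learning on craniofacial-specific enhancer sequences from human oral epithelial cells to improve the screening of candidate non-coding variants associated with cleft lip and palate. Methods: Chromatin immunoprecipitation sequencing (ChIP-Seq) for histone H3 lysine 27 acetylation (H3K27Ac) in human immortalized oral epithelial cells (HIOEC) was integrated with human embryonic craniofacial super-enhancer regions to obtain the craniofacial-specific enhancer regions of human oral epithelial cells. The gapped k-mer SVM algorithm was used to summarize the sequence features of these enhancers, and these features were then used to assess how cleft lip and palate-associated single nucleotide polymorphisms (SNPs) or mutations near interferon regulatory factor 6 (IRF6) affect enhancer activity. Results: The craniofacial-specific enhancers of human oral epithelial cells covered more than half of the cleft lip and palate-associated SNPs (P<0.01). Machine learning based on these sequence features predicted that the 350dupA variant, which is associated with Van der Woude syndrome, significantly reduces the activity of the enhancer in which it lies. Dual luciferase reporter assays showed that 350dupA significantly reduced enhancer activity in human oral epithelial cells (P<0.01). Conclusion: Machine learning based on craniofacial-specific enhancer sequences from human oral epithelial cells can accurately assess cleft lip and palate-associated non-coding variants near IRF6 and can improve functional genetic studies of cleft lip and palate.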

Keywords: cleft lip and palate, functional genomics, machine learning, IRF6
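
The Results report that the oral epithelium enhancers cover more than half of the cleft-associated SNPs. The sketch below illustrates only the counting step behind such a statement: the fraction of SNP positions that fall inside a set of enhancer intervals. The function names, the half-open coordinate convention, and the toy coordinates are assumptions made for illustration. They are not the authors' pipeline or data, and the statistical test behind the reported P value is not reproduced here.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping half-open (start, end) intervals on one chromosome."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def fraction_covered(snps, enhancers):
    """Return the fraction of SNPs that lie inside any enhancer interval.

    snps:      list of (chrom, pos) tuples
    enhancers: dict mapping chrom -> list of half-open (start, end) tuples
    """
    merged = {c: merge_intervals(iv) for c, iv in enhancers.items()}
    starts = {c: [s for s, _ in iv] for c, iv in merged.items()}
    hits = 0
    for chrom, pos in snps:
        if chrom not in merged:
            continue
        # Index of the last interval whose start is <= pos.
        i = bisect_right(starts[chrom], pos) - 1
        if i >= 0 and pos < merged[chrom][i][1]:
            hits += 1
    return hits / len(snps) if snps else 0.0


# Hypothetical toy coordinates, not real enhancer or SNP positions.
toy_enhancers = {"chr1": [(100, 200), (150, 300), (1000, 1100)]}
toy_snps = [("chr1", 120), ("chr1", 299), ("chr1", 300),
            ("chr1", 1050), ("chr2", 5)]
coverage = fraction_covered(toy_snps, toy_enhancers)
print(coverage)
```

Merging the intervals first keeps each lookup to a single binary search per SNP, which matters when the enhancer set is large.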

Abstract: Objective: This study aimed to test whether machine learning based on craniofacial-specific oral epithelium enhancers could be used to prioritize non-coding variants associated with orofacial cleft. Methods: Anti-H3K27Ac ChIP-seq was performed in human immortalized oral epithelial cells (HIOEC), and the results were integrated with published craniofacial super-enhancers. The sequences of these enhancers were used as the training set for gapped k-mer SVM (gkm SVM) machine learning, which summarizes the DNA sequence features of the enhancers. Delta SVM scores were used to evaluate the effects of non-coding variants near IRF6, and dual luciferase assays were used for biological validation. Results: Published orofacial cleft-associated SNPs were enriched in oral epithelium-specific enhancers. gkm SVM predicted that the 350dupA mutation significantly decreases enhancer activity, which was then validated using dual luciferase assays in HIOEC cells. Conclusions: Classifiers based on oral epithelium-specific enhancers are useful for nominating functional SNPs identified in genome-wide association studies of orofacial clefting.

Key words: cleft lip and palate, functional genomics, machine learning, IRF6
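
The delta SVM score mentioned in the abstract can be illustrated with a small sketch. As the score is commonly described, a trained sequence model assigns a weight to each k-mer, and a variant is scored by summing the weights of the k-mers that overlap it in the alternate allele and subtracting the same sum for the reference allele. A negative value suggests reduced predicted enhancer activity. The code below assumes ungapped k-mers, k = 3, and invented weights and sequences. The function names are hypothetical, and none of this is the authors' trained model or the actual IRF6 sequence.

```python
def kmer_score(seq, weights, k):
    """Sum the weights of every k-mer in seq; unknown k-mers count as 0."""
    return sum(weights.get(seq[i:i + k], 0.0)
               for i in range(len(seq) - k + 1))


def delta_score(left_flank, ref, alt, right_flank, weights, k):
    """Score a variant as score(alt window) - score(ref window).

    Only k - 1 flanking bases are kept on each side, so every k-mer in
    the window overlaps the variant. Assumes k >= 2.
    """
    lf = left_flank[-(k - 1):]
    rf = right_flank[:k - 1]
    alt_score = kmer_score(lf + alt + rf, weights, k)
    ref_score = kmer_score(lf + ref + rf, weights, k)
    return alt_score - ref_score


# Hypothetical 3-mer weights; a real model would supply weights for
# longer k-mers learned from the enhancer training set.
toy_weights = {"GAT": 1.0, "ATA": 0.5, "TAA": -2.0, "AAG": -0.5}

# Toy duplication of a single A (ref "A" -> alt "AA") in a made-up sequence.
delta = delta_score("CGAT", "A", "AA", "GC", toy_weights, k=3)
print(delta)  # -2.5
```

In this toy case the duplication creates the k-mers TAA and AAG, which carry negative weights, so the score drops. Because an insertion adds k-mers to the alternate window, the two windows differ in length, which this windowed formulation handles without special casing.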