Score the Candidate Sequences for Specificity
We must balance predicted knockdown efficiency against the desire to minimize interaction with off-target genes, without a clear understanding of how to predict off-target "hits". We calculate a "specificity score" to promote candidates that have no obvious off-target transcripts. Each candidate is compared by BLASTN to two distinct abstractions of the transcriptome: the NCBI UniGene "unique" database (vaguely defined by NCBI as the "longest, best" sequence from each UniGene cluster) and the transcripts from RefSeq. We deem a "miss" any sequence pair with at least three differences, at least two of which fall in the core positions 3-19, i.e., not at the ends of the 21mer target region. We then determine whether each candidate hits exactly one UniGene cluster, one LocusLink transcript, and one LocusLink gene and, for genes with multiple transcripts, whether it hits all the transcripts of the gene. Using just the "hits-One-Unigene" and "hits-One-NM" values, we assign a "specificity score" to each candidate: candidates that uniquely hit one UniGene cluster AND one LocusLink transcript are rewarded; those that hit one UniGene cluster OR one LocusLink transcript are rewarded, but less so; and those with neither UniGene nor LocusLink specificity are penalized. After determining and storing this "specificityScore", we re-sort the candidates.
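The miss rule and the three-tier specificity score can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the two 21mers are compared position by position without gaps (a real BLASTN alignment may be shorter or gapped), and the numeric reward and penalty values are placeholders, since the actual weights are not given in the text.

```python
# Core positions of the 21mer target region, 1-based and inclusive (from the text).
CORE_START, CORE_END = 3, 19

# Placeholder weights: the text gives only the ordering, not the values.
BOTH_REWARD = 2.0       # hits one UniGene cluster AND one transcript
ONE_REWARD = 1.0        # hits one UniGene cluster OR one transcript
NEITHER_PENALTY = -2.0  # neither kind of specificity


def is_miss(candidate: str, target: str) -> bool:
    """Return True if a gapless 21mer pair counts as a 'miss'.

    A miss needs at least three differences overall, of which at least
    two fall in core positions 3-19.
    """
    if len(candidate) != 21 or len(target) != 21:
        raise ValueError("both sequences must be 21 nt long")
    pairs = zip(candidate.upper(), target.upper())
    # 1-based positions at which the two sequences differ
    diffs = [i for i, (a, b) in enumerate(pairs, start=1) if a != b]
    core_diffs = [p for p in diffs if CORE_START <= p <= CORE_END]
    return len(diffs) >= 3 and len(core_diffs) >= 2


def specificity_score(hits_one_unigene: bool, hits_one_nm: bool) -> float:
    """Map the two uniqueness flags to a reward or a penalty."""
    if hits_one_unigene and hits_one_nm:
        return BOTH_REWARD
    if hits_one_unigene or hits_one_nm:
        return ONE_REWARD
    return NEITHER_PENALTY
```

Under this reading, a pair that differs only at positions 1, 2, and 10 still counts as a hit, because only one of its three differences lies in the core.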
Spacing the candidate 21mers along the transcript
Since we synthesize five oligo pairs for each transcript, and since we hypothesize that the secondary structure of the target transcript plays a role in the effectiveness of an shRNA, we want the candidates spread out along the transcript, with one from the 3-prime UTR and four along the CDS. To pick the five candidates, the highest-scoring 3-prime UTR candidate, if available, is chosen first. Next, the top-scoring CDS candidate is chosen. A position penalty is then applied to all the other CDS candidates; the penalty is more severe the closer a candidate is to the first CDS candidate picked. After the position penalty is applied, all the CDS candidates are re-sorted by their newly calculated, position-weighted score. From the list of remaining CDS candidates, the highest-scoring candidate is chosen, and the position penalty is applied again to all the remaining candidates based on the CDS candidates already picked. This process is repeated until all the candidates are rescored. Finally, the top five position- and specificity-weighted candidates are chosen for oligo synthesis.
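The greedy pick-and-penalize loop described above might look roughly like the following Python sketch. Several details are assumptions rather than the documented method: the `Candidate` fields, the linear penalty shape with its `MAX_PENALTY` and `WINDOW` constants, recomputing the penalty from each candidate's original score on every round, and letting an extra CDS candidate fill the fifth slot when no 3-prime UTR candidate exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    region: str    # "UTR3" or "CDS" (hypothetical labels)
    position: int  # start of the 21mer on the transcript, in nt
    score: float   # specificity-weighted score


MAX_PENALTY = 5.0  # placeholder: penalty at distance 0
WINDOW = 200       # placeholder: distance (nt) beyond which no penalty applies


def position_penalty(position: int, picked_positions: list[int]) -> float:
    """Sum a linearly decaying penalty over all already-picked CDS positions."""
    return sum(
        MAX_PENALTY * max(0.0, 1.0 - abs(position - p) / WINDOW)
        for p in picked_positions
    )


def pick_spaced(candidates: list[Candidate], total: int = 5) -> list[Candidate]:
    """Pick the best 3' UTR candidate, then greedily pick spaced CDS candidates."""
    picked: list[Candidate] = []
    utr = [c for c in candidates if c.region == "UTR3"]
    if utr:
        picked.append(max(utr, key=lambda c: c.score))

    remaining = [c for c in candidates if c.region == "CDS"]
    picked_cds_positions: list[int] = []
    while remaining and len(picked) < total:
        # Re-rank the remaining CDS candidates by position-weighted score.
        best = max(
            remaining,
            key=lambda c: c.score - position_penalty(c.position, picked_cds_positions),
        )
        picked.append(best)
        picked_cds_positions.append(best.position)
        remaining.remove(best)
    return picked
```

With a penalty of this shape, a high-scoring candidate that sits a few bases from an already-picked one drops below a somewhat lower-scoring candidate farther along the CDS, which is the spreading behavior the text asks for.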