Published: 2019-08-02 23:23 Original link: The TRC shRNA Design Method and Principles

Overview

We design shRNA molecules with an algorithm that uses several criteria to rank potential 21mer targets within each human and mouse RefSeq transcript. The algorithm applies a set of rules, including those derived from the siRNA literature, our cloning scheme, constraints on the synthesis of the oligonucleotides, and others. In applying the algorithm, our aim is to balance two competing goals: make hairpins that effectively knock down the target transcript and, as far as possible, design hairpins that knock down only one gene and not other, so-called 'off-target', genes.

Each goal presents distinct challenges. The criteria for predicting effective knockdown with either siRNA or shRNA are not well understood. Our rules are primarily derived from the siRNA literature; how well they apply to shRNA design is unclear. Genome evolution also constrains target specificity: many genes belong to extensive gene families, which may make targeting any one gene difficult, and functionally distinct genes share many motifs. Our knowledge of transcript structure and variants is also still very incomplete. For all these reasons and more, we construct 5 shRNAs for each transcript, expecting a range of knockdown efficiencies across the set and at least one or two that knock down the target effectively.

Users of this database should be aware that, to keep annotation consistent and reliable, the TRC consortium decided early on to use NCBI's RefSeq collection of transcripts as the definitive source of the primary target sequence for the design of shRNA molecules.

As a general rule, we construct shRNA molecules targeting only the first RefSeq transcript reported for each NCBI gene. Partly as a result of our design process (described below), the majority of the shRNAs nonetheless target all known transcript variants.

A brief narrative of the candidate selection process

Get the Candidate Sequences

For each human and mouse RefSeq transcript, we generate all 21mers, from those starting 25 bp after the beginning of the coding sequence (CDS) up to those starting 150 bp from the end of the transcript. Each 21mer is called a 'candidate'.
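The window described above can be sketched as a simple sliding enumeration. This is only an illustration, not the TRC code: the function name, the 0-based coordinates, and the reading of "starting 150 bp from the end" as a start position of `len(transcript) - 150` are all assumptions.

```python
def generate_candidates(transcript, cds_start, length=21,
                        cds_offset=25, end_margin=150):
    """Return (start, 21mer) pairs for one transcript.

    Hypothetical helper. Coordinates are 0-based; `cds_start` is the
    index of the first CDS base. The first candidate starts `cds_offset`
    bases after the CDS start, and the last one starts `end_margin`
    bases before the end of the transcript (an assumed interpretation).
    """
    first = cds_start + cds_offset
    last = len(transcript) - end_margin
    # Never run past the end of the sequence, even for odd margins.
    last = min(last, len(transcript) - length)
    return [(i, transcript[i:i + length]) for i in range(first, last + 1)]


# Toy 300-base transcript whose CDS starts at index 10.
toy = "ACGT" * 75
candidates = generate_candidates(toy, cds_start=10)
print(len(candidates), candidates[0])
```

A transcript that is too short for the window simply yields an empty list.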

Score the Candidate Sequences For Knockdown Efficiency

Each candidate is given an "original score" by applying a set of rules, each of which penalizes or rewards a feature related to predicted knockdown success or to clone design, and then calculating the product of all the penalties and rewards. The individual rules are listed below. The candidates are then sorted by score, and all those above a minimum score are stored.
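A minimal sketch of this multiplicative scoring follows. The two rules, their thresholds, and their factor values are hypothetical placeholders chosen only to show the mechanics; they are not the TRC rules.

```python
from math import prod


def gc_rule(seq):
    """Placeholder rule: penalize GC content outside an assumed range."""
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    return 1.0 if 0.25 <= gc <= 0.60 else 0.5


def homopolymer_rule(seq):
    """Placeholder rule: penalize any run of four identical bases."""
    return 0.1 if any(base * 4 in seq for base in "ACGT") else 1.0


def original_score(seq, rules):
    """Product of every rule's penalty (< 1) or reward (> 1) factor."""
    return prod(rule(seq) for rule in rules)


def select_above(seqs, rules, min_score):
    """Score all candidates, keep those above min_score, best first."""
    scored = [(original_score(s, rules), s) for s in seqs]
    kept = [pair for pair in scored if pair[0] > min_score]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


rules = [gc_rule, homopolymer_rule]
print(select_above(["ACGTACGTACGTACGTACGTA", "A" * 21], rules, min_score=0.5))
```

Because the factors are multiplied, a single severe penalty can sink a candidate regardless of how well it does on the other rules.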

Score the Candidate Sequences for Specificity

We are forced to balance the prediction of knockdown efficiency against the desire to minimize interaction with off-target genes, without a clear understanding of how to predict off-target "hits". We calculate a "specificity score" to promote candidates without obvious off-target transcripts. Each candidate is compared by BLASTN to two distinct abstractions of the transcriptome: the NCBI UniGene "unique" database (vaguely defined by NCBI as the "longest, best" sequence from each UniGene cluster), and the transcripts from RefSeq. We deem a 'miss' any sequence pair with at least three differences, of which at least two fall in the core positions 3-19, i.e., not at the ends of the 21mer target region. Any pair that is not a miss counts as a hit. We then determine whether each candidate hits a single UniGene cluster, a single LocusLink transcript, and a single LocusLink gene and, for genes with multiple transcripts, whether it hits all the transcripts in the gene. Using just the "hits-One-Unigene" and the "hits-One-NM" values, we assign a "specificity score" to each candidate: candidates that uniquely hit one UniGene cluster AND one LocusLink transcript are rewarded, those that uniquely hit one UniGene cluster OR one LocusLink transcript are rewarded, but less so, and those with neither UniGene nor LocusLink specificity are penalized. After determining and storing this "specificityScore", we re-sort the candidates.
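The hit/miss rule and the three-tier reward can be sketched as follows. This is an illustration under stated assumptions: the two sequences are compared as a gapless, equal-length alignment (real BLASTN output may contain gaps), positions are 1-based, and the numeric factors are placeholders because the text gives only their ordering, not their values.

```python
def is_miss(candidate, other, core=(3, 19)):
    """True if the pair counts as a 'miss'.

    Assumes both sequences are already aligned without gaps.
    A miss needs >= 3 differences, >= 2 of them in the core positions.
    """
    diffs = [pos for pos, (a, b) in enumerate(zip(candidate, other), start=1)
             if a != b]
    core_diffs = [pos for pos in diffs if core[0] <= pos <= core[1]]
    return len(diffs) >= 3 and len(core_diffs) >= 2


def specificity_score(hits_one_unigene, hits_one_nm,
                      both=1.2, either=1.1, neither=0.8):
    """Return a factor for the three tiers (placeholder values)."""
    if hits_one_unigene and hits_one_nm:
        return both      # unique in both databases: largest reward
    if hits_one_unigene or hits_one_nm:
        return either    # unique in only one: smaller reward
    return neither       # unique in neither: penalty


target = "A" * 21
ends_only = "CC" + "A" * 18 + "C"              # differences at 1, 2, 21
core_two = "AAAAC" + "AAAAC" + "A" * 10 + "C"  # differences at 5, 10, 21
print(is_miss(target, ends_only), is_miss(target, core_two))
```

Note that three differences confined to the ends of the 21mer still count as a hit under this rule, since fewer than two of them fall in positions 3-19.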

Spacing the candidate 21mers along the transcript

Since we synthesize 5 oligo pairs for each transcript, and since we hypothesize that the secondary structure of the target transcript plays a role in the effectiveness of an shRNA, we want the candidates spread out along the transcript, with one from the 3' UTR and four along the CDS. To pick the five candidates, the highest-scoring 3' UTR candidate, if available, is chosen first. Next, the top-scoring CDS candidate is chosen. A position penalty is then applied to all the other CDS candidates; the penalty is more severe the closer a candidate is to the first CDS candidate picked. After the position penalty is applied, all the CDS candidates are re-sorted by their newly calculated, position-weighted score. From the list of remaining CDS candidates, the highest-scoring candidate is chosen, and the position penalty is applied again to all the remaining candidates based on the CDS candidates already picked. This process is repeated until all the candidates are rescored. Finally, the top 5 position- and specificity-weighted candidates are chosen for oligo synthesis.
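The greedy, penalty-then-repick loop can be sketched as below. The linear penalty function, its `scale` parameter, the `Candidate` fields, and the choice to fill all five slots from the CDS when no 3' UTR candidate exists are assumptions made for illustration; the text does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    start: int     # position of the 21mer on the transcript
    score: float   # specificity-weighted score before spacing
    region: str    # "CDS" or "3UTR"


def position_penalty(cand, picked, scale=200):
    """Hypothetical penalty: factor shrinks as cand nears a picked site."""
    factor = 1.0
    for p in picked:
        factor *= min(1.0, abs(cand.start - p.start) / scale)
    return factor


def pick_spaced(candidates, total=5, scale=200):
    """Pick the best 3' UTR candidate, then greedily pick spaced CDS ones."""
    utr = [c for c in candidates if c.region == "3UTR"]
    remaining = [c for c in candidates if c.region == "CDS"]
    chosen = [max(utr, key=lambda c: c.score)] if utr else []
    picked_cds = []
    while remaining and len(chosen) < total:
        # Rescore every remaining CDS candidate against all CDS picks so far.
        best = max(remaining,
                   key=lambda c: c.score * position_penalty(c, picked_cds, scale))
        chosen.append(best)
        picked_cds.append(best)
        remaining.remove(best)
    return chosen


pool = [Candidate(100, 10.0, "CDS"), Candidate(110, 9.5, "CDS"),
        Candidate(500, 8.0, "CDS"), Candidate(900, 7.0, "CDS"),
        Candidate(1300, 6.0, "CDS"), Candidate(2000, 5.0, "3UTR"),
        Candidate(2100, 4.0, "3UTR")]
print([c.start for c in pick_spaced(pool)])
```

In the toy pool, the candidate at position 110 has the second-highest raw score but sits 10 bases from the first pick, so its weighted score collapses and more distant candidates are chosen instead.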

