{"title":"UBE2I imputed & refined","methodText":"## Scoring procedure:\r\nDMS-BarSeq and DMS-TileSeq reads were processed using the [dmsPipeline](https://bitbucket.org/rothlabto/dmspipeline) software. Briefly, BarSeq read counts were used to establish relative frequencies of each strain at each timepoint and converted to estimates of absolute frequencies using OD measurement data. Absolute counts were used to establish growth curves from which fitness parameters were estimated and then normalized to a 0-1 scale where 0 corresponds to null controls and 1 corresponds to WT controls. Meanwhile, TileSeq read counts were used to establish relative allele frequencies in each condition. Non-mutagenized control counts were subtracted from counts (as estimates of sequencing error). Log ratios of selection over non-selection counts were calculated. The resulting TileSeq fitness values were then rescaled to the distribution of the BarSeq fitness scores. Fitness scores were joined using confidence-weighted averages. Random-Forest-based machine learning was used to impute missing values and refine low-confidence measurements, based on intrinsic, structural, and biochemical features.\r\n\r\nSee [**Weile *et al.* 2017**](http://msb.embopress.org/content/13/12/957) for more details.\r\n\r\n## Additional columns:\r\n* exp.score = experimental score from the joint DMS-BarSeq/DMS-TileSeq screens\r\n* exp.sd = standard deviation of the experimental score\r\n* df = degrees of freedom (number of replicates contributing to the experimental score)\r\n* pred.score = machine-learning predicted score","abstractText":"Although we now routinely sequence human genomes, we can confidently identify only a fraction of the sequence variants that have a functional impact. 
Here, we developed a deep mutational scanning framework that produces exhaustive maps for human missense variants by combining random codon mutagenesis and multiplexed functional variation assays with computational imputation and refinement. We applied this framework to four proteins corresponding to six human genes: UBE2I (encoding SUMO E2 conjugase), SUMO1 (small ubiquitin-like modifier), TPK1 (thiamin pyrophosphokinase), and CALM1/2/3 (three genes encoding the protein calmodulin). The resulting maps recapitulate known protein features and confidently identify pathogenic variation. Assays potentially amenable to deep mutational scanning are already available for 57% of human disease genes, suggesting that DMS could ultimately map functional variation for all human disease genes. \r\n\r\nSee [**Weile *et al.* 2017**](http://msb.embopress.org/content/13/12/957)","shortDescription":"A joint Deep Mutational Scan of the human SUMO E2 conjugase UBE2I using functional complementation in yeast, combining DMS-BarSeq and DMS-TileSeq data, followed by machine-learning-based imputation and refinement.","extraMetadata":{},"recordType":"ScoreSet","urn":"urn:mavedb:00000001-a-1","numVariants":3180,"license":{"longName":"CC0 (Public domain)","shortName":"CC0","active":true,"link":"https://creativecommons.org/publicdomain/zero/1.0/","version":"1.0","id":1,"recordType":"ShortLicense"},"metaAnalyzesScoreSetUrns":[],"metaAnalyzedByScoreSetUrns":[],"doiIdentifiers":[],"primaryPublicationIdentifiers":[{"identifier":"29269382","dbName":"PubMed","recordType":"PublicationIdentifier","title":"A framework for exhaustively mapping functional missense variants.","authors":[{"name":"Weile, Jochen","primary":true},{"name":"Sun, Song","primary":false},{"name":"Cote, Atina G","primary":false},{"name":"Knapp, Jennifer","primary":false},{"name":"Verby, Marta","primary":false},{"name":"Mellor, Joseph C","primary":false},{"name":"Wu, Yingzhou","primary":false},{"name":"Pons, 
Carles","primary":false},{"name":"Wong, Cassandra","primary":false},{"name":"van Lieshout, Natascha","primary":false},{"name":"Yang, Fan","primary":false},{"name":"Tasan, Murat","primary":false},{"name":"Tan, Guihong","primary":false},{"name":"Yang, Shan","primary":false},{"name":"Fowler, Douglas M","primary":false},{"name":"Nussbaum, Robert","primary":false},{"name":"Bloom, Jesse D","primary":false},{"name":"Vidal, Marc","primary":false},{"name":"Hill, David E","primary":false},{"name":"Aloy, Patrick","primary":false},{"name":"Roth, Frederick P","primary":false}],"abstract":"Although we now routinely sequence human genomes, we can confidently identify only a fraction of the sequence variants that have a functional impact. Here, we developed a deep mutational scanning framework that produces exhaustive maps for human missense variants by combining random codon mutagenesis and multiplexed functional variation assays with computational imputation and refinement. We applied this framework to four proteins corresponding to six human genes: UBE2I (encoding SUMO E2 conjugase), SUMO1 (small ubiquitin-like modifier), TPK1 (thiamin pyrophosphokinase), and CALM1/2/3 (three genes encoding the protein calmodulin). The resulting maps recapitulate known protein features and confidently identify pathogenic variation. Assays potentially amenable to deep mutational scanning are already available for 57% of human disease genes, suggesting that DMS could ultimately map functional variation for all human disease genes.","doi":"10.15252/msb.20177908","publicationYear":2017,"publicationJournal":"Mol Syst Biol","url":"http://www.ncbi.nlm.nih.gov/pubmed/29269382","referenceHtml":"Weile J, <i>et al</i>. A framework for exhaustively mapping functional missense variants. <i>Mol. Syst. Biol</i>. 
2017; <b>13</b>:957.","id":3}],"secondaryPublicationIdentifiers":[],"publishedDate":"2018-06-26","creationDate":"2018-06-26","modificationDate":"2024-11-26","createdBy":{"orcidId":"0000-0003-1628-9390","firstName":"Jochen","lastName":"Weile","recordType":"User"},"modifiedBy":{"orcidId":"0000-0003-1628-9390","firstName":"Jochen","lastName":"Weile","recordType":"User"},"targetGenes":[{"name":"UBE2I","category":"protein_coding","externalIdentifiers":[{"identifier":{"dbName":"Ensembl","identifier":"ENSG00000103275","recordType":"ExternalGeneIdentifier","url":"http://www.ensembl.org/id/ENSG00000103275"},"offset":0,"recordType":"ExternalGeneIdentifierOffset"},{"identifier":{"dbName":"RefSeq","identifier":"NM_003345","recordType":"ExternalGeneIdentifier","url":"http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=NM_003345"},"offset":159,"recordType":"ExternalGeneIdentifierOffset"},{"identifier":{"dbName":"UniProt","identifier":"P63279","recordType":"ExternalGeneIdentifier","url":"http://purl.uniprot.org/uniprot/P63279"},"offset":0,"recordType":"ExternalGeneIdentifierOffset"}],"id":5,"recordType":"TargetGene","targetSequence":{"sequenceType":"dna","sequence":"ATGTCGGGGATCGCCCTCAGCAGACTCGCCCAGGAGAGGAAAGCATGGAGGAAAGACCACCCATTTGGTTTCGTGGCTGTCCCAACAAAAAATCCCGATGGCACGATGAACCTCATGAACTGGGAGTGCGCCATTCCAGGAAAGAAAGGGACTCCGTGGGAAGGAGGCTTGTTTAAACTACGGATGCTTTTCAAAGATGATTATCCATCTTCGCCACCAAAATGTAAATTCGAACCACCATTATTTCACCCGAATGTGTACCCTTCGGGGACAGTGTGCCTGTCCATCTTAGAGGAGGACAAGGACTGGAGGCCAGCCATCACAATCAAACAGATCCTATTAGGAATACAGGAACTTCTAAATGAACCAAATATCCAAGACCCAGCTCAAGCAGAGGCCTACACGATTTACTGCCAAAACAGAGTGGAGTACGAGAAAAGGGTCCGAGCACAAGCCAAGAAGTTTGCGCCCTCATAA","recordType":"TargetSequence","taxonomy":{"code":9606,"organismName":"Homo 
sapiens","commonName":"human","rank":"SPECIES","hasDescribedSpeciesName":true,"articleReference":"NCBI:txid9606","genomeIdentifierId":5,"id":15,"recordType":"Taxonomy","url":"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=info&id=9606"}},"uniprotIdFromMappedMetadata":"P63279"}],"datasetColumns":{"scoreColumns":["score","sd","se","exp.score","exp.sd","df","pred.score"],"countColumns":[],"recordType":"DatasetColumns"},"externalLinks":{},"contributors":[],"scoreCalibrations":[],"experiment":{"title":"UBE2I yeast complementation","shortDescription":"A Deep Mutational Scan of the human SUMO E2 conjugase UBE2I using functional complementation in yeast.","abstractText":"Although we now routinely sequence human genomes, we can confidently identify only a fraction of the sequence variants that have a functional impact. Here, we developed a deep mutational scanning framework that produces exhaustive maps for human missense variants by combining random codon mutagenesis and multiplexed functional variation assays with computational imputation and refinement. We applied this framework to four proteins corresponding to six human genes: UBE2I (encoding SUMO E2 conjugase), SUMO1 (small ubiquitin-like modifier), TPK1 (thiamin pyrophosphokinase), and CALM1/2/3 (three genes encoding the protein calmodulin). The resulting maps recapitulate known protein features and confidently identify pathogenic variation. Assays potentially amenable to deep mutational scanning are already available for 57% of human disease genes, suggesting that DMS could ultimately map functional variation for all human disease genes.","methodText":"A Deep Mutational Scan of UBE2I using functional complementation in yeast was performed using two different methods: DMS-BarSeq and DMS-TileSeq. Both datasets were combined, and a machine-learning method was used to impute the effects of missing variants and refine measurements of lower confidence. 
See [**Weile *et al.* 2017**](http://msb.embopress.org/content/13/12/957) for details.","extraMetadata":{},"recordType":"Experiment","urn":"urn:mavedb:00000001-a","createdBy":{"orcidId":"0000-0003-1628-9390","firstName":"Jochen","lastName":"Weile","recordType":"User"},"modifiedBy":{"orcidId":"0000-0003-1628-9390","firstName":"Jochen","lastName":"Weile","recordType":"User"},"creationDate":"2018-06-26","modificationDate":"2019-08-08","publishedDate":"2018-06-26","experimentSetUrn":"urn:mavedb:00000001","doiIdentifiers":[],"primaryPublicationIdentifiers":[{"identifier":"29269382","dbName":"PubMed","recordType":"PublicationIdentifier","title":"A framework for exhaustively mapping functional missense variants.","authors":[{"name":"Weile, Jochen","primary":true},{"name":"Sun, Song","primary":false},{"name":"Cote, Atina G","primary":false},{"name":"Knapp, Jennifer","primary":false},{"name":"Verby, Marta","primary":false},{"name":"Mellor, Joseph C","primary":false},{"name":"Wu, Yingzhou","primary":false},{"name":"Pons, Carles","primary":false},{"name":"Wong, Cassandra","primary":false},{"name":"van Lieshout, Natascha","primary":false},{"name":"Yang, Fan","primary":false},{"name":"Tasan, Murat","primary":false},{"name":"Tan, Guihong","primary":false},{"name":"Yang, Shan","primary":false},{"name":"Fowler, Douglas M","primary":false},{"name":"Nussbaum, Robert","primary":false},{"name":"Bloom, Jesse D","primary":false},{"name":"Vidal, Marc","primary":false},{"name":"Hill, David E","primary":false},{"name":"Aloy, Patrick","primary":false},{"name":"Roth, Frederick P","primary":false}],"abstract":"Although we now routinely sequence human genomes, we can confidently identify only a fraction of the sequence variants that have a functional impact. 
Here, we developed a deep mutational scanning framework that produces exhaustive maps for human missense variants by combining random codon mutagenesis and multiplexed functional variation assays with computational imputation and refinement. We applied this framework to four proteins corresponding to six human genes: UBE2I (encoding SUMO E2 conjugase), SUMO1 (small ubiquitin-like modifier), TPK1 (thiamin pyrophosphokinase), and CALM1/2/3 (three genes encoding the protein calmodulin). The resulting maps recapitulate known protein features and confidently identify pathogenic variation. Assays potentially amenable to deep mutational scanning are already available for 57% of human disease genes, suggesting that DMS could ultimately map functional variation for all human disease genes.","doi":"10.15252/msb.20177908","publicationYear":2017,"publicationJournal":"Mol Syst Biol","url":"http://www.ncbi.nlm.nih.gov/pubmed/29269382","referenceHtml":"Weile J, <i>et al</i>. A framework for exhaustively mapping functional missense variants. <i>Mol. Syst. Biol</i>. 2017; <b>13</b>:957.","id":3}],"secondaryPublicationIdentifiers":[],"rawReadIdentifiers":[{"identifier":"SRP109101","id":2,"recordType":"RawReadIdentifier","url":"http://www.ebi.ac.uk/ena/data/view/SRP109101"},{"identifier":"SRP109119","id":3,"recordType":"RawReadIdentifier","url":"http://www.ebi.ac.uk/ena/data/view/SRP109119"}],"contributors":[],"keywords":[],"scoreSetUrns":["urn:mavedb:00000001-a-1","urn:mavedb:00000001-a-2","urn:mavedb:00000001-a-3","urn:mavedb:00000001-a-4"],"externalLinks":{},"numScoreSets":4,"officialCollections":[]},"officialCollections":[],"private":false,"processingState":"success","mappingState":"complete"}