Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning

Authors:
Koblan LW, Arbab M, Shen MW, Hussmann JA, Anzalone AV, Doman JL, Newby GA, Yang D, Mok B, Replogle JM, Xu A, Sisley TA, Weissman JS, Adamson B, Liu DR.
Source: Nat Biotechnol
Publication Date: (2021)
Issue: 39(11): 1414-1425
Cells used in publication:
K-562
Species: human
Tissue Origin: blood
U-2 OS
Species: human
Tissue Origin: bone
HAP-1
Species: human
Tissue Origin: blood
Platform:
4D-Nucleofector® X-Unit
Experiment

Nucleofection was performed on K562, HeLa, and U2OS cells as previously described (ref. 38). For each sample, 750 ng of base editor-expression plasmid and 250 ng of sgRNA-expression plasmid were nucleofected in a final volume of 20 µL in a 16-well Nucleocuvette strip (Lonza). K562 cells were nucleofected using the SF Cell Line 4D-Nucleofector X Kit (Lonza) with 5×10^5 cells per sample (program FF-120), according to the manufacturer’s protocol. U2OS cells were nucleofected using the SE Cell Line 4D-Nucleofector X Kit (Lonza) with 3–4×10^5 cells per sample (program DN-100), according to the manufacturer’s protocol. HeLa cells were nucleofected using the SE Cell Line 4D-Nucleofector X Kit (Lonza) with 2×10^5 cells per sample (program CN-114), according to the manufacturer’s protocol. Nucleofection of HAP1 cells was performed using the same amounts of DNA and the same final volume in a 16-well Nucleocuvette strip; however, HAP1 cells were nucleofected using the SE Cell Line 4D-Nucleofector X Kit (Lonza) with 4×10^5 cells per sample (program DZ-113), according to the manufacturer’s protocol.
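The per-cell-line conditions above can be collected into a small lookup table, which may help when setting up a plate or checking DNA amounts. The following Python sketch is illustrative only: the class name, field names, and helper functions are hypothetical and not part of the publication, and the DNA concentration assumes that the 20 µL final volume is the total reaction volume per well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NucleofectionCondition:
    """One row of the protocol: kit, cell-number range, and pulse program."""
    kit: str
    cells_min: int
    cells_max: int
    program: str


# DNA amounts and reaction volume stated in the protocol (same for all cell lines).
EDITOR_PLASMID_NG = 750
SGRNA_PLASMID_NG = 250
FINAL_VOLUME_UL = 20

SF_KIT = "SF Cell Line 4D-Nucleofector X Kit"
SE_KIT = "SE Cell Line 4D-Nucleofector X Kit"

# Values transcribed from the text; dictionary keys are illustrative labels.
CONDITIONS = {
    "K562": NucleofectionCondition(SF_KIT, 500_000, 500_000, "FF-120"),
    "U2OS": NucleofectionCondition(SE_KIT, 300_000, 400_000, "DN-100"),
    "HeLa": NucleofectionCondition(SE_KIT, 200_000, 200_000, "CN-114"),
    "HAP1": NucleofectionCondition(SE_KIT, 400_000, 400_000, "DZ-113"),
}


def total_dna_ng() -> int:
    """Total plasmid DNA per sample (editor plus sgRNA plasmid)."""
    return EDITOR_PLASMID_NG + SGRNA_PLASMID_NG


def dna_ng_per_ul() -> float:
    """DNA concentration, assuming 20 µL is the total reaction volume."""
    return total_dna_ng() / FINAL_VOLUME_UL


def describe(cell_line: str) -> str:
    """Return a one-line summary of the conditions for a cell line."""
    c = CONDITIONS[cell_line]
    if c.cells_min == c.cells_max:
        cells = f"{c.cells_min:,}"
    else:
        cells = f"{c.cells_min:,}-{c.cells_max:,}"
    return f"{cell_line}: {c.kit}, {cells} cells, program {c.program}"


if __name__ == "__main__":
    for name in CONDITIONS:
        print(describe(name))
    print(f"DNA per sample: {total_dna_ng()} ng ({dna_ng_per_ul():.0f} ng/µL)")
```

Under these assumptions each sample receives 1,000 ng of plasmid DNA in total, at a 3:1 mass ratio of editor plasmid to sgRNA plasmid.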

Abstract

Programmable C•G-to-G•C base editors (CGBEs) have broad scientific and therapeutic potential, but their editing outcomes have proved difficult to predict and their editing efficiency and product purity are often low. We describe a suite of engineered CGBEs paired with machine learning models to enable efficient, high-purity C•G-to-G•C base editing. We performed a CRISPR interference (CRISPRi) screen targeting DNA repair genes to identify factors that affect C•G-to-G•C editing outcomes and used these insights to develop CGBEs with diverse editing profiles. We characterized ten promising CGBEs on a library of 10,638 genomically integrated target sites in mammalian cells and trained machine learning models that accurately predict the purity and yield of editing outcomes (R = 0.90) using these data. These CGBEs enable correction to the wild-type coding sequence of 546 disease-related transversion single-nucleotide variants (SNVs) with >90% precision (mean 96%) and up to 70% efficiency (mean 14%). Computational prediction of optimal CGBE-single-guide RNA pairs enables high-purity transversion base editing at over fourfold more target sites than achieved using any single CGBE variant.