Split and pool synthesis

The split and pool (split-mix) synthesis is a method in combinatorial chemistry that can be used to prepare combinatorial compound libraries. It is a stepwise, highly efficient process realized in repeated cycles. The procedure makes it possible to prepare millions or even trillions of compounds as mixtures that can be used in drug research.

History
In traditional methods, most organic compounds are synthesized one at a time from building blocks that are coupled one after the other in a stepwise manner. Before 1982 nobody was even considering making hundreds or thousands of compounds in a single process, let alone millions or trillions. The productivity of the split and pool method, invented in 1982 by Prof. Á. Furka (Eötvös Loránd University, Budapest, Hungary), therefore seemed incredible at first sight. The method was described in a document notarized in the same year. The document is written in Hungarian and has been translated into English. The motivations that led to the invention are described in a 2002 paper. The method was first presented at international congresses in 1988 and then published in print in 1991.

The split and pool synthesis and its features
The split and pool synthesis (S&P synthesis) differs from traditional synthetic methods. Its important novelty is the use of compound mixtures in the process, which is the reason for its unprecedentedly high productivity. Using the method, a single chemist can make more compounds in a week than all chemists have produced in the whole history of chemistry. The S&P synthesis is carried out stepwise by repeating three operations in each step of the process:


 * Dividing a compound mixture into equal portions
 * Coupling one different building block (BB) to each portion
 * Pooling and thoroughly mixing the portions

The original method is based on the solid-phase synthesis of Merrifield. The procedure is illustrated in the figure by a flow diagram of a two-cycle synthesis that uses the same three BBs in both cycles. Choosing the solid-phase method for the S&P synthesis is reasonable, since otherwise removing the by-products from the mixture of compounds would be very difficult.
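The three operations listed above can be sketched in a few lines of Python. This is a minimal illustration that tracks only compound identities, not beads or quantities; the function name `split_and_pool` and the single-letter BB labels are assumptions made for the example.

```python
def split_and_pool(bbs_per_cycle):
    """Return the set of compounds formed, each as a tuple of BB labels."""
    pool = {()}  # bare support: one "compound" with no BBs attached yet
    for bbs in bbs_per_cycle:
        # 1. Split: every portion is an equal sample of the whole mixture,
        #    so each portion contains every component of the pool.
        portions = [set(pool) for _ in bbs]
        # 2. Couple: exactly one BB is coupled to each portion.
        coupled = [
            {compound + (bb,) for compound in portion}
            for portion, bb in zip(portions, bbs)
        ]
        # 3. Pool: recombine the portions into a single mixture.
        pool = set().union(*coupled)
    return pool


# Two cycles with the same three BBs, as in the two-cycle scheme described above.
library = split_and_pool([["A", "B", "C"], ["A", "B", "C"]])
print(len(library))  # 9
```

Six couplings (three per cycle) are enough to form all nine dimers.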

Efficiency
High efficiency is the most important feature of the method. In a multistep synthesis of n steps that uses the same number of BBs (k) in every step, the number of components in the resulting combinatorial library (N) is:

$N = k^{n}$

This means that the number of components increases exponentially with the number of steps (cycles), while the number of required couplings increases only linearly. If a different number of BBs is used in each cycle ($k_1, k_2, k_3, \dots, k_n$), the number of components formed is:

$N = k_1 \cdot k_2 \cdot k_3 \cdots k_n$

This feature of the procedure makes it possible to synthesize a practically unlimited number of compounds. For example, if 1000 BBs are used in each of four cycles, 1 trillion compounds are expected to form, while only 4000 couplings are needed.
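The two formulas can be checked numerically. The following is a small sketch, with the helper names `library_size` and `coupling_count` chosen here for illustration: the library size is the product of the BB counts of the cycles, while the number of couplings is their sum.

```python
from math import prod


def library_size(bbs_per_cycle):
    """Number of components: the product of the BB counts of all cycles."""
    return prod(bbs_per_cycle)


def coupling_count(bbs_per_cycle):
    """Number of couplings: one per BB in every cycle, so the sum."""
    return sum(bbs_per_cycle)


cycles = [1000, 1000, 1000, 1000]  # 1000 BBs in each of four cycles
print(library_size(cycles))    # 1000000000000
print(coupling_count(cycles))  # 4000
```

The same functions accept unequal BB counts, for example `library_size([20, 15, 10])`.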

The reason of the high efficiency
The extraordinary efficiency is explained by the use of mixtures in the synthetic steps. In a traditional reaction, one compound is coupled with one reactant and one new compound is formed. If a mixture containing n components is coupled with a single reactant, n new compounds are formed in that single coupling. The difference between the traditional and the split and pool synthesis is convincingly shown by the number of coupling steps each needs to make 3.2 million pentapeptides:

 * Conventional synthesis: 3,200,000 × 5 = 16,000,000 coupling steps (ca. 40,000 years)
 * S&P synthesis: 20 × 5 = 100 coupling steps (ca. 5 days)

The conventional synthesis can also be conducted in a more rational way, as shown in the figure, in which each shorter peptide is prepared only once and then extended. In this case, the number of coupling steps is:

20 + 400 + 8,000 + 160,000 + 3,200,000 = 3,368,420 coupling steps (ca. 9,200 years)
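The three step counts above can be reproduced with a short calculation. This is a sketch: the function names are illustrative, and the time estimate assumes one coupling step per day, which is not stated in the text but roughly matches the quoted figures.

```python
def one_by_one_steps(k, n):
    """Each of the k**n compounds is built separately in n couplings."""
    return k ** n * n


def shared_intermediate_steps(k, n):
    """Every peptide of length 1..n is made once and then extended."""
    return sum(k ** length for length in range(1, n + 1))


def split_and_pool_steps(k, n):
    """One coupling per BB in each of the n cycles."""
    return k * n


DAYS_PER_YEAR = 365  # assumption: one coupling step per day


def years_at_one_step_per_day(steps):
    return steps / DAYS_PER_YEAR


k, n = 20, 5  # 20 BBs, pentapeptides
print(one_by_one_steps(k, n))           # 16000000
print(shared_intermediate_steps(k, n))  # 3368420
print(split_and_pool_steps(k, n))       # 100
print(round(years_at_one_step_per_day(shared_intermediate_steps(k, n))))
```

In the S&P case the 20 couplings of a cycle run side by side in separate vessels, which is why 100 couplings fit into about five days.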

The theoretical upper limit of the number of components
It is often stated that the split and pool method makes it possible to synthesize an unlimited number of compounds. In fact, the theoretical maximum number of components depends on the quantity of the library expressed in moles. If, for example, a 1 mol library is synthesized, the maximum number of components is equal to the Avogadro number:

$6.02214076 \times 10^{23}$

In such a library each component would be represented by a single molecule.
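The same reasoning applies to any library quantity. The sketch below, with illustrative function names, computes the upper limit for a given number of moles and the average number of molecules that represent each component of a library of a chosen size.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def max_components(library_moles):
    """Upper limit: one molecule per component."""
    return library_moles * AVOGADRO


def molecules_per_component(library_moles, n_components):
    """Average number of molecules representing each component."""
    return max_components(library_moles) / n_components


# Example: a library of 1 trillion components prepared on a 1 nmol scale.
print(molecules_per_component(1e-9, 1e12))
```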

Components of the library form in equal molar quantities
As far as the chemistry of the couplings allows, the components of the libraries form in nearly equal molar quantities. This is achieved by dividing the mixtures into equal portions and by homogenizing the pooled portions through thorough mixing. Equal molar quantities of the components are very important for the applicability of the library: the presence of compounds in unequal quantities may lead to difficulties in evaluating screening results. The solid-phase method makes it possible to use the reagents in excess to drive the reactions close to completion, since the surplus can easily be removed by filtration.

The possibility of using two mixtures in the synthesis
In principle, using two mixtures in the synthesis (coupling a mixture of BBs to the mixture of compounds instead of splitting it) can lead to the same combinatorial library that forms in the usual S&P method. Differences in the reactivity of the BBs, however, bring about large differences in the concentrations of the components, and these differences are expected to increase after each step. Although a considerable amount of labor could be saved by the two-mixtures approach when a large number of BBs is coupled in each position, it is advisable to stick to the normal S&P procedure.

The presence of all structural varieties in the library
The formation of all structural variants that can be deduced from the BBs is an important feature of the S&P synthesis. Only the S&P method can achieve this in a single process. Conversely, the presence of all possible structural variants in a library confirms that the library is a combinatorial one prepared by combinatorial synthesis.

Forming of one compound in the beads
Because a single BB is used in each coupling, a single compound forms on each bead. The formation of one-bead one-compound (OBOC) libraries is therefore an inherent property of the S&P synthesis. The reason is explained in the figure: the structure of the compound formed on a bead depends on the reaction vessels that the bead happens to pass through along the synthetic route. It is up to the chemist whether to use the library in the tethered (OBOC) form or to cleave the compounds from the beads and use them in solution.
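A bead-level simulation makes the one-bead one-compound property visible. This is a hedged sketch rather than a model of real resin handling: beads are represented as tuples of BB labels, mixing is a random shuffle, and the function name and the fixed seed are assumptions of the example.

```python
import random


def split_and_pool_beads(n_beads, bbs_per_cycle, seed=0):
    """Follow individual beads; each bead records the BBs coupled to it."""
    rng = random.Random(seed)
    beads = [() for _ in range(n_beads)]
    for bbs in bbs_per_cycle:
        rng.shuffle(beads)  # pooling and thorough mixing
        k = len(bbs)
        portions = [beads[i::k] for i in range(k)]  # split into k portions
        # Each portion sits in one vessel and meets exactly one BB there.
        beads = [
            bead + (bb,)
            for portion, bb in zip(portions, bbs)
            for bead in portion
        ]
    return beads


beads = split_and_pool_beads(900, [["A", "B", "C"], ["A", "B", "C"]])
print(len(beads))           # 900
print(len(set(beads)) <= 9)  # True: at most nine distinct compounds
```

Every bead ends up with exactly one sequence, determined only by the vessels it happened to visit.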

Realization of the split and pool synthesis
The split and pool synthesis was first applied to prepare peptide libraries on solid support. The synthesis was carried out in a home-made manual device shown in the figure. The device has a tube with 20 holes to which reaction vessels can be attached. One end of the tube is connected to a waste container and a water pump. The left side of the figure shows the loading and filtering position, the right side the coupling-shaking position. In the early years of combinatorial chemistry, an automatic machine was constructed and commercialized at AdvancedChemTech (Louisville KY, USA). All operations of the S&P synthesis are carried out automatically under computer control. At present, the Titan 357 automatic synthesizer is available at aapptec (Louisville KY, USA).

Encoded split and pool synthesis
Although a single compound forms on each bead in the S&P synthesis, its structure is not known. For this reason, encoding methods were introduced to help determine the identity of the compound carried by a selected bead. Encoding molecules are coupled to the beads in parallel with the coupling of the BBs. The structure of the encoding molecule has to be easier to determine than that of the library member on the bead. Ohlmeyer et al. published a binary encoding method. They used mixtures of 18 tagging molecules that, after being cleaved from the beads, could be identified by electron capture gas chromatography. Nikolajev et al. applied peptide sequences for encoding. Sarkar et al. described chiral oligomers of pentenoic amides (COPAs) that can be used to construct mass-encoded OBOC libraries. Kerr et al. introduced an innovative kind of encoding: an orthogonally protected, removable bifunctional linker was attached to the beads. One end of the linker was used to attach the non-natural BBs of the library, while the encoding amino acid triplets were linked to the other end. One of the earliest and most successful encoding methods was introduced by Brenner and Lerner in 1992. They proposed attaching DNA oligomers to the beads to encode their content. The method was implemented by Nielsen, Brenner, and Janda, using the bifunctional linker of Kerr et al. to attach the encoding DNA oligomers. This made it possible to cleave the compound from the bead with the encoding DNA oligomer attached to it.
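The general idea of a binary code built from tag mixtures can be sketched as follows. This illustrates binary encoding in general and is not the scheme of Ohlmeyer et al.: the tag names, the 1-based BB numbering, and the way tags are allotted to one coupling step are all assumptions of the example. Each BB of a step is identified by which tags are present or absent, so t tags can distinguish up to 2^t − 1 BBs.

```python
def tags_for_building_block(bb_index, tags):
    """Return the set of tags that encodes BB number bb_index (1-based)."""
    if not 1 <= bb_index < 2 ** len(tags):
        raise ValueError("not enough tags to encode this BB index")
    # Tag number `bit` is present when that bit of bb_index is 1.
    return {tag for bit, tag in enumerate(tags) if (bb_index >> bit) & 1}


def building_block_from_tags(found_tags, tags):
    """Decode the BB number from the tags detected on a bead."""
    return sum(1 << bit for bit, tag in enumerate(tags) if tag in found_tags)


step_tags = ["T1", "T2", "T3"]  # hypothetical tags reserved for one step
code = tags_for_building_block(5, step_tags)
print(sorted(code))                                 # ['T1', 'T3']
print(building_block_from_tags(code, step_tags))    # 5
```

Reading the tags cleaved from a selected bead then gives, step by step, the BBs that were coupled to it.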

Split and pool synthesis in solution
Han et al. described a method that combines the high efficiency of the S&P synthesis with the advantages of a homogeneous medium for the chemical reactions. In their method, polyethylene glycol (PEG) was used as a soluble support in the S&P synthesis of peptide libraries.

MeO-CH2-CH2-O-(CH2-CH2-O)n-CH2-CH2-OH

PEG proved suitable for this purpose since it is soluble in a wide variety of aqueous and organic solvents, and its solubility provides homogeneous reaction conditions even when the attached molecule itself is insoluble in the reaction medium. The polymer and the synthesized compounds bound to it can be separated from the solution by precipitation and filtration. The precipitation requires concentrating the reaction solution and then diluting it with diethyl ether or tert-butyl methyl ether. Under carefully controlled precipitation conditions, the polymer with the bound products precipitates in crystalline form and the unwanted reagents remain in solution.
In solid-phase S&P synthesis, a single compound forms on each bead and, as a consequence, the number of compounds cannot exceed the number of beads. The theoretical maximum number of compounds therefore depends on the quantity of the solid support and the size of the beads. On 1 g of polystyrene resin, for example, a maximum of 2 million compounds can be synthesized if the diameter of the resin beads is 90 μm, and 2 billion if the bead size is 10 μm. In practice, the solid support is used in excess (often tenfold) to be sure that all expected components are formed. This limitation is completely removed if the solid support is omitted and the synthesis is carried out in solution. In this case, there is no upper limit on the number of components of the library. Both the number of components and the quantity of the library can be decided freely, based only on practical considerations.
An important modification was introduced in the synthesis of DNA-encoded combinatorial libraries by Harbury and Halpin. In their case, the solid support is replaced by the encoding DNA oligomers. This makes it possible to synthesize libraries containing even trillions of components and to screen them using affinity binding methods.
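The bead-count limit quoted above for 1 g of resin can be estimated from simple geometry. The sketch below is a rough estimate under stated assumptions: beads are treated as solid spheres of uniform size, and the density of 1.05 g/cm³ is an assumed typical value for polystyrene that does not come from the text. The function names are illustrative.

```python
from math import pi

POLYSTYRENE_DENSITY_G_PER_CM3 = 1.05  # assumed typical value


def beads_per_gram(diameter_um, density=POLYSTYRENE_DENSITY_G_PER_CM3):
    """Approximate number of spherical beads in 1 g of resin."""
    radius_cm = diameter_um * 1e-4 / 2       # 1 um = 1e-4 cm
    volume_cm3 = 4.0 / 3.0 * pi * radius_cm ** 3
    return 1.0 / (volume_cm3 * density)


def practical_library_size(n_beads, excess=10):
    """Library size if the support is used in the given excess."""
    return n_beads / excess


for diameter in (90, 10):
    n = beads_per_gram(diameter)
    print(diameter, f"{n:.2e}", f"{practical_library_size(n):.2e}")
```

Under these assumptions the estimates come out in the range of a few million beads for 90 μm and a few billion for 10 μm, the same order of magnitude as the figures quoted in the text.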
A different way of carrying out solution-phase S&P synthesis is to apply scavenger resins to remove the by-products. Scavenger resins are polymers with functional groups that react with and bind the excess reagents, which can then be filtered out of the reaction mixture. Two examples: a resin containing primary amino groups can remove excess acyl chlorides from reaction mixtures, while an acyl chloride resin removes amines. A fluorous technology was described by Curran. Fluorous synthesis employs functionalized perfluoroalkyl (Rf) groups, like the 4,4,5,5,6,6,7,7,8,8,9,9,9-Tridecafluorononyl {CF3(CF2)4CF2CH2CH2-} group, attached to substrates or reagents. The Rf groups make it possible to remove either the product or the reagents from the reaction mixture. By attaching Rf groups to the substrate, the synthesis can be carried out in solution and the product can be separated from the reaction mixture by liquid extraction using a fluorous solvent such as perfluoromethylcyclohexane or perfluorohexane. At the end of the procedure, the Rf groups attached to the substrate can be removed from the product. The function of the Rf groups in the synthesis is thus similar to that of the solid or soluble support. If the Rf tag is attached to a reagent, the excess reagent can be removed from the reaction mixture by extraction. Polymer-supported reagents are also used in S&P synthesis.

Self-assembling DNA encoded libraries
One of the best examples of the special features made possible by DNA encoding is the synthesis of the self-assembling library introduced by Melkko et al. First, two sublibraries are synthesized. In one sublibrary, the BBs are attached to the 5’ end of an oligonucleotide containing a dimerization domain followed by the codes of the BBs. In the other sublibrary, the BBs are attached to the 3’ end of oligonucleotides that also contain a dimerization domain and the codes of another set of BBs. The two sublibraries are mixed in equimolar quantities, heated to 70 °C, and then allowed to cool to room temperature, whereupon they heterodimerize and form the self-assembling combinatorial library. One member of such a two-pharmacophore library is shown in the figure. In affinity screening, the two BBs of the pharmacophore may interact with two adjacent binding sites of the target protein.

DNA templated libraries
In the synthesis of DNA-templated combinatorial libraries, introduced by Gartner et al., the ability of the DNA double helix to direct region-specific chemical reactions is harnessed. The DNA-linked reagents are kept in close proximity, which is equivalent to a virtual increase in local concentration that is nearly constant within a distance of 30 nucleotides. This proximity effect helps the reactions to proceed. Two kinds of libraries are synthesized: a template library and two reagent libraries. Each member of the template library contains at one end one of the BBs and its code, followed by two annealing regions for the codes of the BBs of the two reagent libraries. Each of the two reagent libraries contains a coding oligonucleotide linked by a cleavable bond to the reagent (BB), which can form a bond with the already linked BB by taking advantage of the proximity effect. The synthesis is realized in two steps, as shown in the figure. Each step has three operations: mixing, annealing, and coupling-cleaving.

Synthesis in Yoctoreactor
The yoctoreactor method introduced by Hansen et al. is based on the geometry and stability of a three-dimensional DNA structure that creates a yoctoliter ($10^{-24}$ L) size chemical reactor in which the proximity of the BBs brings about reactions among them. The DNA oligomers comprise the DNA barcode for the attached BBs and form the structural elements of the reactor. One kind of yoctoreactor format is shown in the figure.

Sequence encoded routing
Harbury and Halpin developed DNA template libraries that, like genes, direct the synthesis of DNA-encoded organic libraries. The members of the template combinatorial library contain the codes of all BBs and the order of their coupling. The figure shows one member of a simple ssDNA template library (A) containing the codes of three BBs (2, 4, 6) that are to be attached successively. The coding regions are separated by the same non-coding regions (1, 3, 5, 7) in all members. The sequence-directed procedure uses a series of columns of resin beads, each coated with the anticodon of one of the BBs (B). When the template library is transferred to an anticodon column, the matching template members are captured by hybridization and then coupled with the appropriate BB. After all anticodon columns of a coupling position (CP) have been processed, the libraries are eluted from the beads of the anticodon columns and mixed, and the operations are repeated with the series of anticodon columns of the next CP. Panel C of the figure shows one member of the template library captured by the “yellow” anticodon column of the second CP. The template carries the “red” BB already coupled in CP1 and the “yellow” BB attached after its capture. The final library contains all of the synthesized organic compounds attached to their encoding DNA oligomers.

Stepwise coupling and coding
One of the most forward-looking methods commonly used for DNA encoding is applied in the synthesis of single-pharmacophore libraries. As the figure shows, the library is built by repeating the usual cycles of S&P synthesis, with the second operation of the cycle modified: in addition to coupling with the BBs, the encoding DNA oligomer is elongated by ligating the code of the BB to it.

Synthesis using macroscopic units of solid support
Modifications have been developed that enable the split and pool synthesis to produce known compounds in larger quantities than the content of a single bead of solid support while retaining the high efficiency of the original method. As published by Moran et al. and Nicolaou et al., the resin normally used in solid-phase synthesis was enclosed in permeable capsules together with a radiofrequency label that records the BBs in the order of their coupling. Both a manual and an automatic device were constructed to sort the capsules into the appropriate reaction vessels. A different kind of labeled macroscopic solid support unit was introduced by Xiao et al. The supports are 1 × 1 cm polystyrene-grafted square plates. The medium carrying the code is a 3 × 3 mm ceramic plate in the center of the synthesis support. The code is etched into the ceramic by a CO2 laser in the form of a two-dimensional bar code that can be read by a special scanner.

String synthesis
The string synthesis introduced by Furka et al. uses stringed macroscopic solid support units (crowns), and the units are identified by the position they occupy on the string. One string is assigned to every building block in the synthesis. In the coupling stage, each string is placed in the proper reaction vessel. The units are not pooled. Instead, the content of the strings coming out of a synthetic step must be redistributed onto the strings of the next step. The redistribution demonstrated in the figure follows the combinatorial distribution rule: all products formed in a synthetic step are equally divided among all reaction vessels of the next synthetic step. Different distribution formats can be followed; each allows the content of every crown to be identified from its position on the new string and the destination reaction vessel of that string.
The stringed crowns and the trays used in manual sorting are shown in the figure. The destination tray is moved step by step in the direction of the arrow. The crowns are transferred in groups from the slots of the source tray into all the opposite slots of the destination tray. The transfers are directed by computer, and the products are identified by the positions the crowns occupy on the final strings. A fast automatic sorter has also been described and is outlined in the figure. It has two sets of aligned tubes. The lower ones move step by step in the direction shown by the arrow, and the coin-like units are dropped from the upper source tubes into the lower destination tubes. The tubes may serve as reaction vessels too. Software has also been developed that can direct the sorting when only a selected set of components, picked out from the full combinatorial library, is prepared instead of the full library.
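The combinatorial distribution rule described above can be illustrated with a position-based sketch. The distribution format used here (the unit at position p of a source string goes to destination string p mod k) is one hypothetical choice made for the example and is not claimed to be a format used by Furka et al. Each unit is represented by the tuple of BBs coupled to it so far.

```python
from math import prod


def string_synthesis(bbs_per_step):
    """Return the final strings; each unit is a tuple of the BBs it received."""
    n_units = prod(len(bbs) for bbs in bbs_per_step)
    first = bbs_per_step[0]
    # One string per BB of the first step, all with the same number of units.
    strings = [[() for _ in range(n_units // len(first))] for _ in first]
    for step, bbs in enumerate(bbs_per_step):
        if step > 0:
            # Redistribute without pooling: the units of every source string
            # are divided equally among the destination strings, in order.
            new_strings = [[] for _ in bbs]
            for string in strings:
                for position, unit in enumerate(string):
                    new_strings[position % len(bbs)].append(unit)
            strings = new_strings
        # Couple: each string is placed in the vessel of its own BB.
        strings = [
            [unit + (bb,) for unit in string]
            for string, bb in zip(strings, bbs)
        ]
    return strings


final = string_synthesis([["A", "B", "C"]] * 3)
units = [unit for string in final for unit in string]
print(len(units), len(set(units)))  # 27 27
print(final[0][0])                  # ('A', 'A', 'A')
```

Because the redistribution is deterministic, the content of every unit follows from its position on the final string, and each of the 27 possible trimers is made exactly once.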