From protein binding to pharmacokinetics: a novel approach to active drug absorption prediction
arXiv [q-bio.QM], Oct

P.O. Fedichev, T.V. Kolesnikova, and A.A. Vinnik
Quantum Pharmaceuticals, Ul. Kosmonavta Volkova 6-606, 125171, Moscow, Russia

Due to its inherent complexity, active transport presents a landmark hurdle for the prediction of oral absorption properties. We present a novel approach to the calculation of carrier-mediated drug absorption parameters, based on an entirely different paradigm than QSPR. We capitalize on recently emerged ideas that molecule activities against a large protein set can be used for prediction of biological effects, and we performed large-scale numerical docking of drug-like compounds to a large, diversified set of proteins. As a result, we identified for the first time a protein, binding to which correlates well with the intestinal permeability of many actively absorbed compounds. Although the protein is not a transporter, we speculate that its binding-site force field is similar to that of an important intestinal transporter. The observation helped us to improve the passive absorption model by adding a non-linear flux associated with the transporting protein, yielding a quantitative model of active transport. This study demonstrates that binding data for a sufficiently representative set of proteins can serve as a basis for active absorption prediction for a given compound.

I. INTRODUCTION

The oral route of drug administration is very convenient for patients; however, it is often inefficient due to low solubility, low intestinal permeability, or a high first-pass effect. Therefore, prediction of oral absorption properties is of great interest for the pharmaceutical industry. Orally administered drugs are mainly absorbed in the small intestine. Here, depending on drug composition and size, absorption can happen through a variety of processes [35].
A drug passes through the epithelial cells and the lamina propria from the lumen into the blood stream in the capillaries. On its way it may be metabolized, transported away from the tract where absorption is possible, or accumulate in organs other than those targeted by the treatment. Besides a fundamental interest in understanding the basic mechanisms by which a drug is assimilated by the human body, the kinetics of drug absorption is also a topic of much practical interest. A detailed knowledge of this process, resulting in the prediction of the drug absorption profile, can be of much help in the drug development stage [19].

A number of kinetic absorption models have been developed that require the experimentally determined intestinal permeability of a compound as an input [74]. Although of great value, such "hybrid" (partially experimental, partially computational) models miss the main advantages of purely theoretical approaches: no need for chemical synthesis of a compound or for experimental facilities, low cost, and high speed. Among computational approaches that predict intestinal permeability solely from a molecule's structure and its physicochemical properties, without using any biological experimental data, there are two major directions: ab initio models and quantitative structure-property relationship (QSPR) models. The latter are overwhelmingly used nowadays and exploit a wide spectrum of statistical methods for absorption data analysis (see e.g. [12, 33, 61] for a review). Instead of relying on basic laws of nature, these models are trained on observed statistical regularities, which preconditions their limitations. In contrast to QSPR, a handful of studies develop models of intestinal permeability from first principles [2, 14, 50]. These models successfully describe the basic properties of passive absorption: dependence on the distribution coefficient, diffusional limitation at high LogD, and paracellular absorption. However, the major hindrance on this way is the complexity of intestinal absorption. Apart from passive phenomena (diffusion through the cell membrane and paracellular junctions), there is also active transport of molecules into and out of the cells. To the best of our knowledge, current ab initio models are limited to the description of passive drug absorption. Most QSPR models also deal with passive transport [12], and only a few approaches go as far as developing QSPR models describing both passive and carrier-mediated absorption mechanisms [67]. However, carrier-mediated transport plays an important role in drug absorption [18] and hence demands the development of a good active absorption model.

The major objective of this investigation was to develop a novel approach to the prediction of carrier-mediated drug absorption, based on an entirely different paradigm than QSPR, thus avoiding its difficulties and capable of better predictions. Recently it was observed that experimental values of molecular activities against a large protein set can be used for prediction of a broad spectrum of biological effects. In this study we took advantage of this concept and developed a novel quantitative method for identification of actively transported drugs. To do that, we performed a docking study of a few hundred small molecules (mostly drugs) against a diversified set of proteins representing the human proteome. Using available absorption data for each of the molecules, we identified a protein, affinity for which correlates well with the permeability of many actively absorbed compounds from our data set. The observation helped us to improve the passive absorption model by adding non-linear fluxes associated with the transporting protein, so as to obtain a quantitative model of active transport as well.

The manuscript is organized as follows.
After the standard Materials and Methods section outlining our approaches to data preparation, the docking study setup, and the data processing routines, we present a two-compartment model of drug absorption extended to include active transport via non-linear flux terms associated with transporting proteins. Once the model is built and the parameters of passive absorption are fitted to experimental data, we identify the active transport parameters to train the classifier. After the classifier is set up, we compare our predictions with available experimental information and thus validate the complete model for drug absorption prediction.

II. MATERIALS AND METHODS

A. Experimental absorption and permeability data

Much experimental activity aimed at analyzing the kinetic aspects of drug absorption has been pursued recently. For better control, a variety of in vitro methods for studying drug absorption have been developed [9]. One possibility is to seed (epithelial) cell cultures in a monolayer that forms the contact surface between two small chambers. Concentrations of an applied drug can then be measured over time in both chambers. Two well-known cell culture models are Caco-2 cells [6, 8] and MDCK cells [37].

To enrich the experimental data sets we used two types of observed data to build the model: the fraction of drug absorbed after oral administration in humans ($F_A$) and the permeability across a human colon adenocarcinoma cell (Caco-2) monolayer ($P$). The latter is a cell model routinely used in the pharmaceutical industry and academia to estimate drug absorption in the intestine [8, 10, 36, 65]. Previous findings showed a strong relationship between Caco-2 permeability and the fraction absorbed in humans (e.g. [7, 10, 46, 60, 62, 73]), suggesting that one value can be used to estimate the other. We collected from literature compilations observed $F_A$ values and Caco-2 permeability values for compounds that, to the best of our knowledge, are not subject to efflux from enterocytes [4, 10, 11, 13, 15, 17, 20, 24, 28, 29, 30, 32, 34, 38, 39, 40, 42, 43, 45, 46, 49, 51, 53, 54, 55, 56, 57, 58, 60, 61, 62, 64, 67, 72, 73, 75]. Fig. 1 shows $F_A$ values plotted against permeability for the compounds for which both values were available. The data were fitted with the sigmoid equation [8]:

$$F_A = \frac{100\%}{1 + (P/P_{50})^{p}}, \qquad (1)$$

where $P_{50}$ is the permeability at $F_A = 50\%$ and $p$ is a slope factor. The fitting parameters were $P_{50} = 7.[\ldots] \times 10^{-[\ldots]}$ cm·s$^{-1}$ and $p = -[\ldots]$, in reasonable agreement with the previously found $P_{50} \sim [\ldots] \times 10^{-[\ldots]}$ cm·s$^{-1}$ and $p = -[\ldots]$ [46]. The fitted curve predicts $F_A = 90\%$ for $\log P = -[\ldots]$ and $F_A = 10\%$ for $\log P = -[\ldots]$, which is in reasonable agreement with $\log P = -[\ldots]$ and $\log P = -[\ldots]$ from [60]. The RMSD of the fit is fairly small, and thus Caco-2 permeability can indeed predict human intestinal absorption of orally taken drugs with reasonable accuracy. Fig. 1 shows two outliers, corresponding to glycylsarcosine and amoxicillin. Their $F_A$ values were much higher than expected from Caco-2 permeability. Glycylsarcosine and amoxicillin are carried through enterocyte membranes by PEPT transporters, which are reported to have reduced activity in Caco-2 cells [16, 41]. This may account for the observed discrepancy between measured Caco-2 permeability and $F_A$ values.

Figure 1: The relationship between $F_A$ and Caco-2 permeability. The points correspond to experimental values for compounds with both $F_A$ and $P$ known. The solid line is the approximation provided by Eq. 1. RMSD is 14%.

Eq. 1 can be used to estimate the missing values of $F_A$ and $P$ for all the compounds in our compilation. However, Eq. 1 requires that if $F_A \to 100\%$, then $P \to \infty$. Therefore, if the observed value of $F_A$ exceeded $[\ldots]$, we assigned a $P$ value of $[\ldots] \times 10^{-[\ldots]}$ cm/s, corresponding to $F_A = [\ldots]$.

The distribution coefficients, LogD (pH = 7.$[\ldots]$), used throughout the research were either collected from the literature [13, 47, 71] or calculated using Quantum software version 3.3.0 [1].

B. Preparation of the protein panel

Our protein data set includes proteins from the Protein Data Bank [68]. It covers almost all available cytoplasmic proteins with known X-ray structure and also includes some important transmembrane proteins such as ion channels and GPCRs. We use homology models for GPCRs since no experimentally determined structure is available [59].

Only proteins co-crystallized with a biologically active ligand were included in the data set. Ligands may be either natural ligands (such as a hormone for a hormone receptor or a substrate for an enzyme) or drugs, inhibitors, etc. If multiple files exist in the PDB repository for the same protein, we take the file with the most complete structure and/or the lowest resolution value (i.e. the best-resolved structure).

Although the choice of the proteins for the calculations is a very important step and the overall number of proteins is hardly manageable, we believe that the PDB archive contains a representative set of the most practically important proteins, covering the whole interesting variety of ligand binding domains. Below we show that successful predictions do not require the presence of a specific ligand binder in the protein set employed for the calculations. Instead, it proves to be sufficient to have a structurally similar protein in the protein panel.

C. Docking setup and prediction of the binding constant, $K_d$

The typization of both the proteins and the small molecules, as well as the in silico screening, was carried out with the molecular processing and docking tools from the QUANTUM drug discovery software suite [1]. The software predicts the binding affinities of small molecules to resolved protein targets using a set of first-principles-based molecular simulations with an advanced continuous water model [22, 23].
The approach provides the logarithmic values of the binding constant, $pK_d$ ($-\lg K_d$).

To compute the binding affinities of the molecules in our data set, we screened each molecule against every protein in our panel. To speed up the calculations, the docking runs were performed against rigid protein structures with no further refinement by molecular dynamics. Such a simplified approach turned out to be sufficient (see the discussion below). The results of the calculations were organized into screening assays containing $pK_d$ values for each protein-small molecule pair (complex) and were stored for further analysis.

D. Data processing and modeling

Fitting of the experimental data to the models proposed below was performed using the BFGS algorithm implemented in an in-house program. Selection of proteins, affinity for which correlates with active absorption, was performed using the Weka v.3.5 data mining software [70].

Figure 2: Model of absorption used in the study. The numbered steps represent: $[\ldots]$ and $[\ldots]$: drug diffusion from the bulk solution of the donor tank to enterocytes and from enterocytes to the bulk solution of the acceptor tank; $[\ldots]$ and $[\ldots]$: passive and active penetration through cells; $[\ldots]$: drug diffusion inside the enterocyte from the apical to the basal membrane; $[\ldots]$: paracellular absorption of the drug. The drug dissolution stage in the intestinal lumen is omitted.

III. RESULTS

A. The model

For the sake of simplicity we considered the absorption of passively and actively transported drugs with negligible efflux and intestinal metabolism. In addition, the model assumes that the drug is well soluble and stable in the gastrointestinal fluids, and that absorption on the intestinal content and intestinal metabolism are negligible. In this case the absorption from the intestine to the blood can be represented by a two-compartment model (see Fig. 2) consisting of a donor tank (intestinal lumen) and an acceptor tank (blood vessel).
The intestinal wall can be represented by a single lipid membrane, since there are no phenomena depending on the drug concentration in enterocytes. Drug absorption can then be described as a series of steps: diffusion from the bulk solution of the donor tank to the cell layer, penetration across the layer, and diffusion away from the layer to the bulk solution of the acceptor tank. Drug penetration across the lipid layer includes passive diffusion, active transport, and diffusion through pores in the layer, which simulates paracellular absorption.

The effective permeability coefficient, $P$, through a combination of diffusional barriers and active transport is determined by the following equation [27]:

$$P^{-1} = P_{UWL}^{-1} + \left(P_{pass} + P_{act} + P_{para}\right)^{-1}, \qquad (2)$$

where $P_{pass}$, $P_{act}$, and $P_{para}$ are the passive, active, and paracellular permeabilities, and $P_{UWL}$ is the effective permeability of the unstirred water layers (UWL) in the donor and acceptor tanks:

$$P_{UWL}^{-1} = P_{UWL,1}^{-1} + P_{UWL,2}^{-1}.$$

The values of the permeabilities come from Fick's law:

$$P_{UWL,i} = \frac{D_{UWL,i}}{h_{UWL,i}}, \qquad (3)$$

where $D_{UWL,i}$ and $h_{UWL,i}$ are the diffusion coefficient and the effective thickness of the UWL on each side of the cell monolayer. $D_{UWL,i}$ can be approximated by the diffusion coefficient in water, which varies within less than a single order of magnitude for low molecular weight organic compounds [52, 69]. For sufficiently dilute solutions $h_{UWL}$ is approximately constant. Thus, $P_{UWL,i}$ and the effective permeability of the UWLs, $P_{UWL}$, can be treated as approximately constant for all low molecular weight organic compounds.

Similarly to $P_{UWL,i}$, the permeability due to drug diffusion through the membrane, $P_{pass}$, can be estimated as:

$$P_{pass} = \frac{D_M}{h_M}\, D, \qquad (4)$$

where $D_M$ is the diffusion coefficient in lipid, $h_M$ is the thickness of the membrane, and $D$ is the octanol/water distribution coefficient, i.e. the concentration ratio between aqueous and lipid phases.
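As a hedged illustration of how Eqs. 2-4 combine, the short Python sketch below treats the unstirred water layers as resistances in series with the cell layer, and the passive, active, and paracellular routes as parallel pathways through that layer. The function names and all numerical values are illustrative assumptions and are not the fitted parameters of this study.

```python
def uwl_permeability(d_uwl, h_uwl):
    """Eq. 3: permeability of a single unstirred water layer."""
    return d_uwl / h_uwl


def series(*permeabilities):
    """Barriers in series: reciprocal permeabilities (resistances) add."""
    return 1.0 / sum(1.0 / p for p in permeabilities)


def passive_permeability(dm_over_hm, log_d):
    """Eq. 4: P_pass = (D_M / h_M) * D, with D = 10**logD."""
    return dm_over_hm * 10.0 ** log_d


def effective_permeability(p_uwl, p_pass, p_para, p_act=0.0):
    """Eq. 2: UWL in series with the parallel cell-layer pathways."""
    return series(p_uwl, p_pass + p_act + p_para)


# Illustrative values in cm/s (assumptions, NOT the paper's fitted values).
P_UWL = series(uwl_permeability(1e-5, 0.05), uwl_permeability(1e-5, 0.05))
P_PARA = 1e-7
DM_OVER_HM = 1e-6

for log_d in (-4, 0, 4):
    p_pass = passive_permeability(DM_OVER_HM, log_d)
    p = effective_permeability(P_UWL, p_pass, P_PARA)
    print(f"logD = {log_d:+d}: P = {p:.3e} cm/s")
```

With these placeholder values the printed permeability rises with logD, levelling off near `P_PARA` at the low end and near `P_UWL` at the high end. Passing a positive `p_act` raises the result toward the UWL limit.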
And again, as a first approximation $D_M$ can be taken to be constant across drug-like compounds; thus the proportionality factor between $P_{pass}$ and $D$ can be considered constant for all low molecular weight organic compounds.

According to [2, 3], the paracellular permeability, $P_{para}$, is a size-restricted diffusion within a negative electrostatic force field. Normally it varies within a single order of magnitude [2, 52], and hence its variations can be neglected. In what follows we keep $P_{para}$ constant everywhere. The analysis of the experimental data at our disposal shows that this is a very reasonable assumption.

To build a model of active transport, we first estimated the parameters of passive absorption ($P_{UWL}$, $P_{para}$, and $D_M/h_M$) by fitting observed permeabilities of passively absorbed compounds with Eq. 2 at $P_{act} = 0$. These values were then frozen, and observed permeability values for actively transported compounds were fitted with Eq. 2, where $P_{act}$ was substituted by the proposed model of active transport.

Figure 3: Model of passive intestinal permeability. Intestinal permeability of passively absorbed compounds (filled and hollow squares) is plotted against LogD. The solid line is the prediction of the model (Eq. 2) with $P_{act} = 0$ cm·s$^{-1}$ and $P_{para} = \mathrm{const}$. For the parameter values see the text.

B. Estimation of the passive absorption model parameters

To estimate the parameters of passive absorption, we fitted observed permeability values for drugs which, to the best of our knowledge, are passively absorbed, with Eq. 2 with no active transport ($P_{act} = 0$). Since the approximation contains only three adjustable parameters of passive absorption, there was no need for a large data set. Therefore we selected passively absorbed compounds with Caco-2 permeabilities measured directly.
This was done because the observed $F_A$ depends on experimental conditions and may include effects of drug instability in the intestinal fluids, intestinal metabolism, and so on. In contrast, Caco-2 permeability data are free of those problems. On the other hand, the tight junctions of a Caco-2 cell monolayer are significantly less permeable [10, 46] than those in the intestine.

Fig. 3 shows the logarithm of the permeability of passively absorbed drugs (both filled and hollow squares) plotted against the logarithm of the distribution coefficient. In accordance with the previously proposed model [14], the intestinal permeability of passively absorbed drugs increases with the distribution coefficient and saturates at both the low and high ends. The increasing part reflects the growth in membrane permeability with increasing distribution coefficient of a drug. The saturation at the upper limit reflects diffusional limitations imposed by the UWLs for highly lipophilic drugs. The saturation at low $\log D$ corresponds to residual permeability through tight junctions. The solid line shows the fit of the experimental data with Eq. 2, where $P_{act} = 0$ and $P_{UWL}$, $P_{para}$, and $D_M/h_M$ are all assumed constant for all compounds. The best fit was achieved at the following values of the model parameters: $P_{para} = 5.[\ldots] \times 10^{-[\ldots]}$ cm·s$^{-1}$, $P_{UWL} = 2.[\ldots] \times 10^{-[\ldots]}$ cm·s$^{-1}$, $D_M/h_M = 3.[\ldots] \times 10^{-[\ldots]}$ cm·s$^{-1}$. The RMSD was $[\ldots]$ log units.

The determined value of $P_{para}$ is slightly lower than experimental estimates, which range from $10^{-[\ldots]}$ to $10^{-[\ldots]}$ cm·s$^{-1}$ [2, 52], while $P_{UWL}$ is in good agreement with the values observed at a slow stirring rate ($[\ldots] \times 10^{-[\ldots]}$ cm·s$^{-1}$ at 25 rpm [2]). Using the commonly accepted value of the diffusion coefficient, $10^{-[\ldots]}$ cm$^2$·s$^{-1}$ [52, 69], from Eq. 3 we find the effective thickness of the UWLs, $h_{UWL} \sim [\ldots] \times 10^{[\ldots]}$ µm, which is in excellent agreement with previously estimated values between $[\ldots]$ and $[\ldots]$ µm [5, 21].

C. The active absorption model

Using literature data [18, 31, 39, 46, 63, 64, 73], we selected 45 compounds from our database that are reportedly absorbed by active transport and, to the best of our knowledge, are not subject to drug efflux [24, 39, 46, 64, 66, 73]. To enrich the data set, both directly measured Caco-2 permeability values and permeability values calculated from $F_A$ were used. If both $F_A$ and Caco-2 permeability were available for a given compound, the value of $P$ calculated from the measured $F_A$ was employed. This is a reasonable approach, since Caco-2 cells are known to underexpress some important drug transporters [16, 41], and thus Caco-2 permeability data for actively transported compounds are less reliable than $F_A$.

Fig. 4 shows that the permeability of the majority of actively absorbed compounds was higher than predicted by the model of passive absorption, in accordance with the existence of an additional component of permeability. Nevertheless, there were four outliers whose permeability was substantially below the passive permeability curve: fosinopril, diphenhydramine, lobucavir, and cefuroxime axetil. We speculate about possible explanations in the discussion. For the rest of the compounds, the total permeability exceeded the passive component by $[\ldots]$ to $[\ldots]$ log units and reached the diffusion-limited rate. This means that the intensity of active transport varies over a wide range and may be limited by drug diffusion to the membrane.

Figure 4: Intestinal permeability of actively absorbed compounds. Pink points are observed values. The solid line is the model of passive absorption.

Carrier-mediated absorption (both active and passive) can be described by Michaelis-Menten kinetics [48]:

$$P_{act} = \sum_i \frac{n_i/\tau_i}{K_{D_i} + C}, \qquad (5)$$

$$J_{act} = \sum_i \frac{(n_i/\tau_i)\, C}{K_{D_i} + C}, \qquad (6)$$

where the summation runs over all (types of) transporters; $n_i$ is the amount of the $i$-th transporter molecules per unit area of membrane; $K_{D_i}$ is the dissociation constant of the $i$-th transporter-ligand complex; $\tau_i$ is the time required for the transporter to bind and carry one molecule across the membrane; and $C$ is the compound concentration. From Eq. 6 it follows that if $C \ll K_{D_i}$, then $J_{act} \to 0$ (the compound is passively absorbed). In the opposite case, $C \gg K_{D_i}$,

$$J_{act} = \sum_i n_i/\tau_i,$$

i.e. the compound is actively absorbed, and the active component of the drug flux is independent of the drug concentration and is determined only by the amount of the transporter protein and the time required for a transporter to carry a ligand across the membrane. Thus Eq. 6 is similar to a classifier with threshold value $C$, which "selects" between the passive and the active transport options ("possibilities").
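The limiting behaviour of Eqs. 5 and 6 can be checked numerically. The following Python sketch is a minimal illustration, assuming each transporter type is described by a pair ($n_i/\tau_i$, $K_{D_i}$) in consistent units. The function names and the example numbers are hypothetical and are not taken from this study; the helper `kd_from_pkd` simply inverts the definition $pK_d = -\lg K_d$ given in the Methods.

```python
def kd_from_pkd(pkd):
    """Convert pKd = -lg(Kd) back to the dissociation constant Kd."""
    return 10.0 ** (-pkd)


def active_flux(c, carriers):
    """Eq. 6: J_act = sum_i (n_i/tau_i) * C / (K_D_i + C).

    carriers is a list of (n_over_tau, k_d) pairs, one per transporter type.
    """
    return sum(rate * c / (k_d + c) for rate, k_d in carriers)


def active_permeability(c, carriers):
    """Eq. 5: P_act = sum_i (n_i/tau_i) / (K_D_i + C), i.e. J_act / C."""
    return sum(rate / (k_d + c) for rate, k_d in carriers)


# Hypothetical single transporter: n/tau = 1.0 (arbitrary units), pKd = 6.
carriers = [(1.0, kd_from_pkd(6.0))]

for c in (1e-9, 1e-6, 1e-3):
    print(f"C = {c:.0e}: J_act = {active_flux(c, carriers):.4f}, "
          f"P_act = {active_permeability(c, carriers):.4e}")
```

For concentrations far below $K_D$ the flux is close to zero, at $C = K_D$ it is half of its maximum, and far above $K_D$ it saturates at $\sum_i n_i/\tau_i$, which is the switch-like behaviour the text compares to a classifier.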
Therefore it is natural to build a classifier model to identify proteins that either participate in active transport directly or have a binding site similar to that of a transporter.

To identify the proteins related to active absorption, or with an active-site force field similar to that of protein transporters, we used all drugs that are reported to be actively absorbed and have a permeability not less than predicted by the passive model (41 compounds). In addition, 71 passively absorbed drugs were used. We studied absorption-$K_d$ relations for these drugs and the proteins from our set and identified a protein that correctly classified 78% of the drugs as actively or passively transported. Fig. 5 shows that practically all compounds with at least some small affinity for the protein are actively absorbed. Using John Platt's sequential minimal optimization algorithm for training a support vector classifier, as implemented in the Weka data mining software, we built a classifier model. The confusion matrix shown in Fig. 5 indicates that only 7% ($[\ldots]$ of the $[\ldots]$ passively transported compounds) were mistakenly classified by the model as actively transported. These outliers may in fact be false positives whose affinity for the protein was mistakenly calculated as high.

Figure 5: Relation between the affinity of a compound for human brain hexokinase type I and the intestinal absorption mechanism of the drug. Blue: actively absorbed compounds; red: passively absorbed compounds. X axis: $pK_d$ value for the hexokinase. Y axis: the number of passively and actively absorbed compounds. The confusion matrix shows the accuracy of prediction of active transport with the help of human brain hexokinase.

Figure 6: Permeability prediction for actively absorbed compounds. Predicted permeability for correctly classified compounds is plotted against experimental values. Both axes are in logarithmic scale.

Fig. 6 shows the permeability prediction for compounds that were correctly classified as active, using Eqs. 2 and 5, where $n_i/\tau_i = 3.[\ldots]\mathrm{E}{-[\ldots]}$ M·cm$^{-[\ldots]}$·s$^{-[\ldots]}$ and $C = 10\mathrm{E}{-[\ldots]}$ M·cm$^{-[\ldots]}$