0:00:13i can give a
0:00:14talk
0:00:15so that uh
0:00:17where your your stay
0:00:19So
0:00:20I think my job is easier, because a lot of the
0:00:24background
0:00:25has actually already been introduced by
0:00:27the earlier talks.
0:00:28So what we're trying to do here is to uncover the
0:00:32cooperative regulation by
0:00:34transcription factors and microRNAs using a Bayesian,
0:00:39basically
0:00:40it's a regression factor model,
0:00:42or call this a hybrid factor model.
0:00:44So
0:00:45what's the objective? The objective is
0:00:49to understand how gene expression, basically transcription, is being regulated by
0:00:53transcription factors, which is common knowledge,
0:00:56and by microRNA, a small molecule that was recently
0:01:00uncovered to also regulate transcription.
0:01:03So
0:01:04what is the approach? Basically I want to convince you that we can use this
0:01:08Bayesian factor model to serve this
0:01:11purpose,
0:01:11using the model on top of
0:01:14microRNA expression data and a lot of biological prior knowledge.
0:01:19So
0:01:19just a little bit of background, which has already been introduced by a lot of
0:01:24the previous speakers.
0:01:25This is the central dogma of molecular biology.
0:01:29It says that the information flow goes from DNA to
0:01:34mRNA and then to protein, and protein is the basic building block of
0:01:38all living cells.
0:01:40So here my focus is on transcription, basically how DNA is being transcribed into
0:01:45mRNA.
0:01:46This process here looks linear, but actually it is being
0:01:50heavily regulated, and mainly it is regulated by two factors. The first one is
0:01:56proteins called transcription factors. So now you're looking at the DNA. Transcription basically is a copying
0:02:02of one gene in the DNA
0:02:04into a small molecule called mRNA, and then the mRNA is later translated
0:02:10into protein.
0:02:11So the first regulator there is called a transcription factor, and it binds to the upstream
0:02:16promoter region of the gene. So for example, this is a gene,
0:02:20and then it controls the
0:02:21production, or expression,
0:02:24of the mRNA.
0:02:25And furthermore, recently people also understand that
0:02:29a small molecule called microRNA
0:02:32(it is also RNA, so it's a little bit confusing with mRNA)
0:02:37actually binds to a region of the mRNA and serves to degrade
0:02:44the mRNA.
0:02:46Together, the transcription factors and microRNAs
0:02:50better explain the complexity of living cells, why we have this diversity.
0:02:55In the traditional way, if we only look at transcription factors, we pretty much have a
0:02:59similar set of transcription factors
0:03:01across different cells,
0:03:02so microRNA actually gives you another layer of explanation.
0:03:05So here my goal is: right now we have microarrays
0:03:10and high-throughput sequencing that can be used to
0:03:13measure
0:03:14mRNA and microRNA.
0:03:16For example, we have microarrays, which have also been introduced by the previous speakers,
0:03:20and also,
0:03:21since microRNA is also
0:03:24RNA, we can use these to measure microRNA.
0:03:27So
0:03:27the goal here is
0:03:29to really understand how mRNA, or gene transcription, is regulated by
0:03:36microRNAs and transcription factors, based on
0:03:39mRNA measurements and microRNA
0:03:43measurements.
0:03:44So,
0:03:45before, this was supposed to be highlighted
0:03:47in yellow, but now it's black.
0:03:49So basically that's the goal here.
0:03:51All right, so those are
0:03:53the two factors. So let's see what common approaches have been taken.
0:03:57Basically this is a gene network, shown in a very simple way. You probably see
0:04:03these things quite often, right?
0:04:05It's a very messy network, where normally each node represents a gene and then you have links
0:04:11or edges linking these different genes.
0:04:13So
0:04:13the meaning here is pretty much that if two genes are linked,
0:04:19they are likely to be associated.
0:04:20But
0:04:21the problem here is how do we interpret this, especially if I want to understand transcription regulation?
0:04:26What does this link really
0:04:28tell us about transcription regulation?
0:04:30It's very difficult actually, because all these links only say
0:04:34the genes are associated; they don't say whether
0:04:36the genes are associated through transcription regulation or through some other
0:04:41interaction, and so forth.
0:04:43So this is actually,
0:04:45I had been working on this before, but I kind of stay away because it's
0:04:49so hard to interpret, and when you present it to biologists, they don't know what,
0:04:53I don't even know what to tell them, and they don't know how to interpret it.
0:04:56So basically I want to look at more details of the biology and see whether we
0:05:00can really model this process of transcription regulation and microRNA regulation.
0:05:05So we start by modeling. We assume that the transcription factor is a
0:05:11protein, so the protein activity we call X.
0:05:19Basically, the more transcription factor
0:05:25you have, the more it can possibly regulate
0:05:28gene transcription.
0:05:29So we use X to represent the transcription factor
0:05:33protein-level activity, and then use Z
0:05:36to
0:05:37denote the
0:05:39expression level of
0:05:41microRNA. And then we're saying
0:05:44microRNAs and transcription factors regulate the gene product, which is mRNA, which we call Y,
0:05:50and which can be measured by microarray data.
0:05:53And then we can describe this relation by a simple linear relationship, where
0:05:59we say the mRNA expression level is
0:06:02due to
0:06:03the regulation, or depends on X, the protein activity of the transcription factor,
0:06:10and
0:06:10the microRNA expression level Z, and A and B are the so-called regulatory
0:06:15coefficients.
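Editor's note: the single-regulator relationship described here can be written as one equation. This is a minimal sketch; the lowercase scalar symbols and the noise term e are assumptions (the speaker only names an additive noise term later, for the matrix version).

```latex
% y: mRNA expression of one gene in one sample
% x: protein-level activity of one transcription factor
% z: expression level of one microRNA
% a, b: regulatory coefficients; e: additive noise (assumed here)
y = a\,x + b\,z + e
```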
0:06:17So this is a very simple model, and this is pretty much just
0:06:22the case where there's one transcription factor and one microRNA. The reality actually is a
0:06:26lot more complex, where you have a lot of microRNAs and a lot of transcription factors,
0:06:31and again you're going to have
0:06:35a model like this for one
0:06:37gene.
0:06:38And if you really look at the entire genome, where you have possibly twenty to
0:06:43forty thousand genes,
0:06:44then you're looking at basically a matrix like that. In this case Y is the measurement of mRNA expression.
0:06:51This is a matrix where each row represents a gene and each column vector represents
0:06:55a sample, for example a condition or a patient, or also time
0:06:58points.
0:06:59And an element here represents the mRNA expression of one
0:07:04gene in one sample. Similarly, X here represents
0:07:08the transcription factor activity,
0:07:13and Z is the microRNA
0:07:17activity. Sorry, this one is wrong.
0:07:19And as I said before, this can be measured
0:07:23by microarray or high-throughput sequencing.
0:07:25And A is
0:07:26the so-called regulatory strength, and B is the microRNA regulatory strength, and E is
0:07:32the additive
0:07:34noise.
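Editor's note: the matrix model described in this part of the talk has the form Y = AX + BZ + E, with genes in rows and samples in columns. The sketch below simulates data of that shape under stated assumptions; the sizes, the 10% sparsity level, and the distributions are hypothetical choices for illustration, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes (not from the talk): genes, TFs, microRNAs, samples.
n_genes, n_tfs, n_mirnas, n_samples = 50, 4, 3, 20

# Sparse TF loading matrix A: most entries are exactly zero.
mask_a = rng.random((n_genes, n_tfs)) < 0.1
A = rng.normal(size=(n_genes, n_tfs)) * mask_a

# Sparse microRNA loading matrix B: nonzero entries are negative (down-regulation).
mask_b = rng.random((n_genes, n_mirnas)) < 0.1
B = -np.abs(rng.normal(size=(n_genes, n_mirnas))) * mask_b

# Non-negative TF activities X (unobserved) and microRNA levels Z (observed).
X = np.maximum(0.0, rng.normal(loc=0.5, size=(n_tfs, n_samples)))
Z = rng.normal(size=(n_mirnas, n_samples))

# Additive noise E and the observed mRNA expression matrix Y.
E = 0.1 * rng.normal(size=(n_genes, n_samples))
Y = A @ X + B @ Z + E
```

In the inference problem the speaker describes, only Y and Z would be observed; A, B, and X are the unknowns.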
0:07:35All right, so basically we're looking at this equation now. We're given Y and Z, and we
0:07:40try to
0:07:41recover the rest based on this model.
0:07:44So
0:07:45the goal is: given Y and Z, we want to understand
0:07:49A, B and
0:07:51X.
0:07:51So how are we going to really achieve this?
0:07:54So
0:07:55traditionally, this type of model is really a factor regression model: this part is
0:08:00the factor model and this part is a regression model. So this is nothing new,
0:08:05and for the solution you can see a couple of different solutions; PCA and ICA have already been
0:08:09used,
0:08:11and NMF. However, this type of model is not really
0:08:16sufficient to really model
0:08:18the details of the biology.
0:08:19So I'll give you a very simple reason here. For example,
0:08:23this is a relatively real scenario, so you can kind of
0:08:27get a sense.
0:08:28So
0:08:29if you want to use PCA, basically PCA assumes
0:08:34the loading matrix for this is an orthogonal matrix.
0:08:39But
0:08:42now we know that each transcription factor actually regulates only a very
0:08:46small set of genes.
0:08:48Well, relatively; it's still a couple of,
0:08:51up to a couple thousand genes,
0:08:53but in terms of the overall number of genes, which is twenty thousand to forty thousand,
0:08:57it is sparse,
0:08:59so the matrix should be sparse.
0:09:01And also, when it comes to regulation,
0:09:06we have already accumulated a lot of knowledge about which transcription factor
0:09:11regulates what set of genes.
0:09:13So we should be able to incorporate this type
0:09:15of knowledge.
0:09:16And thirdly,
0:09:17these samples, when you look at the samples,
0:09:21they represent for example patients,
0:09:28and in the case of a disease,
0:09:30some patients actually have similar
0:09:33expression patterns,
0:09:34meaning that these can be used to define
0:09:37the subtypes of the disease.
0:09:39So if you have a similar subtype,
0:09:41your expression levels should be similar.
0:09:43So these columns actually should be correlated
0:09:45to represent that.
0:09:47So something like
0:09:49X and Z, the transcription factor
0:09:52activity and microRNA,
0:09:54should have some
0:09:55correlations; you should have these groups
0:09:57in the set.
0:09:58And PCA doesn't really model things
0:10:02like this.
0:10:02And also the transcription factor activity should be non-negative,
0:10:07like a what the uh make and also argue
0:10:09a but that that was in the case of the gene but there's similar market
0:10:13so we need to really model non-negative transcription factor activity.
0:10:18Well,
0:10:18in the case of microRNA, microRNA is known to down-regulate transcription,
0:10:22so its loading matrix must be
0:10:24negative,
0:10:25or the elements of this matrix
0:10:27are actually assumed to be negative.
0:10:28So we need to somehow incorporate all these details of the biology into the model,
0:10:34and I'm going to tell you how we model
0:10:37each one of them.
0:10:41So to start with, basically the modeling goal was to model the sparsity
0:10:46of A and B, the prior knowledge,
0:10:48the non-negative transcription factor
0:10:51activity,
0:10:53the negative regulation by microRNA,
0:10:55and then
0:10:56the sample correlation.
0:10:57OK, so you need a lot of things.
0:10:59So
0:11:00to start with the sparsity, we use exactly the same model as what
0:11:06the earlier speakers were using here.
0:11:09For the sparse prior, I just want to point out that this parameter basically denotes the prior
0:11:16probability of transcription factor L regulating a gene,
0:11:20and there's a lot of prior knowledge available from
0:11:23databases,
0:11:23so we can really incorporate
0:11:25this prior knowledge
0:11:26into
0:11:27it.
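Editor's note: the talk does not spell out the exact form of the sparse prior. One common way to encode "a prior probability that a transcription factor regulates a gene" is a Bernoulli on/off indicator per link with a Gaussian strength for the links that are on. The sketch below assumes that form; the probabilities 0.9 and 0.05 and the `reported` mask are hypothetical stand-ins for database knowledge.

```python
import numpy as np

rng = np.random.default_rng(1)

n_genes, n_tfs = 6, 3  # hypothetical toy sizes

# Hypothetical prior probabilities that TF l regulates gene g:
# high where a database reports the link, low elsewhere.
reported = rng.random((n_genes, n_tfs)) < 0.3
prior_prob = np.where(reported, 0.9, 0.05)

# Draw the on/off indicator for each link from its prior probability,
# then give a Gaussian strength only to the links that are "on".
indicator = rng.random((n_genes, n_tfs)) < prior_prob
A = np.where(indicator, rng.normal(size=(n_genes, n_tfs)), 0.0)
```

Links without database support can still switch on, just with low probability, which is how such a prior leaves room for regulations that are not yet reported.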
0:11:28So this is how we model the sparsity of A. In the case of
0:11:33B, it is actually a very similar model.
0:11:35Here,
0:11:36now we have to use a Gaussian
0:11:38that really models the down-regulation by microRNA;
0:11:42B is the regulatory matrix of microRNA.
0:11:44So that's the only difference.
0:11:46And again, as for prior knowledge, there are databases.
0:11:53I just want to point out that microRNA regulation is still a very active area of research, so
0:11:58we don't really know exactly how microRNAs
0:12:01regulate the genes, not at the level of transcription factors yet.
0:12:05But there are
0:12:05target prediction algorithms
0:12:07that can be used to really give you some prior knowledge here.
0:12:10So that's how we model the sparsity and incorporate the prior
0:12:14knowledge.
0:12:15Then let's move on to
0:12:16the non-negative transcription factor
0:12:19activity.
0:12:20Here we're actually using a rectified Gaussian;
0:12:23the only difference is
0:12:24we have a max
0:12:25with zero.
0:12:28There are two advantages
0:12:30of using the rectified Gaussian. One is that
0:12:32it introduces
0:12:34additional sparsity in the transcription factor activity,
0:12:38and also it gives a very nice
0:12:41formulation for the Bayesian
0:12:45derivation. So that's how we model the transcription
0:12:49factor activity.
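Editor's note: a rectified Gaussian, as described here, is a Gaussian draw passed through a max with zero. The sketch below shows why it yields both non-negativity and extra sparsity: every negative draw becomes an exact zero. The helper name `rectified_gaussian` and the parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def rectified_gaussian(mean, std, size, rng):
    """Draw u ~ N(mean, std^2) and return max(u, 0) elementwise."""
    return np.maximum(0.0, rng.normal(mean, std, size))

# With mean 0, about half of the draws are rectified to exactly zero.
x = rectified_gaussian(mean=0.0, std=1.0, size=10_000, rng=rng)
frac_zero = np.mean(x == 0.0)
```

Lowering the mean increases the fraction of exact zeros, so the same distribution controls both the typical activity level and how sparse the activities are.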
0:12:50And then,
0:12:50moving to the sample correlation, which defines the sample subtypes,
0:12:54so patients:
0:12:56our basic assumption is that
0:12:58samples of the same subtype are similar, so it's a natural cluster model,
0:13:03so a mixture of Gaussians.
0:13:04And
0:13:04a problem with a mixture of Gaussians is
0:13:06you need to know the number of clusters, so we actually use a Dirichlet process mixture,
0:13:13or
0:13:14a Dirichlet process mixture of rectified
0:13:17Gaussians,
0:13:18where we
0:13:19use a Dirichlet process.
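Editor's note: a Dirichlet process mixture does not fix the number of clusters in advance. One standard way to see this is the Chinese restaurant process view of the cluster assignments, sketched below. The function name `crp_assignments` and the concentration value `alpha=1.0` are assumptions for illustration; the talk does not give the speaker's settings.

```python
import numpy as np

def crp_assignments(n_samples, alpha, rng):
    """Sample cluster labels from a Chinese restaurant process."""
    labels = [0]   # the first sample starts the first cluster
    counts = [1]   # number of samples currently in each cluster
    for _ in range(1, n_samples):
        # Existing clusters are chosen in proportion to their size,
        # a brand-new cluster in proportion to alpha.
        weights = np.array(counts + [alpha], dtype=float)
        k = int(rng.choice(len(weights), p=weights / weights.sum()))
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return np.array(labels)

rng = np.random.default_rng(3)
labels = crp_assignments(n_samples=100, alpha=1.0, rng=rng)
n_clusters = int(labels.max()) + 1
```

The number of occupied clusters is an outcome of the sampling rather than an input, which is the property the speaker relies on for discovering patient subgroups.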
0:13:21So
0:13:22putting everything together, this is pretty much what the model looks like. We have all these
0:13:27different parts:
0:13:28basically the factor
0:13:31part and the
0:13:33regression part,
0:13:34and if you put them all together it
0:13:37looks like this.
0:13:38So there are a lot of parameters to estimate,
0:13:43and so we have to
0:13:45resort to some
0:13:47sampling, for example
0:13:51Gibbs sampling.
0:13:51In this case, because of these
0:13:55prior distributions,
0:13:57we have all the conditional distributions in closed form,
0:14:02so we can use Gibbs sampling. I'm not going to go into the derivations, it is long,
0:14:10but we are able to really derive this
0:14:12Gibbs sampling solution.
0:14:14So we start by looking at the simulation data,
0:14:17where
0:14:17uh this but
0:14:18a we have one fund
0:14:19genes
0:14:20a with
0:14:21well some of us
0:14:23or are are this
0:14:24a this is called
0:14:26how a most rate with the a real situation
0:14:30oh
0:14:31later
0:14:31it's
0:14:32so and we assume there flight faster
0:14:35and
0:14:36and that are they a fully uh thirty five seven
0:14:39are we look at uh
0:14:40basic
0:14:42as
0:14:42are
0:14:43Here we look at the correlations with the real
0:14:47values,
0:14:47and the
0:14:49mean square error of the estimates. We also look at the sparsity
0:14:53estimation
0:14:54and the clustering performance.
0:14:55so
0:14:56just a an idea of how
0:14:58samples are were
0:15:00as there a case
0:15:01actually rather to the fast ball of course this
0:15:04and
0:15:05It depends on the different settings and so
0:15:09forth,
0:15:10but generally it converges relatively fast.
0:15:13So this is the actual cluster ID; in this case there were two clusters,
0:15:19and you can see it actually converges
0:15:21fairly
0:15:23fast.
0:15:23So we actually
0:15:25look at the performance
0:15:27for different noise conditions,
0:15:29in terms of
0:15:30the noise variance.
0:15:31For example, in this case
0:15:33we look at the estimate of the non-negative
0:15:39sparse matrix, and we look at the precision
0:15:43when the noise actually increases.
0:15:44oh of the precision actually a
0:15:47when the boys
0:15:48increases i
0:15:49precision actor goes
0:15:50well
0:15:51i it some are but
0:15:53a and then both goes down this give case
0:15:56and uh but uh if you look at the faster actually class simple form a rates uh with the
0:16:01and also the estimation
0:16:03a with increase of the noise
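Editor's note: the precision and recall the speaker refers to for a sparse matrix estimate can be computed on the pattern of nonzero entries. The sketch below is one straightforward way to do that in a simulation where the true matrix is known; the function name `precision_recall` and the small example matrices are hypothetical.

```python
import numpy as np

def precision_recall(true_matrix, estimated_matrix):
    """Precision and recall of the estimated nonzero pattern."""
    true_nz = true_matrix != 0
    est_nz = estimated_matrix != 0
    true_positive = np.sum(true_nz & est_nz)
    # Precision: fraction of estimated links that are real.
    precision = true_positive / est_nz.sum() if est_nz.sum() else 0.0
    # Recall: fraction of real links that were recovered.
    recall = true_positive / true_nz.sum() if true_nz.sum() else 0.0
    return float(precision), float(recall)

A_true = np.array([[1.0, 0.0], [0.0, 2.0], [0.5, 0.0]])
A_est = np.array([[0.9, 0.0], [0.3, 0.0], [0.4, 0.0]])
p, r = precision_recall(A_true, A_est)
```

Here two of the three estimated links are real and two of the three real links are found, so both values come out to two thirds.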
0:16:05And then we look at the database, because
0:16:07for the prior knowledge, the database has problems.
0:16:09So
0:16:10there are two types of problems, for example whether the database really covers all the known knowledge,
0:16:20so again the precision and recall problem.
0:16:23We set the precision of the database and look at the performance again, and you get better performance
0:16:28as the database precision
0:16:32increases,
0:16:36and we are able to really recover
0:16:40these
0:16:41regulations,
0:16:43the nonzero elements.
0:16:45So
0:16:47I can skip this
0:16:51one and
0:16:52jump
0:16:53right into the real
0:16:56data. For the real data, we are using
0:17:00The Cancer Genome Atlas;
0:17:03this is
0:17:03an NIH
0:17:04project.
0:17:05and we take a look at the meal
0:17:06a
0:17:08right
0:17:09a
0:17:09oh where and you know particular we we look at a a of a form
0:17:13haitian
0:17:14i
0:17:15yeah a gene expression data
0:17:17and then a about
0:17:19one patient
0:17:20i
0:17:22i
0:17:24i
0:17:25oh
0:17:25yeah
0:17:26uh what what we and we need to also have thing
0:17:30i
0:17:31and really look at a on their own the show what
0:17:34perdition
0:17:36oh
0:17:37but
0:17:38yeah you know just okay
0:17:39extract now
0:17:40that
0:17:40all these conditions having my
0:17:43or
0:17:51or
0:17:51actually patient
0:17:52samples
0:17:53in addition
0:17:55one norm
0:17:58we have forty that
0:18:00i
0:18:02yeah
0:18:02we go the original why
0:18:05just to ask
0:18:07or what are
0:18:14a
0:18:15i
0:18:16yeah
0:18:17i
0:18:18and
0:18:19i saw that my
0:18:21first fine
0:18:22so which in fact
0:18:23yeah seven michael
0:18:25okay
0:18:26and i
0:18:26So with this we go to the database to extract the genes
0:18:32that are possibly regulated by
0:18:35this set of transcription factors and seven microRNAs,
0:18:37and we come up with a hundred thirty-five genes.
0:18:41So
0:18:43this is to say that these hundred thirty-five genes
0:18:46are regulated by
0:18:48this set of microRNAs and transcription factors
0:18:51in
0:18:52maybe other conditions, because all of this prior knowledge is derived
0:18:57from many different conditions,
0:18:58so it is not necessarily true here.
0:19:03OK.
0:19:03So as to the prior knowledge: for transcription factor
0:19:07regulation we go to TRANSFAC and then extract the regulatory
0:19:13prior knowledge, and for microRNA regulation we actually have our in-house
0:19:17prediction algorithm, described in
0:19:19these two papers.
0:19:21So that's the setup of basically the experiments,
0:19:25and then this is the
0:19:27posterior probability of nonzero elements, and you can see that for most of
0:19:33them the probability of being nonzero
0:19:36is very, very small,
0:19:37and only a small set of probabilities are actually
0:19:40close to one.
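Editor's note: with a sampler such as the one described in the talk, a posterior probability that a link is nonzero is typically estimated as the fraction of retained samples in which that link's on/off indicator is on. The sketch below illustrates this under assumptions: the simulated `indicator_samples` stand in for real sampler output, and the 0.5 threshold is a hypothetical decision rule, not one stated by the speaker.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical stand-in for sampler output: one 0/1 indicator per candidate
# link per retained sample (rows = samples, columns = candidate links).
true_prob = np.array([0.02, 0.05, 0.97, 0.01, 0.99])
indicator_samples = rng.random((2000, true_prob.size)) < true_prob

# Posterior probability of a nonzero element:
# the fraction of samples in which the link is "on".
posterior_prob = indicator_samples.mean(axis=0)

# Hypothetical decision rule: keep links whose probability exceeds 0.5.
selected = np.flatnonzero(posterior_prob > 0.5)
```

Most candidate links end up with probabilities near zero and only a few near one, which matches the pattern the speaker points to on the slide.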
0:19:41So
0:19:43out of all these possible links we uncover a hundred
0:19:47or so regulations, so this is quite sparse.
0:19:50And interestingly,
0:19:52when we look at the uncovered regulations, there are about one fourteen already reported in the
0:19:58database,
0:19:59and eleven
0:20:00in the database are not really uncovered, but we pick up seven additional regulations
0:20:06which are not in the database.
0:20:08And then this is the regulatory network.
0:20:11Each node on this side represents a transcription factor, each node on this side represents a
0:20:17microRNA, and the circles here, the small nodes that are stacked together,
0:20:22represent genes.
0:20:24And each link has a very clear interpretation:
0:20:27a link here pretty much means that this transcription factor regulates that gene.
0:20:32And also
0:20:33we can use the loading matrix
0:20:37to
0:20:40indicate whether the regulation by a transcription factor is up-regulation or down-regulation,
0:20:44while for microRNA it is always down-regulation. And this is a heat map of the
0:20:49loading matrix, so there are a lot of zeros here.
0:20:58models and together with the measurement of the my car but this is the for change remember i tell you
0:21:04in the sample there's and there's a normal some so we basically use an more samples of control
0:21:08to calculate the full change otherwise transfer factor
0:21:11oh about should be all
0:21:13cost okay and then this is the the cost or that that's be uncovered
0:21:17by the model so basic it model un covers three cluster
0:21:21and then also it it's a see the expression levels are more less the same within in the cluster
0:21:26and then we look at the saliva for each group of but these cost
0:21:29see whether bit do of form some trouble socks of a sub group a sub D C stuff sub type
0:21:35answer
0:21:35so the we we look at is so we look at the a bible with the see whether after treatment
0:21:39that the the the the patient in the same group have a similar so bible
0:21:43so
0:21:44it's a seeing you all um they
0:21:46the different groups spatial in different groups
0:21:49in have a difference some why will meaning that that you this separation does been something okay can indicate basically
0:21:55you
0:21:56an next um maybe see a patient this be can say that this patient possible after after treatment and need
0:22:01longer that the patient in
0:22:02this point group okay
0:22:04and then we look at the the basic the P of the pure voice
0:22:08i these source some bibles
0:22:09and the
0:22:10and the a clearly shows the who uh the the a cost or one the faster
0:22:14to actually has to large as as a what different
0:22:17so
0:22:17they they can be really are used to
0:22:20in as a viable of the effect effect is up a tree
0:22:24And so we were actually going back to the microRNA and expression data to see whether
0:22:28you can
0:22:29come up with a similar result
0:22:32by simply using the microRNA data, the mRNA data, and the combined microRNA and gene data,
0:22:36without going through
0:22:38the factor analysis,
0:22:39just basically
0:22:40performing clustering directly on these
0:22:43individual data.
0:22:44And these are the p-values; actually
0:22:49this is the log p-value, and we have a
0:22:54significantly higher value than if you
0:22:56look at microRNA,
0:22:58gene, or microRNA and gene together without using the factor analysis.
0:23:02So this really shows the
0:23:04effectiveness of this
0:23:06factor model.
0:23:08So that pretty much concludes my talk.
0:23:11I'd also like to
0:23:12acknowledge the
0:23:14support from NIH.
0:23:19thank you
0:23:20thank you i
0:23:24One or two questions?
0:23:27Yes.
0:23:32In your examples
0:23:34you have
0:23:35quite a
0:23:36small number of genes.
0:23:39Is the factor analysis that you have
0:23:43restricted to
0:23:44very small samples?
0:23:46So far, yes.
0:23:47yeah
0:23:51I have an identifiability question about the model. Can you go back to the slide with
0:24:04the matrix
0:24:05formula?
0:24:06Yes, this one.
0:24:11So there is an identifiability problem: if you
0:24:16are only given Y and Z, can you find a unique solution for A, X and B?
0:24:25Yeah, a very good question.
0:24:27Actually this is something really worth studying here. I cannot give you a theoretical
0:24:32proof of whether there is
0:24:34ambiguity
0:24:36or not.
0:24:36But there are actually additional things I haven't really talked about; for example, we have to restrict that
0:24:42a
0:24:42the
0:24:43the call uh of the uh
0:24:46all
0:24:47the factor needs to have a a a a a unique there
0:24:50and uh
0:24:51and also of the columns of uh
0:24:53oh a case
0:24:55these to have a
0:24:56a the same
0:24:57where it's actually
0:24:58the car
0:24:59and also
0:25:00uh
0:25:01the the the hours of a at and Z should be so be we we have to do some three
0:25:06prof
0:25:07make that is a a and C to be
0:25:09in the
0:25:10same Q
0:25:11you know as you have a
0:25:12i can't define all a and B
0:25:14okay but a
0:25:15whether there
0:25:17what what is the competition i can't
0:25:19that i i i i can at
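Editor's note: the audio of this answer is largely unintelligible, so the following is general background on the question rather than a reconstruction of what the speaker said. In any factor model of the form Y = AX + BZ + E, the product AX has at least a scaling ambiguity, which is why some normalization constraint on A or X is usually imposed.

```latex
% For any diagonal matrix D with strictly positive diagonal entries,
% rescaling the columns of A and the rows of X leaves the product unchanged:
A X = (A D)\,(D^{-1} X)
% D^{-1} X stays non-negative whenever X is non-negative, and A D has the
% same sparsity pattern as A, so sparsity and non-negativity alone do not
% remove this ambiguity; fixing the scale of the rows of X (or of the
% columns of A) does.
```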
0:25:23okay okay thank you things
0:25:25for at the end
0:25:27and uh i i you back