So, hi everyone.
I'm here to talk a bit about Clang, LLVM and GNOME.
It may look a bit unrelated, but you are going to see the connection later.
So my plan here is to talk a bit about what LLVM is and what the technology is about.
I hope not to drag you too deep into compilers, because we just had lunch and you may fall asleep while I talk about low-level stuff.
People usually don't like to go into those details very much, so I'm trying to stay, as much as possible, at a high level of abstraction.
That way you can just see what Clang and LLVM do and what they offer; I don't want to bore you with how they work inside.
So first of all, what is LLVM?
LLVM is the name of the whole compiler infrastructure, which is composed of the front end, the IR, and several tools for emitting code, which we call the back end.
It has the same layers that any compiler has.
We call the whole framework LLVM, but LLVM is also the name of the IR level of this compiler.
It's also the name we give to the bitcode, which is essentially the same thing as a bytecode, but in the LLVM world we call it bitcode.
So with LLVM you can do static compilation just as you do with GCC.
For example, take a regular command line that you would use with GCC: you should be able to use the same one, just replacing gcc with clang, and it should give you the same results.
So, for example, for C source code, you pass it through Clang and it can emit executables directly, because it has an integrated assembler.
It can also emit regular assembly files, which you can pass through binutils to get regular object files for your architecture.
Besides that, you can use it as a set of decoupled tools.
For example, from the C code you can just generate LLVM IR, which, like I said before, we call bitcode.
From the bitcode you can do several different kinds of manipulation: you can run optimizations on the bitcode only, you can emit assembly files, you can emit object files, you can link bitcode files together.
There are all sorts of operations you can do on the LLVM IR.
So, in the example, we generate IR using Clang, then we can run several different optimizations and pass it to llc, which is the tool we use to emit assembly and object files.
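The flow just described can be sketched on a single small C file. This is only an illustration: the file name `sum.c`, the function `sum_to` and the file `other.bc` are made up for the example, and the exact option spellings and defaults differ between LLVM releases, so treat the commands in the comment as a starting point rather than a recipe.

```c
/* sum.c: a tiny translation unit to push through the decoupled tools.
 *
 * Drop-in use, with the same flags one would give GCC:
 *   gcc   -O2 -c sum.c -o sum.o
 *   clang -O2 -c sum.c -o sum.o
 *
 * Decoupled pipeline:
 *   clang -O1 -emit-llvm -c sum.c -o sum.bc   # C source -> LLVM bitcode
 *   opt -O2 sum.bc -o sum.opt.bc              # optimize the bitcode only
 *   llc sum.opt.bc -o sum.s                   # bitcode -> assembly file
 *   llc -filetype=obj sum.opt.bc -o sum.o     # bitcode -> object file
 *   llvm-link sum.bc other.bc -o all.bc       # link bitcode files together
 *                                             # (other.bc is hypothetical)
 */

/* Sum of the integers 1..n; returns 0 when n <= 0. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}
```

Because every step reads and writes ordinary files, any stage can be swapped out or repeated without touching the others.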
So every part of LLVM is a library: specific tools use specific libraries, and this way you can use only what you want.
You can generate bitcode, for example, and then compile it for some other target the way you want, using the optimizations you want.
It's good because you can do several tricks by decoupling the parts of the compiler.
So why use LLVM?
First of all, it's open source.
And I believe it's good because we can have several options. It's not about being better than GCC; it's just a different option.
It has different features and solves different interesting problems, which makes it a good choice to use.
So it's just one more option that can be used to build a lot of different things in the open source world.
It's quite easy to get into LLVM if you need to tweak the compiler; it's not a very steep learning curve.
It's easy to use it as a library and to contribute to the project, and the community is very receptive to newcomers.
So it's quite exciting to be contributing to LLVM.
Besides that, since the compiler is composed of several different phases and libraries, you can do optimizations in all phases of the compiler.
You can do them at compile time, just running regular optimizations on the IR.
You can do them at link time, by linking bitcode files together; I'll tell more about that later.
And at run time there is support for profiling, so you can do profile-guided optimization.
Besides that, there are tons of analysis and transformation passes which let you manipulate the bitcode in several different ways and make it suitable for whatever type of optimization someone wants to play with.
Besides that, the Clang front end of LLVM, which is also the name of the driver, has lots of libraries implementing tools to automatically find bugs and things like that.
So besides using it as a compiler, you can use it as a set of tools to check the correctness of your code.
I'm giving more examples of that a bit later in the talk.
So, to be more specific, so you can understand a bit of the idea:
it's written in C++. I know people may not like C++, but for LLVM it's quite a good language for writing the compiler, using the abstractions and everything.
As I told you before, you have several different libraries and you can use them the way you want.
And I already talked about this.
So, the front end: you can use a front end which is GCC-based.
With the addition, not quite new, of plugin support in GCC, you can use GCC to generate LLVM IR and then use the regular libraries and tools from LLVM to actually generate object code or executable code.
So you can use GCC to emit LLVM IR and make good use of both the optimizations from GCC and the optimizations from LLVM.
The plugin is called DragonEgg: you just run your code through GCC with it, you get LLVM IR, and then you can use the regular LLVM tools.
Besides that we have the Clang compiler, which is also built on this library approach.
It's good because, since the design is modular, you don't have to struggle with building cross compilers; everything is dynamic.
If you want to generate binaries for ARM, for example, you just need to pass the right option on the command line and you get an ARM binary.
Of course you need to have the ARM libraries and all the toolchain support around it, but if you have that, it's just a matter of changing the target you actually want to generate code for.
So this is a good thing about the very decomposed, library-based approach of LLVM.
Besides that, Clang is also known for emitting good diagnostics for your code, especially when you're programming C++ and messing with templates, which is when things get really dirty.
I would rather use Clang than GCC to build C++ when I want to understand a bit better what's happening with really deep template instantiation and that kind of stuff.
It provides really good diagnostics: you can follow the whole chain up and see where your macro is screwing things up or where your template is expanding in weird ways.
Besides that, it also has a static analyzer, which is a really good tool to check the correctness of your code and help find bugs automatically.
I'll show the static analyzer at the end of the talk.
Besides that, like I told you before, there are the optimizations. Since we have optimizations throughout the whole lifetime of the compiler, we can apply them on the IR, which is the most obvious place to put them, because we don't have to write optimizations specific to a target or to a specific back end.
So every optimization that is target-independent we can place here.
For example, if you want to have some sort of generic bitcode that can later be specialized for a more specific architecture, we can apply optimizations here and leave it there.
Then, when we're emitting code for a certain target, we can apply several different optimizations; for example, you can apply vectorization for x86.
Then, whenever you have the bitcode on your CPU: if your CPU only has SSE it can use that, and if it has AVX it can reason about wider vectors and emit AVX code properly.
Besides that we have llc, which is the tool responsible for actually invoking the back ends: it takes IR as input and it outputs assembly files or object files.
LLVM has tools to both assemble and disassemble code, and we also have an assembly parser.
So, for example, when you're writing C code and you use inline assembly, it actually invokes the same parser to parse the inline assembly, so it can reason about the assembly code you wrote inside the C code and give you proper diagnostics.
You can actually write optimizations that change instructions inside the inline assembly, or reason about constraints that you couldn't reason about yourself because the code was complex; it can also reason about the inline assembly and give you proper feedback.
You can also use assembly files as input, generate object files and link them together; you can play around the way you want.
Besides that, we also have a JIT compiler, with support for several architectures.
And we have LTO, which is used to link bitcode files together.
And like I told you, we have assemblers and disassemblers and several different tools to play with object code and mess around with the low-level stuff.
So how does that relate to GNOME at all?
That's what I'd like to introduce: how GNOME can benefit from LLVM, and how it can start using it to improve in several different ways, because there are a lot of tools in LLVM that can be used in GNOME to improve development.
So, first of all, GNOME already uses the LLVM and Clang libraries in some projects.
I don't know if there are more than two projects using them, but these are the ones that I looked up and found.
One is llvmpipe, which I was already aware of, and we have the developer over there if someone wants to talk with him.
It is actually a software renderer using LLVM: whenever you don't have a specific GPU for the graphics, it can fall back to using whatever your CPU has to process that, and it's a good fallback.
Besides that, I saw that there's a plugin called gedit-code-assistance, which uses Clang to give hints to someone who is programming, giving whoever is writing the program the kind of help that IDEs do.
Since GNOME is like an umbrella for several projects, I was playing around with the git repositories and with the GNOME tools, and I stepped into JHBuild and OSTree and all the buildbot stuff that is meant to be supported.
And I thought: why not use the static analyzer there, for example, or just build the whole code base using Clang and see what kind of difference it makes in the final executables?
It may be that some binaries will be slower than with GCC and some will be better; I don't know. It's just something that I think should be given a try.
So I think, for example, JHBuild and OSTree can benefit from trying Clang.
Other things I think the GNOME project can benefit from in LLVM: the ARM back end, for example for running GNOME 3 on a Nexus 7; it would be a good thing to try with Clang.
Besides that: using the LLVM bitcode tools to get smaller final binaries, which I will explain; the static analyzer; and some tools that let you dynamically explore several different types of bugs, like the address and memory sanitizers.
So, the ARM back end. LLVM has a very stable ARM back end which emits really good quality code.
Apple uses it for compiling iOS applications: in Xcode you compile with LLVM and Clang, and it uses the ARM back end to generate code for them. So it's pretty stable and pretty optimized.
It also supports the most recent processors of the ARMv7 family of ARM processors.
Besides that, llvmpipe, for example, could use the ARM back end. It already does if you want to generate ARM code, but it could be more optimized.
If you think of the case where you want to have GNOME running on ARM devices without specific GPUs, for example, it could use the regular vector instructions from the ARM instruction set, like NEON, to generate better code and have a good fallback for software rendering.
Another thing is LTO. With a regular compiler, when you're linking, you usually go from source code, generate object code for each source file you have, and then at the end you link all of them together, or you put them in an archive and link that into the final executable later.
That's how a compiler flow usually works.
But with LLVM and the bitcode representation, the way it works is that you can force it to emit bitcode, just by invoking Clang and asking it not to emit object code but to emit bitcode directly.
After that you can link all the bitcode files together and have a final module.
On that final module you can apply several different types of optimization, like I told you before, including link-time optimizations.
You can remove functions that are not being used inside that module, because the optimizer can reason about everything that is inside it; you can do things like interprocedural optimizations and a lot of other stuff.
Then you just invoke the back end on the final module and you have your final executable.
So this is the sort of thing that decoupling opens up in the compiler.
It usually produces pretty good code. So I think some GNOME projects could probably tweak the makefiles a bit to generate bitcode instead of object files, then link all the bitcode together and apply link-time optimizations.
I don't know; maybe it can benefit lots of projects. I would give it a try.
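A minimal sketch of the kind of code that link-time optimization can clean up. Everything here is hypothetical: the file name `mathlib.c`, both functions, and a separate `main.c` that is assumed to call only `scale()`. The `-flto` flag is a real Clang option, but whether the unused function actually disappears depends on the linker, the optimization level and the symbol visibility, so the comment describes what is likely rather than guaranteed.

```c
/* mathlib.c: one source file of a hypothetical two-file project.
 * A separate main.c (not shown, assumed) calls scale() only.
 *
 * Conventional flow, one object file per source file:
 *   clang -O2 -c mathlib.c -o mathlib.o
 *   clang -O2 -c main.c    -o main.o
 *   clang mathlib.o main.o -o app
 *
 * Link-time optimization, where bitcode is kept until the link step:
 *   clang -O2 -flto -c mathlib.c -o mathlib.o
 *   clang -O2 -flto -c main.c    -o main.o
 *   clang -O2 -flto mathlib.o main.o -o app
 */

/* Called from main.c, so it has to survive into the final executable
 * (possibly inlined into its caller). */
int scale(int x)
{
    return 2 * x;
}

/* Never called anywhere in the program. Compiling file by file, the
 * compiler cannot know that; once the whole program is one module, the
 * optimizer can see it is unreachable and is free to drop it. */
int unused_table_lookup(int i)
{
    static const int table[4] = { 1, 3, 5, 7 };
    return table[i & 3];
}
```

In a makefile-based project the change is mostly a matter of adding the flag to both the compile and the link commands.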
So the static analyzer is basically a source code processing tool based on Clang that tries to reason about your code and tell you if you have a bug in it.
And not only bugs: it also tells you about things you could have done differently that would improve the code, or useless things you are doing that you could remove.
It's really good because it can save you time debugging later. It can catch all sorts of bugs. It's not perfect, but it helps a lot.
You can also use it in some sort of automated way, and that's why I think it's good to put it in something like a buildbot.
Whenever you are already building the whole tree of your projects at the same time, you can just build it using the static analyzer and have a bunch of information about every project you're compiling.
You can use that to track regressions, to see that critical bugs are not appearing between one revision and another, or at least, before making stable releases, you can run it and see that nothing critical is being released to the public.
Since it's library-based, it's just a matter of using the library and building on that.
So, if I wanted to write a plugin that checks that you're passing the correct callback, how difficult would it be?
I can tell you for sure that it won't be very difficult, because you already have analyses that walk the ASTs, the syntax trees.
So you could probably use code that is already there to do whatever specific thing you want.
It shouldn't be difficult, because there's a framework for doing that; it's already there.
I can't tell you exactly how to do it, but it's not difficult.
I think right now you actually have to compile it in; I'm not sure.
We have libclang and we have the other libraries that you can write plugins against, but for the specific type of plugin that you'd like to write, I don't know which is going to be the best place to implement it.
We have right now about three places where you can hook in and implement analyses.
For some things you can just load your analysis dynamically into the compiler and pass it as a parameter, but for other things you have to hack on the compiler to do it.
That's what I'm saying: there are several hook points, but not all of them can just be passed a plugin as a parameter.
It could be; I don't know. We can look at that later, after the talk.
But we have several different interfaces where you can plug analyses into Clang.
The static analyzer also helps with improving code quality, for sure, by giving direct feedback on what sort of mistakes people are making in their code.
It also has false positives, because it's not perfect. Sometimes it will say that it looks like there is a bug when there is not.
The problem is that the analyzer is just not smart enough to see that you already checked something earlier, in some other function for example.
But in general it works well, and you can also report bugs to LLVM so it can be improved, or send a patch yourself.
And of course compilation is a lot slower using the static analyzer, but you shouldn't be using it to produce production code. It should be a separate compilation, so you can see your analysis with no impact on your build time and things like that.
Using the static analyzer is very simple once you have Clang installed.
You have packages for Ubuntu, I don't know about Fedora, and they ship the scan-build tool.
You just put scan-build before configure and before make, and it hijacks the environment to make it use the Clang tools instead of the regular compilers and analyze your code.
Then it emits several HTML reports telling you about all sorts of different bugs.
So I analyzed a few GNOME projects from the GNOME git repositories, to see how they would react to the static analyzer and what sort of bugs we could find.
I put the results on the web. I have done it for just a few, ones that I was seeing people talking about in other talks, and I chose a few projects just to show how it works.
So, for example, I ran it on Glade, to see how the source code behaves regarding the bugs the static analyzer can find, and I found that it's very clean.
There are a few dead assignments, a few dereferences of null pointers, some issues with call arguments, but I find it very clean compared to other projects I've run it on.
So you can get an idea: this part, for example, is actually not a problem, just something that you could avoid, so you would have slightly smarter and slightly faster code; that's all.
What you see here is an HTML report. You can click on it: there are several dead assignments, and if you click on the link you can see the full report and go around and look at whatever type of bug you want.
So, for example, for a random dead assignment in Glade, it shows you that the value stored to the variable is never read after this assignment.
So there is no point in doing it; you can just remove this code and probably make your code a bit faster.
It can sometimes be a false positive, because sometimes it may not reason perfectly about what a function is returning, but it generally works really well.
So that's one example, for Glade.
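To make the "dead assignment" report concrete, here is a small self-contained sketch. It is not code from Glade; the function names and the underscore-skipping logic are invented for the illustration. Running the analyzer over a build (for instance `scan-build make`, as mentioned in the talk) would be expected to flag the first function with a message along the lines of "value stored to 'len' is never read", though the exact wording depends on the Clang version.

```c
#include <stddef.h>
#include <string.h>

/* Counts the characters of a label, skipping '_' marker characters.
 * This variant contains a dead store on purpose. */
size_t label_width_dead_store(const char *label)
{
    size_t len = strlen(label);  /* dead store: this value is never read */
    len = 0;                     /* overwritten before any use */
    for (const char *p = label; *p != '\0'; p++) {
        if (*p != '_')
            len++;
    }
    return len;
}

/* Same behaviour with the useless store (and the strlen call) removed. */
size_t label_width(const char *label)
{
    size_t len = 0;
    for (const char *p = label; *p != '\0'; p++) {
        if (*p != '_')
            len++;
    }
    return len;
}
```

Both functions return the same result; the report is about wasted work and possible confusion, not about a crash.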
I also ran it on GStreamer, and this one is interesting because, of all the projects I ran it on, GStreamer was the one that had divisions by zero. I found about seven divisions by zero in GStreamer.
So let's look at the report, to see how it shows you the bug.
It's very cool because it shows you the steps: it shows you each branch it took to get to that point, and each condition it assumed in order to take the branch.
For example, in this case it goes into the first function call. From the implementation of the function, which you can also click on to see, it sees that the value returned may be zero.
And if you follow the code, assuming for example that this value is greater than or equal to that one, that can lead to a division by zero in your code.
So it's good for finding all the simple stuff that sometimes gives you really weird bugs where you can't tell what's happening.
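The shape of such a report can be sketched with a few lines of C. This is not GStreamer code: `bytes_per_frame`, `frames_in_buffer_unchecked` and `frames_in_buffer` are hypothetical names chosen for the example. The point is that the zero comes from one function and the division happens in another, which is the kind of path a path-sensitive analyzer can follow step by step.

```c
/* Size of one audio frame in bytes, or 0 when the format is unusable. */
static int bytes_per_frame(int channels, int width_bits)
{
    if (channels <= 0 || width_bits < 8)
        return 0;                        /* the path that yields zero */
    return channels * (width_bits / 8);
}

/* Risky variant: on the path where bytes_per_frame() returns 0 this
 * divides by zero. An analyzer would typically show the branch taken
 * inside bytes_per_frame() and then point at the division. */
int frames_in_buffer_unchecked(int size, int channels, int width_bits)
{
    int bpf = bytes_per_frame(channels, width_bits);
    return size / bpf;
}

/* Guarded variant: the zero case is handled before dividing. */
int frames_in_buffer(int size, int channels, int width_bits)
{
    int bpf = bytes_per_frame(channels, width_bits);
    if (bpf == 0)
        return -1;                       /* report the unusable format */
    return size / bpf;
}
```

The unchecked variant is fine for valid formats, which is exactly why this class of bug tends to survive ordinary testing.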
I also ran it on evolution-data-server, because the static analyzer can also show a class of problems that it considers security issues.
So, for example, here we have this issue with setuid. And I thought: well, this is evolution-data-server, and you have a security issue with a very important function, one that is mainly about privilege escalation.
So I just went there to see what was happening. It was saying that the return value of this setuid call is not checked, and that's considered dangerous.
But this is actually a false positive, because it had already been checked before: the code was calling it again without checking what it returns, because it had already checked earlier what was returned.
I also ran it on GTK+ to see how many bugs there would be, and there was actually a bunch of them.
There could probably be more if the static analyzer could be taught to reason about some constructs of GTK+, because it uses really heavy macros and things like that.
But it's good for finding simple, straightforward stuff.
So, I'll go through this quickly.
So this is the report; we have around thirty-three entries,
but a lot of them are actual bugs, mostly logic errors, probably not security issues.
do to another computer user and i mean
just to analyser
i don't know i look it up i found some people that running to right
running on V but i didn't see the output
also decaf all some blocks same but people or not like just display there i
mean the book another graphical computes
I don't know, I didn't look it up. I think EFL, at least the developers I know from EFL, quite like Clang and they use it in their projects, but I don't know if they use it on their buildbots and things like that.
Okay. So I think we have the sanitizers.
The sanitizers, and I believe you also have them in GCC, can reason about memory accesses and addresses at run time.
It's not static analysis anymore; they are libraries that you link together with your code, which work a bit like Valgrind, for example, and do dynamic checks.
So, for example, we have the address sanitizer, which basically checks for out-of-bounds accesses in any type of memory, and also for use-after-free bugs.
Those types of error may sometimes not be found by the static analyzer, so checking dynamically can give you better results.
I didn't have enough time to run all those projects through it like I did with the static analyzer, but I ran GTK+ with it and I couldn't find any runtime errors.
Maybe I didn't explore those problems enough, because I don't know the corner cases that could probably reach strange code, but I played around with it and it didn't find anything.
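As an illustration of the two error classes just mentioned, here is a short sketch. The function names are invented; the code as written is correct, and the comments mark where a one-line slip would turn it into a heap buffer overflow or a use after free. With Clang, building with `-g -fsanitize=address` and then running the program is the usual way to get such slips reported at run time.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of src, or NULL if allocation fails. */
char *copy_string(const char *src)
{
    size_t n = strlen(src);
    /* Allocating only n bytes here would leave no room for the
     * terminating '\0'; the memcpy below would then write one byte past
     * the buffer, a classic heap buffer overflow. */
    char *dst = malloc(n + 1);
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, n + 1);
    return dst;
}

/* Returns the first character of s and releases the buffer. */
int first_char_then_free(char *s)
{
    /* The read happens before free(). Swapping the next two statements
     * would read freed memory, which is a use after free. */
    int c = (unsigned char)s[0];
    free(s);
    return c;
}
```

Both slips often go unnoticed in a normal run because the stray byte or the stale read usually lands in memory that still looks valid.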
And we also have the memory sanitizer, which is more like Valgrind, actually: it detects uninitialized reads.
All those tools slow your code down, because they have a runtime library that instruments your code and everything.
I used JHBuild for a few projects, trying to build them with the memory sanitizer, and it breaks.
After you compile the final executable, just running regular make will run a few tests, and it will just explode because there is a bunch of uninitialized reads.
I couldn't even get through a build with JHBuild, because it was reporting uninitialized reads everywhere.
So it's probably worth taking a look to see if it's some sort of false positive. I don't know, I don't believe so, but just to see what happens and whether it can be improved.
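A small sketch of what an uninitialized read looks like in practice. The struct and both functions are hypothetical. The code below is correct as written; the comments show which line, if forgotten, would leave a field uninitialized and where a tool such as Clang's `-fsanitize=memory` would be expected to complain, since it reports when an uninitialized value influences a branch.

```c
/* A width/height pair, as a stand-in for any partially filled struct. */
struct size {
    int width;
    int height;
};

/* Fills in every field of *out. */
void size_init(struct size *out, int w, int h)
{
    out->width = w;
    out->height = h;   /* forgetting this line leaves height uninitialized */
}

/* Area of s, or 0 when either dimension is not positive. */
int size_area(const struct size *s)
{
    /* If height had never been written, the comparison below would
     * branch on garbage; that branch is where the report would appear. */
    if (s->width <= 0 || s->height <= 0)
        return 0;
    return s->width * s->height;
}
```

Without instrumentation the garbage value is often zero or some plausible number, so the program appears to work until the memory layout changes.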
But that's it.
Questions?
Can the static analyzer detect a situation where you have a while loop and the variables in the condition are never modified inside it?
That happens, for example, when you write a loop and then forget to assign to the variables mentioned in the condition.
Sorry, could you repeat that again?
So suppose you have a while loop that checks, for example, that one variable is greater than another, and then in the body of the loop you forget to assign anything to either of those variables. So if the condition is true initially, it will loop forever. Can the static analyzer catch such things?
For the most simple case it can probably tell you that it will run forever. If it's a simple case, I think it can.
You can probably write code to try to break it, you know; you can probably come up with some weird case that it couldn't figure out.
But it will get better.
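The loop shape described in the question can be written down in a few lines. The function name and parameters are invented for the example. The buggy form is shown only in the comment; the compiled function includes the update that is easy to forget. Whether a given analyzer version reports the buggy form is not something this sketch asserts, in line with the hedged answer above.

```c
/* Buggy shape from the question, shown as a comment only:
 *
 *     while (remaining > limit) {
 *         steps++;              // neither remaining nor limit changes,
 *     }                         // so a true condition stays true forever
 */

/* Counts how many subtractions of step bring remaining down to limit or
 * below. Returns -1 when step cannot make progress. */
int steps_until_below(int remaining, int limit, int step)
{
    int steps = 0;
    if (step <= 0)
        return -1;               /* a non-positive step would never terminate */
    while (remaining > limit) {
        remaining -= step;       /* the update that is easy to forget */
        steps++;
    }
    return steps;
}
```

Guarding against a non-positive step closes the same hole from the other side: the condition variables change, but never in the direction that ends the loop.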
So I just have to make this comment: I found it quite funny that you had to excuse the use of C++ at the beginning of your talk, in this crowd.
Because when you look into your results, it looks like a lot of them are macro expansions that are not doing proper type checking, because of programming against GLib using C.
If this program was done in C++ it would probably be type checked and these errors would not be occurring.
So, just a comment.
I guess this is more of an administrative issue, but I actually found a bug in the Clang static analyzer and I reported it.
The very annoying thing is that it seems LLVM or Clang are driven by Apple, so I just got this mysterious message saying "cloned to the internal Apple bug tracker", and as far as I know it's still open with an unresolved status.
I'm curious if there are plans about getting it away from the Apple infrastructure, because it's disappointing that when I report a bug there's just this unknown Apple bug tracker and I have no idea what's going on internally.
So I don't really know how much Apple controls LLVM, or if you could give some words on that.
As a matter of fact, one fact is that a large part of the developers are hired by Apple, and so they probably do whatever is easier for them to track.
I don't know; I don't have anything more on that.
Thanks for the talk, it was nice. I want to ask:
I know that it's not really trivial to switch from GCC to Clang, you know, to build your code with Clang and not with GCC. Do you have a tool to help developers with this?
I couldn't understand it; can you repeat?
Okay, so imagine you have your code or project and you are using GCC, and now you want Clang as the compiler.
I know it's not really trivial to make this change. Do you have a tool to help developers with this?
Well, actually Clang tries to support most of what GCC does, including the command line, so you should expect that just changing the GCC driver for the Clang driver will work.
It works for most cases, unless your program is using some weird extension that LLVM or Clang is not going to support; then you may run into problems.
But regarding compiler flags, it tries to support everything that GCC supports, just to make sure that it can be a drop-in replacement: you can just replace it and it should work.
That's how we think it should be.
Okay, because I know that some people tried to build whole distributions with Clang and had to do some real hacks to make it happen, to get, I don't know, fifty percent of packages building.
So I just wanted to ask.
Okay; like I told you, maybe there are some extensions in the code that Clang doesn't support, but regarding the compiler flags, most of them are supported, or at least ignored if they don't make sense for how LLVM does things internally.
So you shouldn't have your compilation broken because some specific GCC flag is not there.
Okay.
Regarding the static analysis, I want to ask: is this run on the IR or on the actual source code?
Sorry?
The static analysis: is it run on the IR?
No, it's before that: it's on the C AST form, during the front end.
By the time you get to the IR you have already lost a lot of information and it's a lot more difficult to reason about.
So it's up there.
It's before; okay, thanks.
I first tried libclang, the C library for accessing Clang; it gives you access to the AST and everything.
I found it incredibly restricting compared to what you get from the traditional C++ libraries.
Is there any work being done to improve that, and is it going to be a little bit more convenient?
Right now, if I want to get a string for a token, I get a CXString and I have to go and get a C string from that.
It's really inconvenient.
I have heard a lot of complaints about that, but it's also because the C++ API changes a lot, right? So they don't have any stable API that they're trying to keep.
I'm sure that if someone writes a well-written proposal they may consider it, because it's something that interests a lot of people, but I haven't seen any work in that direction.
I'm not saying it doesn't exist, but at least I'm not aware of it, and I have seen people complaining about that for several years now.
So if we're going to start to build things around Clang in GNOME, in terms of building static analyzers and code checkers, would it make sense to instead try to wrap the C++ libraries in GObject?
Because it is really inconvenient to use all of the C++; we are mostly C programmers here.
I think it would probably be better to try to push for some stable API, maybe a C API, I don't know. It's probably easier because you could just use it directly from C.
If we use GObject and wrap that, then we can use GIR, Python and whatnot to write fairly complex scripts very fast.
But I don't know; are you already wrapping C++ stuff?
I just don't know. I think they're both ways to solve it; I don't know which one is better. I'd love to tell you what I think would be better, but I don't use GI myself, so I'm not aware of all the trouble it may lead to.
But it's something that we can discuss later and take a look at, and I can give you feedback if I have anything to contribute.
I guess my comment mostly is that actually maintaining a stable API would be huge.
Of course, I agree.
You specifically mentioned the ARM back end. I'm curious about other non-x86 architectures, like s390x. Do you know what the status of the level of support is on that?
I don't think it's stable enough, if it has it; I don't know. The last time I looked...
There's an LLVM back end for every architecture that both Fedora and RHEL support.
So, the caveats there: PowerPC 32 and S/390, I'm not sure whether they work or not.
The 64-bit Power and S/390 work.
ARM 32 works, and by then I expect that the ARM 64 stuff works.
x86, maybe.
it has more back-ends and that actually there's
there's a the backend for our six hundred
and later at and gpus there's a back and forth in called P C X
which is what people are using
sort of common i are
opencl
and a couple of other architectures we don't have to work
or like that
textile and
and then that'd chips
They're in varying stages of quality, and you occasionally find something like "failed to serialize this instruction correctly", but they're pretty much fine.
3.3 is far more consumable in this respect than 3.2 was; 3.2 was missing several of these back ends, System z among them.
If any of you want, just feel free to talk to me about LLVM and Clang, or anything.
Thank you.