0:00:15 | Thank you, Brian. |
---|
0:00:16 | Good morning. |
---|
0:00:17 | It is my honour to be here |
---|
0:00:19 | and this is my first time at Interspeech. |
---|
0:00:23 | I thought that |
---|
0:00:24 | Interspeech has an interesting culture where before the talk there is dancing and singing, |
---|
0:00:30 | but it was not offered to me. |
---|
0:00:32 | So |
---|
0:00:33 | and I saw many old friends here, many familiar faces |
---|
0:00:37 | and this community is technically like a cousin to me. |
---|
0:00:43 | I have learnt a lot from this community. |
---|
0:00:46 | I work in signal processing in general |
---|
0:00:52 | with applications to communications, multimedia and many other things, but not speech. |
---|
0:00:56 | I don't know why, but I found the answer yesterday. |
---|
0:01:09 | When I first came to America as a teaching assistant, |
---|
0:01:13 | I was a teaching assistant for signals and systems |
---|
0:01:16 | and I had to teach students the linear shift-invariance, |
---|
0:01:21 | this concept. |
---|
0:01:23 | And every time they were laughing behind me, |
---|
0:01:25 | so I asked them once: Why are you laughing? |
---|
0:01:28 | And they told me. |
---|
0:01:30 | F. You need to pronounce your F. |
---|
0:01:35 | That's why I cannot do speech processing. |
---|
0:01:38 | So |
---|
0:01:40 | today |
---|
0:01:42 | just like Brian said, |
---|
0:01:43 | there's no speech in this keynote; if I submitted it as a paper to Interspeech it would |
---|
0:01:47 | simply be rejected, |
---|
0:01:49 | because it doesn't match any heading or keyword at all. However, |
---|
0:01:58 | I hope this can be useful to you. I learnt things like hidden Markov models |
---|
0:02:03 | and deep learning |
---|
0:02:04 | from this community and I hope that what you see today, the ideas, may be |
---|
0:02:08 | useful to you. |
---|
0:02:10 | Okay, |
---|
0:02:10 | so the story begins. |
---|
0:02:12 | First |
---|
0:02:14 | social media. |
---|
0:02:15 | This is a new phenomenon. More and more decisions and |
---|
0:02:20 | activities in our daily life are being recorded, tracked and shared. |
---|
0:02:25 | It's scary, right? |
---|
0:02:26 | Yes, it's scary. This is a kind of user-generated big data. |
---|
0:02:31 | For example, our motions, our movements everywhere are being tracked. |
---|
0:02:35 | Yes, it is scary. However it is also a 'plus'. |
---|
0:02:40 | It opens some potential, very interesting and important research directions for us. |
---|
0:02:45 | And machine learning is a very powerful tool, |
---|
0:02:49 | as is being used in this community and also potentially |
---|
0:02:52 | in many other enterprises, such as social media. |
---|
0:02:56 | Machine learning aims to use reasoning to find new and relevant |
---|
0:03:00 | information given some background knowledge. There are many applications |
---|
0:03:04 | and, |
---|
0:03:05 | as you all know, and I believe many of you use it, machine learning consists of three |
---|
0:03:10 | elements: |
---|
0:03:10 | representation, evaluation, optimization. I will not get into the details of these. |
---|
0:03:15 | Although it is a very powerful tool |
---|
0:03:17 | there are limitations and constraints. |
---|
0:03:20 | Why? |
---|
0:03:21 | Because the generalization assumption may not hold. |
---|
0:03:24 | Especially in user-generated data scenario. |
---|
0:03:27 | Why? |
---|
0:03:28 | Because users behave differently at different times under different settings. |
---|
0:03:34 | You behave very differently when you are hungry, or |
---|
0:03:36 | when you need to go to the restroom and cannot sit here, yeah? |
---|
0:03:40 | At different times we may behave in different ways. |
---|
0:03:42 | And the training set may not be statistically consistent |
---|
0:03:46 | with the testing set. |
---|
0:03:48 | Yet that is the assumption in these machine learning algorithms, |
---|
0:03:52 | and also a single objective function |
---|
0:03:54 | cannot cover users' |
---|
0:03:56 | different interests. |
---|
0:03:58 | Users have different interests and therefore they have different objective functions. |
---|
0:04:02 | And users are also rational and therefore they are all selfish. They all want to |
---|
0:04:08 | optimize their own utility. |
---|
0:04:10 | And we need to consider this. |
---|
0:04:12 | So |
---|
0:04:14 | the knowledge contained in the data |
---|
0:04:17 | can be difficult to fully exploit from |
---|
0:04:20 | this macroscopic machine learning view, although it's very powerful. |
---|
0:04:25 | Because the data are the outcome of users' interactions. |
---|
0:04:29 | We cannot ignore the users' interactions. |
---|
0:04:32 | We need to explicitly consider interaction of users |
---|
0:04:36 | and basically that means their decision processes |
---|
0:04:38 | should be taken into consideration. |
---|
0:04:41 | And in the sequel you will find out that game theory is a |
---|
0:04:45 | powerful tool |
---|
0:04:46 | that can study users' strategic decision making, because we can take |
---|
0:04:51 | multiple objective functions into account at a single time and we can |
---|
0:04:54 | also take users' local and individual interests |
---|
0:04:58 | into consideration. |
---|
0:05:01 | Okay, |
---|
0:05:02 | let's face it. |
---|
0:05:03 | Learning is for what? Learning is for making optimal decisions. We don't learn just for |
---|
0:05:10 | fun. |
---|
0:05:10 | And learning and decision making are coupled together |
---|
0:05:13 | due to network externalities. What is that? What do I mean by network externalities? I mean |
---|
0:05:18 | that my decision, my action |
---|
0:05:20 | will affect your decisions and your behavior. |
---|
0:05:24 | We all affect each other. You made a decision to come |
---|
0:05:26 | here and I made a decision to come here; that's why we are all here. |
---|
0:05:30 | We affect each other. However what is the problem? |
---|
0:05:33 | The problem is that there is a missing link. |
---|
0:05:36 | And the missing link is that |
---|
0:05:38 | in a very traditional sense machine/social learning and strategic decision making are still two |
---|
0:05:45 | unrelated disciplines. |
---|
0:05:47 | And so here |
---|
0:05:48 | we propose |
---|
0:05:49 | a notion of decision learning to bridge in between, |
---|
0:05:53 | so that |
---|
0:05:54 | when we are doing learning we also need to consider the decision making process. |
---|
0:05:59 | So this is the |
---|
0:06:00 | title of the |
---|
0:06:01 | talk. By 'decision learning' we mean learning with |
---|
0:06:06 | strategic |
---|
0:06:07 | decision making. |
---|
0:06:09 | Okay, |
---|
0:06:10 | I'm going to use three examples to illustrate |
---|
0:06:12 | how decision learning works. |
---|
0:06:14 | First, decision learning with Evolutionary User Behavior. Second is with Sequential User Behavior, and then |
---|
0:06:21 | we'll talk about how we design mechanisms |
---|
0:06:23 | from a system perspective. |
---|
0:06:26 | First let's talk about Evolutionary User Behavior. Here I want |
---|
0:06:30 | to use information diffusion over social networks as an example. |
---|
0:06:34 | Okay, |
---|
0:06:35 | now you see |
---|
0:06:37 | these are the Twitter hashtags during the 2008 US Presidential election. |
---|
0:06:43 | It |
---|
0:06:44 | captures the spreading of comments and phrases by candidates. |
---|
0:06:48 | And look at the |
---|
0:06:49 | pink one; you can see that the pink one, |
---|
0:06:51 | this one, says 'it is lipstick on a pig'. |
---|
0:06:54 | It is a phrase by the famous Sarah Palin, if you know her. |
---|
0:06:58 | And you can see that when a phrase is commented on |
---|
0:07:01 | there is a duration: it rises to the top |
---|
0:07:04 | and then eventually it decreases and goes down. |
---|
0:07:08 | You see that everything has a duration. |
---|
0:07:10 | And |
---|
0:07:11 | this is for political comments, |
---|
0:07:14 | and this was done in 2008. Now, if you are a politician you want to ensure |
---|
0:07:22 | that your |
---|
0:07:23 | message can be delivered. When will it reach the peak? |
---|
0:07:26 | When will the starting point be and when will the end point be? |
---|
0:07:28 | It has to be before the election, not after, right? |
---|
0:07:31 | Okay, now this is the other point. |
---|
0:07:36 | That is online advertising for a new product. |
---|
0:07:36 | How to predict |
---|
0:07:38 | the popularity of this advertisement and eventually what is the market share? |
---|
0:07:45 | Can we predict that? |
---|
0:07:47 | Okay, all of this relates to one problem that we call the information diffusion problem. |
---|
0:07:51 | How does information diffuse? |
---|
0:07:54 | Users exchange information over social |
---|
0:07:56 | networks. Okay, we're talking about this on our social networks, |
---|
0:07:59 | and the study of this information diffusion |
---|
0:08:01 | problem is important. Why? Because if we understand the dynamics |
---|
0:08:06 | of this information diffusion, we can predict and maybe control |
---|
0:08:10 | the start/peak timing and distribution. |
---|
0:08:13 | And we can estimate the final reach of the population; |
---|
0:08:16 | also, perhaps we can identify the influential users and links |
---|
0:08:21 | for our purpose. |
---|
0:08:23 | Okay, |
---|
0:08:24 | now |
---|
0:08:25 | Dynamics In Information Diffusion. |
---|
0:08:28 | It is a sequence of decision making processes. |
---|
0:08:31 | It is not what we said, what we thought, that information just |
---|
0:08:35 | spreads by itself. |
---|
0:08:37 | No. |
---|
0:08:38 | For information to diffuse |
---|
0:08:40 | it relies on other users to forward or post the information they receive. |
---|
0:08:45 | Like I have my social network and if you are in my network, I post |
---|
0:08:48 | something and you will see that |
---|
0:08:50 | and once you see that you have to make a decision to post it or |
---|
0:08:53 | not |
---|
0:08:53 | and from there, when you are in somebody's social network, |
---|
0:08:58 | everybody has a decision making process: Should I post? Is the information exciting? |
---|
0:09:03 | Are our friends interested in it? Or I don't want to post because |
---|
0:09:07 | it could be embarrassing. |
---|
0:09:09 | There is a decision making process here |
---|
0:09:11 | and information diffusion is an evolving process |
---|
0:09:15 | like our evolution. |
---|
0:09:17 | So |
---|
0:09:18 | for information diffusion we ask |
---|
0:09:21 | whether to post |
---|
0:09:23 | or forward or not. |
---|
0:09:25 | And for the evolution process, |
---|
0:09:27 | when there's a gene that has a mutation, |
---|
0:09:29 | we ask |
---|
0:09:30 | whether to take the mutation or not. |
---|
0:09:33 | So it's similar, so we want to relate these together and model this problem. |
---|
0:09:37 | And a social network is always illustrated by a graph structure, and we understand that, so |
---|
0:09:42 | we are using the |
---|
0:09:43 | Graphical evolutionary game. |
---|
0:09:45 | Now |
---|
0:09:46 | what is this Graphical Evolutionary Game? Let me make a very simple explanation. |
---|
0:09:51 | Each player has |
---|
0:09:52 | a notion of fitness. |
---|
0:09:55 | How fit are you? How fit am I? |
---|
0:09:57 | Okay, and so this is the evolution that we all know: the stronger one, the |
---|
0:10:03 | fittest, will survive. |
---|
0:10:05 | A user's fitness |
---|
0:10:07 | is determined by this; as you can see, I try to use this equation, and |
---|
0:10:11 | this is the user's fitness. |
---|
0:10:13 | And B is the baseline fitness that |
---|
0:10:15 | depends on myself, okay. |
---|
0:10:17 | And this U |
---|
0:10:18 | is the interaction payoff |
---|
0:10:20 | determined by the environment. |
---|
0:10:22 | And this alpha is the selection intensity. |
---|
0:10:26 | So if I'm going to interpret this I would say |
---|
0:10:28 | that for |
---|
0:10:29 | someone, like someone here, Haizhou Li here in Singapore, |
---|
0:10:34 | his fitness depends on his own baseline fitness |
---|
0:10:38 | plus the selection intensity multiplied by the overall environment fitness of Singapore; together, that is |
---|
0:10:44 | his overall fitness strength. |
---|
0:10:47 | I think that makes sense. |
---|
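A minimal sketch of this fitness rule, assuming the weighted form commonly used in the graphical evolutionary game literature; the slide's exact equation is not reproduced in the transcript:

```python
# Hedged sketch: fitness as a mix of baseline fitness B and interaction
# payoff U, weighted by the selection intensity alpha. A common form is
# f = (1 - alpha) * B + alpha * U (weak selection when alpha is small).
def fitness(baseline: float, payoff: float, alpha: float) -> float:
    assert 0.0 <= alpha <= 1.0
    return (1.0 - alpha) * baseline + alpha * payoff

# Example: baseline fitness 1.0, environment payoff 0.6, alpha = 0.3.
print(fitness(1.0, 0.6, 0.3))  # 0.88
```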
0:10:48 | So |
---|
0:10:50 | an evolution process has two important elements: one is selection and one is randomness. |
---|
0:10:56 | These two are very important in the evolutionary process. |
---|
0:10:59 | So let me use this example. We have a graph |
---|
0:11:02 | and then randomly we select |
---|
0:11:05 | a node |
---|
0:11:06 | that can be a user. |
---|
0:11:08 | And once we select this one we have already randomness in here, now |
---|
0:11:13 | we are going to make selections, okay. |
---|
0:11:15 | So |
---|
0:11:16 | we have to compute fitness. |
---|
0:11:19 | Its own fitness and also its neighbours' fitness. |
---|
0:11:23 | I want to compute all these fitnesses and we select |
---|
0:11:26 | the strongest fit |
---|
0:11:28 | to imitate. |
---|
0:11:29 | It doesn't have to be imitation, I just used imitation as an example. |
---|
0:11:33 | There's the birth-death process and the death-birth process; there are many different kinds of processes. |
---|
0:11:36 | Okay, |
---|
0:11:37 | so |
---|
0:11:38 | we need to know which one is the best |
---|
0:11:42 | fit |
---|
0:11:42 | and then we are going to imitate |
---|
0:11:45 | and |
---|
0:11:46 | this is the best fit and we imitate, |
---|
0:11:49 | okay. |
---|
0:11:49 | So this is |
---|
0:11:51 | a typical evolution process that is determined |
---|
0:11:55 | by users' fitness. |
---|
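As a sketch, one round of this select-and-imitate update might look like the following; the graph representation and the imitation rule here are illustrative assumptions:

```python
import random

def imitation_step(graph, strategy, fitness):
    """One update: randomness picks a node, selection picks whom to imitate.
    `graph` maps each node to its neighbours; `strategy` and `fitness` map
    each node to its current strategy and fitness value."""
    node = random.choice(list(graph))                 # randomness
    candidates = [node] + list(graph[node])           # itself plus neighbours
    best = max(candidates, key=lambda n: fitness[n])  # selection: the fittest
    strategy[node] = strategy[best]                   # imitate the best fit
```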
0:11:57 | Now |
---|
0:11:58 | Information Diffusion Over Online Social Networks |
---|
0:12:00 | I have just said that |
---|
0:12:03 | an online social network can be represented as a graph. |
---|
0:12:06 | As we can see, it's a graph. |
---|
0:12:07 | And this information diffusion depends on the users' actions; whether to forward or not depends on |
---|
0:12:13 | utility. |
---|
0:12:14 | If I like it, or my friends may like it, I will calculate the utility of |
---|
0:12:19 | that and if |
---|
0:12:20 | that utility is strong I'm going to forward or post. |
---|
0:12:23 | If that is embarrassing then I am not going to do it. It'd damage my |
---|
0:12:27 | reputation. |
---|
0:12:28 | So this is very similar to the |
---|
0:12:31 | Graphical evolutionary game. |
---|
0:12:33 | Now on the left hand side |
---|
0:12:35 | is the Graphical evolutionary game. On the right hand side we have the Social network, |
---|
0:12:39 | okay. |
---|
0:12:39 | Now |
---|
0:12:40 | there we have notion of fitness as I |
---|
0:12:43 | explained. On the right-hand side we have the user's utility. Now |
---|
0:12:46 | basically we try to model the entire problem using this utility. |
---|
0:12:51 | The question is whether we can do so. So on the left-hand side, |
---|
0:12:55 | the question is how can we map |
---|
0:12:58 | and relate that utility, our interest, to the notion of fitness, |
---|
0:13:02 | and then we can use the graphical evolutionary game, which is a |
---|
0:13:05 | very powerful tool that has been useful for some time, that we can use. |
---|
0:13:09 | Okay, so how can we calculate that? I'll use a very simple example here. |
---|
0:13:15 | Now let's look at this graph. |
---|
0:13:17 | Now we have a user. |
---|
0:13:20 | He chooses the strategy to forward. |
---|
0:13:23 | Okay. What is his own fitness? |
---|
0:13:25 | Here I choose B=1. |
---|
0:13:27 | This is based on his own. |
---|
0:13:29 | And the neighbours are what? |
---|
0:13:30 | We have k_f neighbours here, k_f neighbours |
---|
0:13:34 | whose strategy is to forward also. |
---|
0:13:37 | The rest of them, k − k_f, |
---|
0:13:39 | choose not to forward. |
---|
0:13:41 | Okay, so this is the utility of all his neighbours, and you have the selection intensity |
---|
0:13:46 | weighting that |
---|
0:13:46 | overall, together with his own |
---|
0:13:49 | baseline. This is the overall fitness of |
---|
0:13:53 | a user whose strategy is to forward. |
---|
0:13:56 | For someone who does not forward |
---|
0:13:57 | we do the same, |
---|
0:13:59 | okay. |
---|
0:14:00 | Now for this |
---|
0:14:01 | network, we have this configuration probability. |
---|
0:14:05 | We have the fitness of the 'center guy'. |
---|
0:14:07 | Now we need to calculate the fitness of all his neighbours. |
---|
0:14:11 | Let's say a neighbour whose strategy is also to forward: |
---|
0:14:14 | what is the |
---|
0:14:16 | fitness? |
---|
0:14:17 | And for a neighbour whose strategy is not to forward, |
---|
0:14:20 | what is the fitness? So we calculate all this. |
---|
0:14:23 | Now |
---|
0:14:24 | with this we want to calculate the |
---|
0:14:28 | probability that someone will change from the forward strategy |
---|
0:14:32 | to the not-forward strategy. |
---|
0:14:34 | And we can calculate all this, okay. The details I will just omit. |
---|
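A sketch of the two fitness computations just described, assuming baseline B = 1 and hypothetical pairwise payoffs u_ff, u_fn (a forwarder meeting a forwarder or a non-forwarder) and u_nf, u_nn (a non-forwarder ditto); the payoff names are illustrative, not the talk's exact notation:

```python
# Centre user with k neighbours, k_f of whom forward.
def fitness_forward(k, k_f, alpha, u_ff, u_fn, B=1.0):
    payoff = k_f * u_ff + (k - k_f) * u_fn
    return (1.0 - alpha) * B + alpha * payoff

def fitness_not_forward(k, k_f, alpha, u_nf, u_nn, B=1.0):
    payoff = k_f * u_nf + (k - k_f) * u_nn
    return (1.0 - alpha) * B + alpha * payoff
```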
0:14:37 | By doing all this |
---|
0:14:39 | we can find the dynamics of information diffusion. |
---|
0:14:43 | The dynamics of the percentage of users who will |
---|
0:14:46 | forward the information are captured by this equation that we can calculate, |
---|
0:14:50 | and |
---|
0:14:51 | the final evolutionary stable state |
---|
0:14:54 | can also be obtained by this. And let me explain this. What does this mean? |
---|
0:14:58 | If the utility of forwarding is much larger than that of the rest, who will |
---|
0:15:03 | not do anything about it, then |
---|
0:15:06 | one hundred percent of the users will all forward and post. |
---|
0:15:08 | Because it's too advantageous to them. |
---|
0:15:12 | If on the other hand |
---|
0:15:14 | not forwarding has the highest utility and forwarding |
---|
0:15:17 | has the lowest utility, then no one is going to do it. |
---|
0:15:20 | For anything in between, there will be a stable percentage of the population that will do that, |
---|
0:15:27 | okay. This looks complicated. |
---|
0:15:29 | Now k is the degree, meaning for each node how many friends you have. |
---|
0:15:34 | This assumes that |
---|
0:15:35 | k is large enough, okay. Let's assume k is sufficiently large. Now you can |
---|
0:15:40 | see that basically it |
---|
0:15:41 | is independent of the degree. |
---|
0:15:44 | And meaning what? Meaning only the utilities determine what percentage of the population |
---|
0:15:50 | will forward or post information or not. |
---|
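A toy simulation of these dynamics, under illustrative assumptions (imitation updates, baseline B = 1, hypothetical payoffs); the stable fraction of forwarders it settles to should be driven by the payoffs rather than by the degree, echoing the claim above:

```python
import random

def simulate(graph, u_ff, u_fn, u_nf, u_nn, alpha=0.3, steps=20000):
    # strategy[n] is True when node n forwards
    strategy = {n: random.random() < 0.5 for n in graph}

    def fit(n):
        k = len(graph[n])
        k_f = sum(strategy[m] for m in graph[n])
        u = (k_f * u_ff + (k - k_f) * u_fn) if strategy[n] \
            else (k_f * u_nf + (k - k_f) * u_nn)
        return (1 - alpha) * 1.0 + alpha * u

    for _ in range(steps):
        node = random.choice(list(graph))                # random node
        best = max([node] + list(graph[node]), key=fit)  # fittest nearby
        strategy[node] = strategy[best]                  # imitation update
    return sum(strategy.values()) / len(strategy)        # forwarding fraction
```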
0:15:53 | Okay, now let's see this |
---|
0:15:56 | experiment. These are real data we acquired from Facebook. In this particular graph we have |
---|
0:16:02 | 4039 users with 88234 edges. |
---|
0:16:05 | And here we have ten |
---|
0:16:07 | subgroups, ten major user subgroups. First this is a social network so you can see |
---|
0:16:12 | that on |
---|
0:16:13 | the log scale it is a straight line here. This is the so-called scale-free network, |
---|
0:16:18 | meaning |
---|
0:16:18 | that a few people have a very high |
---|
0:16:23 | degree of connectivity, meaning they're very powerful, and the rest drop off sharply, okay. So that is |
---|
0:16:29 | something that we all knew before. |
---|
0:16:31 | Okay, now let's look at Evolutionary Stable States. |
---|
0:16:35 | That is, |
---|
0:16:36 | let's look at the left-hand side first. |
---|
0:16:39 | We use four cases as examples. So now we know the utility function, okay. The utility |
---|
0:16:44 | function's |
---|
0:16:45 | four parameters now are our model parameters. We want to model the information diffusion process. |
---|
0:16:50 | So now look at this. The first case: |
---|
0:16:53 | if the utility of forwarding is very high, higher than the others, then what? |
---|
0:16:57 | One hundred percent of users, everybody, will post. |
---|
0:17:01 | If |
---|
0:17:02 | it reduces |
---|
0:17:03 | and is not as good |
---|
0:17:05 | however still good enough, |
---|
0:17:07 | then there is some percentage, in this case sixty-some percent, that will |
---|
0:17:12 | post or forward. |
---|
0:17:13 | If it's getting less |
---|
0:17:15 | then about thirty-some percent |
---|
0:17:17 | will post and forward. |
---|
0:17:18 | If on the other hand |
---|
0:17:20 | it's not good at all, the utility is so bad, |
---|
0:17:24 | then |
---|
0:17:25 | nobody is going to forward it |
---|
0:17:27 | and do anything. And there is a rising time and we can calculate that time |
---|
0:17:32 | and the eventual population, |
---|
0:17:33 | that percentage of people. We said there are ten subgroups and in each group they |
---|
0:17:37 | all behave the same. This |
---|
0:17:38 | is just to show that. |
---|
0:17:40 | Okay, now this is something where we say: okay, if we have the model parameters, |
---|
0:17:45 | what is the behavior? |
---|
0:17:46 | Now, how about when we don't have the model parameters? We don't. |
---|
0:17:49 | We only have the data, and this is the data from MemeTracker. |
---|
0:17:53 | MemeTracker is this |
---|
0:17:55 | news-cycle network. It builds maps |
---|
0:18:01 | of the daily news cycle, okay, |
---|
0:18:03 | by trying to link |
---|
0:18:05 | what others, |
---|
0:18:06 | these news reports, |
---|
0:18:08 | say about these pieces of information. |
---|
0:18:10 | You see, as an example, here is a comment: 'we're not commenting |
---|
0:18:15 | on that story I'm afraid'. Okay. With this, |
---|
0:18:18 | then someone reports: 'we're not commenting on that'. And there are three different links. And |
---|
0:18:23 | then somebody |
---|
0:18:23 | changes it a little bit: 'we're not commenting on that story'. |
---|
0:18:27 | And then there is a link, okay. So |
---|
0:18:30 | this is a very good source |
---|
0:18:32 | for finding how these pieces of information diffuse, okay. So we had this data and this |
---|
0:18:36 | is a huge amount of |
---|
0:18:37 | data. As you can see, we have more than 172 million news articles and |
---|
0:18:42 | sources. |
---|
0:18:43 | Okay, so from here, |
---|
0:18:45 | now what we want to do is |
---|
0:18:47 | we want to learn the utility |
---|
0:18:48 | that is, I have this vast amount of data, |
---|
0:18:51 | can I find a utility? Can I reduce it to a few parameters that describe how information |
---|
0:18:56 | may diffuse? |
---|
0:18:58 | Okay, this is the result. |
---|
0:19:00 | These are the dynamics of four pieces of information. |
---|
0:19:03 | This curve is associated with the words |
---|
0:19:06 | 'Google Wave'. |
---|
0:19:08 | This one is associated with the word |
---|
0:19:10 | 'Tehran'. |
---|
0:19:11 | The grey line is the real data. |
---|
0:19:14 | And the red one is our fitting, our result. |
---|
0:19:17 | You can see it fits very well, and the blue line |
---|
0:19:20 | is using |
---|
0:19:22 | only the machine learning approach. |
---|
0:19:24 | So you can see that |
---|
0:19:26 | what we do fits much better. |
---|
0:19:29 | And most importantly, not only that: |
---|
0:19:32 | we describe this information diffusion phenomenon with, in this case, only two parameters, because |
---|
0:19:38 | we normalized the others, okay; there can be more. With only two parameters |
---|
0:19:43 | we can describe how the information |
---|
0:19:44 | diffuses, when it stops, when it reaches the peak and the |
---|
0:19:48 | way it ends; and if we only have partial information we can predict. |
---|
0:19:51 | I don't have a slide to show you, but we can predict |
---|
0:19:55 | as well. |
---|
0:19:57 | Okay. |
---|
0:19:58 | Now |
---|
0:19:59 | let's do this experiment. We have five |
---|
0:20:02 | groups of sites from the database, okay. We know this is a |
---|
0:20:05 | large database. Each group contains five hundred sites. |
---|
0:20:10 | And we estimate the final stable population, |
---|
0:20:13 | that is, the population that posts or forwards this information. |
---|
0:20:16 | For these five groups we found something very interesting. |
---|
0:20:20 | At the end, each group has a different percentage of |
---|
0:20:23 | users in that particular group |
---|
0:20:25 | that will post or forward information. |
---|
0:20:28 | And |
---|
0:20:31 | this is to show that the black ones, these are from |
---|
0:20:36 | our model, |
---|
0:20:38 | and the red ones are from the real dataset, and you can see that they |
---|
0:20:42 | are very consistent. |
---|
0:20:43 | Okay, now back to this. |
---|
0:20:46 | What does this mean? |
---|
0:20:46 | This is very interesting. We can see that group five |
---|
0:20:49 | behaves cohesively |
---|
0:20:52 | or shares major common interests. |
---|
0:20:54 | That means that of those users in this particular group, |
---|
0:20:59 | more than eighty-one percent will agree with the social group, |
---|
0:21:04 | with their own networking friends. Whenever they post something, |
---|
0:21:07 | eighty-one percent of the time they will also post. |
---|
0:21:11 | So |
---|
0:21:12 | However, in group one, only about |
---|
0:21:16 | nineteen percent will do so. |
---|
0:21:18 | Therefore, if you are an advertiser and you want to |
---|
0:21:21 | place advertisements, which group are you going to forward them to, |
---|
0:21:25 | expecting |
---|
0:21:27 | a high |
---|
0:21:28 | percentage of |
---|
0:21:30 | results, so that many people will see them? |
---|
0:21:32 | That would be this group. |
---|
0:21:34 | So we can |
---|
0:21:35 | expect that our group five in this particular case behaves |
---|
0:21:38 | cohesively and shares major common interests. However, group one |
---|
0:21:43 | shares rather little common interest. |
---|
0:21:45 | This is to mine |
---|
0:21:47 | the coherence of a group and |
---|
0:21:50 | see whether we can do anything about it or not. |
---|
0:21:53 | Okay. |
---|
0:21:54 | Now I'm going to change gears to talk about sequential user behavior. |
---|
0:21:59 | Here I'm going to use some |
---|
0:22:01 | well-known websites that we collected data from to |
---|
0:22:07 | illustrate the problem. You all know Groupon. I think some of you may also use |
---|
0:22:10 | it and take advantage of it, |
---|
0:22:11 | right? On Groupon we sometimes find out there is a very good deal for a |
---|
0:22:15 | restaurant |
---|
0:22:16 | that you always go to, and we buy from there. |
---|
0:22:19 | And |
---|
0:22:20 | Yelp. |
---|
0:22:22 | We also use Yelp. |
---|
0:22:23 | When I am here I use the Singapore app to determine which restaurant I want |
---|
0:22:29 | to go to. |
---|
0:22:31 | Once we pull them together it gets very interesting. |
---|
0:22:33 | We found that |
---|
0:22:35 | this is a Yelp rating |
---|
0:22:36 | for some very good restaurant. |
---|
0:22:39 | We found that |
---|
0:22:41 | Yelp's star rating declines; okay, you see this is declining after a successful Groupon deal. |
---|
0:22:49 | The rating is very high, and |
---|
0:22:52 | many people buy this Groupon |
---|
0:22:54 | and then in the next few months you can see that |
---|
0:22:57 | its rating is declining. |
---|
0:23:00 | Why? |
---|
0:23:03 | This is a negative network externality |
---|
0:23:06 | at play here, the one I mentioned. I'm going to explain that. |
---|
0:23:09 | So this degrading quality may be due to an overwhelming number of customers, because it's too |
---|
0:23:15 | successful. Many people buy, and then in the following days many people will show up, |
---|
0:23:20 | and they only have three waitresses. And then you have one hundred people who show |
---|
0:23:24 | up. Quality would not be good. |
---|
0:23:26 | The kitchen is limited and may not keep up. |
---|
0:23:29 | That could be the reason. |
---|
0:23:31 | Okay, so |
---|
0:23:33 | what is the phenomenon here? |
---|
0:23:35 | Learning |
---|
0:23:36 | to get the best deal. Everybody wants to get the best deal. |
---|
0:23:39 | And then they make a decision to take advantage of that, and then their decisions |
---|
0:23:44 | affect each other. |
---|
0:23:45 | They reduce each other's utility in this fashion. |
---|
0:23:49 | That's what we call |
---|
0:23:50 | negative network externality. |
---|
0:23:53 | Your decision and my decision |
---|
0:23:55 | eventually affect each other, |
---|
0:23:58 | okay. |
---|
0:23:59 | So how can we model this problem? |
---|
0:24:01 | I'd like to go to a problem in machine learning called the Chinese restaurant problem, or the |
---|
0:24:07 | Chinese |
---|
0:24:07 | restaurant process. Some of you are using it also. |
---|
0:24:10 | I don't need to explain; this is Singapore. If you don't know |
---|
0:24:13 | what the Chinese restaurant problem is, just choose a Chinese restaurant and go, okay. |
---|
0:24:17 | So |
---|
0:24:17 | Chinese restaurant problem |
---|
0:24:19 | we have a limited number of round tables with unknown sizes, |
---|
0:24:22 | and these round tables of unknown size could be cloud services or they could |
---|
0:24:25 | be deals, |
---|
0:24:26 | okay. |
---|
0:24:27 | And customers choose a table to sit at |
---|
0:24:29 | or open a new table. |
---|
0:24:32 | For the Chinese restaurant process we have an infinite number of tables, |
---|
0:24:36 | with an infinite number of seats. |
---|
0:24:38 | A customer enters and, with predefined probability, |
---|
0:24:43 | either chooses a table to sit at or opens a new table. |
---|
0:24:46 | This is non-strategic because it's parametric, okay. |
---|
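A minimal sketch of that non-strategic process, assuming the standard seating rule with a concentration parameter gamma playing the role of the predefined probability:

```python
import random

def chinese_restaurant_process(num_customers: int, gamma: float = 1.0):
    """Each arriving customer joins table i with probability proportional to
    its current occupancy, or opens a new table with weight gamma."""
    tables = []                          # tables[i] = customers at table i
    for _ in range(num_customers):
        weights = tables + [gamma]       # existing tables, plus a new one
        choice = random.choices(range(len(weights)), weights=weights)[0]
        if choice == len(tables):
            tables.append(1)             # open a new table
        else:
            tables[choice] += 1          # sit at an existing table
    return tables

print(chinese_restaurant_process(20))
```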
0:24:50 | So now |
---|
0:24:51 | we introduce Chinese restaurant game. |
---|
0:24:53 | We want to introduce this strategic behaviour, this decision making here. Why? Because |
---|
0:24:58 | eventually we want to understand the negative |
---|
0:25:02 | externality effect, to model it and to understand that. |
---|
0:25:06 | Okay, so this Chinese restaurant game. |
---|
0:25:09 | We have tables |
---|
0:25:11 | with unknown sizes |
---|
0:25:12 | R_x(θ), |
---|
0:25:13 | where θ is the system state. The system state is the condition of the restaurant: how much money |
---|
0:25:16 | they have, how much |
---|
0:25:17 | budget they have, and the environment, okay. |
---|
0:25:19 | A customer has a signal S about the system state, from |
---|
0:25:22 | advertisements, from friends, from wherever they have been before. |
---|
0:25:28 | Which table should they sit at |
---|
0:25:30 | to have maximum utility? |
---|
0:25:32 | You know that if you go to a Chinese restaurant, some days, very unfortunately, many |
---|
0:25:36 | people sit |
---|
0:25:36 | with you and eat, is that right? |
---|
0:25:39 | The utility is very bad. It's not comfortable at all. |
---|
0:25:42 | And this is a negative network externality. |
---|
0:25:44 | More customers at one table means less space for each user, and therefore |
---|
0:25:49 | the utility is less, |
---|
0:25:50 | for past and future customers. |
---|
0:25:52 | What does that mean, the future? |
---|
0:25:54 | That means |
---|
0:25:55 | when you come in, |
---|
0:25:57 | based on the conditions you sit at a table, and that may not be the best |
---|
0:26:01 | decision. |
---|
0:26:03 | In ten minutes |
---|
0:26:04 | ten people may sit with you, okay. So you need to predict the future and |
---|
0:26:08 | that is the problem and the |
---|
0:26:09 | future affects your decision right now. |
---|
0:26:11 | And that makes it, in fact, difficult. |
---|
0:26:15 | Okay, so this is a sequential decision making problem, customers make decisions sequentially |
---|
0:26:20 | and the observation information we have is this: every customer who comes, |
---|
0:26:24 | every player, can only see, |
---|
0:26:27 | at that moment, how many people sit at each table, |
---|
0:26:30 | and then also the signals, okay, from those who were seated before him; |
---|
0:26:34 | that's all he can get. |
---|
0:26:35 | So when the first customer arrives |
---|
0:26:38 | he picks a table, |
---|
0:26:39 | and then the second one |
---|
0:26:40 | picks another table, and so on, and everybody picks one. A sequential decision making process. |
---|
0:26:49 | Can |
---|
0:26:49 | user number three make the best decision at that moment? |
---|
0:26:53 | No. |
---|
0:26:54 | If he cannot predict the future and see the future his decision at that moment |
---|
0:26:58 | may not be the |
---|
0:26:59 | best, because people who have not yet come are going to affect him, |
---|
0:27:03 | okay. |
---|
0:27:04 | So now let's assume there is a perfect signal, meaning that |
---|
0:27:07 | everybody knows all the signals, all the decisions other people may want to make given |
---|
0:27:11 | the |
---|
0:27:11 | conditions that they have. |
---|
0:27:15 | I will not get into the details, so I'll just explain: |
---|
0:27:17 | there is an equilibrium grouping, meaning the best utility is |
---|
0:27:21 | achieved when you choose |
---|
0:27:24 | given these signals from everybody; it will be the best. |
---|
0:27:29 | When anyone changes to another table, |
---|
0:27:32 | their utility will only reduce. |
---|
0:27:34 | So that would be the best strategy |
---|
0:27:36 | when the signal is perfect. |
---|
0:27:38 | Let me explain this. So what will be the strategy? |
---|
0:27:41 | When everything, |
---|
0:27:43 | all the conditions, are known, if I come in I will know |
---|
0:27:46 | ten minutes or thirty minutes later what people will come here; |
---|
0:27:48 | I have the perfect information. You know what? |
---|
0:27:50 | If I come in |
---|
0:27:51 | I'm going to know which table has the best utility and I'm going to choose |
---|
0:27:55 | it. |
---|
0:27:55 | The second one will choose it also, until the table is filled. And then they |
---|
0:27:58 | will go to the table with the |
---|
0:27:59 | second best utility and do so, because the signal is perfect, right? |
---|
0:28:03 | So |
---|
0:28:05 | customers who come early |
---|
0:28:07 | have an |
---|
0:28:08 | advantage, |
---|
0:28:10 | but that is with perfect signals. |
---|
0:28:11 | In real life |
---|
0:28:13 | it's not that easy. |
---|
0:28:14 | The signal is imperfect. We cannot have the |
---|
0:28:17 | perfect signal and therefore we have to learn |
---|
0:28:20 | from the noisy signal to form our belief and this is the learning process |
---|
0:28:25 | and we can use Bayesian learning or some other method to construct the belief that |
---|
0:28:29 | we have, |
---|
0:28:31 | okay. |
---|
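A minimal Bayesian belief update from noisy signals, assuming a binary system state (say, the restaurant is good or bad) and a known signal quality q = P(signal matches the true state); all numbers are illustrative:

```python
def update_belief(prior_good: float, signal_good: bool, q: float) -> float:
    like_good = q if signal_good else 1 - q   # P(signal | state = good)
    like_bad = 1 - q if signal_good else q    # P(signal | state = bad)
    evidence = prior_good * like_good + (1 - prior_good) * like_bad
    return prior_good * like_good / evidence  # posterior P(state = good)

belief = 0.5                                  # uninformed prior
for s in [True, True, False, True]:           # noisy signals observed so far
    belief = update_belief(belief, s, q=0.7)
print(belief)
```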
0:28:32 | So |
---|
0:28:35 | now this. |
---|
0:28:36 | What is the best response? The best response is: |
---|
0:28:39 | let's choose the best utility. |
---|
0:28:42 | But in order to choose the best utility we need to know the final distribution |
---|
0:28:46 | of the users and like |
---|
0:28:48 | I said we need to know who will come in and where they will sit. |
---|
0:28:52 | So this utility function depends on subsequent users' decisions, |
---|
0:28:56 | and this is a negative network externality. |
---|
0:29:00 | And we need to predict the future decisions to find the best response; |
---|
0:29:03 | therefore we have to use backward induction to do so, so that everybody's |
---|
0:29:08 | optimal |
---|
0:29:09 | decision can be predicted, and then we do backward induction from there. |
---|
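A toy backward-induction sketch for the sequential seating game, under illustrative assumptions: N customers arrive in order, each customer's utility is value[table] divided by that table's final occupancy, and everyone knows N (the perfect-signal case); this is not the talk's exact formulation:

```python
from functools import lru_cache

VALUE = (10.0, 8.0, 6.0)   # hypothetical intrinsic table values

@lru_cache(maxsize=None)
def play(occupancy: tuple, remaining: int) -> tuple:
    """Final occupancies when `remaining` customers, each playing optimally,
    are still to arrive. Each customer predicts the future (the recursion)
    and picks the table maximizing his own final utility."""
    if remaining == 0:
        return occupancy
    best_table, best_utility = None, float("-inf")
    for t in range(len(VALUE)):
        occ = list(occupancy)
        occ[t] += 1
        final = play(tuple(occ), remaining - 1)   # induct over later arrivals
        utility = VALUE[t] / final[t]             # crowding is the externality
        if utility > best_utility:
            best_table, best_utility = t, utility
    occ = list(occupancy)
    occ[best_table] += 1
    return play(tuple(occ), remaining - 1)

print(play((0, 0, 0), 5))   # equilibrium grouping of 5 customers
```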
0:29:13 | Okay. Now this is the summary of this Chinese restaurant game, and here is an |
---|
0:29:18 | example. |
---|
0:29:19 | Remember this? |
---|
0:29:20 | I said that |
---|
0:29:21 | if there is a good Groupon deal |
---|
0:29:25 | the utility will go down like that, okay. The red one is the real data. |
---|
0:29:29 | And now let's use a linear model to model |
---|
0:29:32 | the utility, okay. |
---|
0:29:33 | So |
---|
0:29:34 | this is the utility that we model. We learnt from the real data that this is |
---|
0:29:37 | the utility. |
---|
0:29:38 | And based on this utility, |
---|
0:29:40 | this utility, we can obtain all these parameters, and then |
---|
0:29:44 | we can use |
---|
0:29:46 | what we have just derived. |
---|
0:29:47 | We consider all these negative network externalities. We can predict the future to find the optimal |
---|
0:29:52 | result. |
---|
0:29:53 | As you can see the blue one is learning with negative network externality that we |
---|
0:29:58 | propose and then |
---|
0:29:59 | the red one is without. Meaning only at that moment. |
---|
0:30:02 | We can increase the rating. |
---|
0:30:04 | We can indeed increase the rating by doing so. |
---|
0:30:08 | So if we consider this |
---|
0:30:11 | negative network externality then the strategy will perform much better. |
---|
0:30:17 | In fact we can also use this to devise strategy. |
---|
0:30:20 | This is a New restaurant's strategy, okay. Now we have a new restaurant. |
---|
0:30:24 | This is a low quality restaurant and a high quality restaurant. |
---|
0:30:29 | Now |
---|
0:30:30 | what is the number of customers? |
---|
0:30:34 | Look at this. For a low quality restaurant |
---|
0:30:37 | in order to increase the number of customers who arrive |
---|
0:30:41 | the deal price has to be lower. |
---|
0:30:45 | If your quality is not good, you make it cheaper. |
---|
0:30:49 | For a |
---|
0:30:50 | high quality restaurant, even if the deal price is a little higher, you can still make |
---|
0:30:55 | a good profit. |
---|
0:30:57 | Then there is the signal quality: your advertisement and everything else that affects what is known about you. |
---|
0:31:03 | If you are a high-quality restaurant you want the signal quality |
---|
0:31:06 | to be very good, so people know this is a good restaurant. |
---|
0:31:10 | If you have a low quality restaurant, don't let people know too much about you, okay. |
---|
0:31:14 | That would be better. |
---|
0:31:16 | And next we get to revenue. Now, what is the revenue? |
---|
0:31:20 | High quality and low quality restaurants can all achieve peak revenue. |
---|
0:31:26 | So |
---|
0:31:27 | for a high quality restaurant to achieve peak revenue |
---|
0:31:30 | the signal quality has to be good; |
---|
0:31:32 | then the revenue is higher. Let people know, |
---|
0:31:34 | and then |
---|
0:31:36 | through the word of mouth people will come. |
---|
0:31:39 | However, for a low quality restaurant it's not good. |
---|
0:31:42 | Don't let people know: |
---|
0:31:44 | the better the signal quality, the lower the revenue you are going to have, you see. |
---|
0:31:49 | Okay, so I think this all is common sense, however it's highly nonlinear |
---|
0:31:53 | and we had to model that and this is indeed a very highly nonlinear problem |
---|
0:31:57 | and I don't have the |
---|
0:31:59 | slide to show you; in our paper you can take a look. So a high |
---|
0:32:03 | quality restaurant should try |
---|
0:32:04 | to increase the signal quality, and a low quality restaurant should hide its |
---|
0:32:09 | quality information and use a low deal price to attract customers. |
---|
0:32:15 | Indeed we developed a family of these. We had a Chinese restaurant game and we |
---|
0:32:18 | also had Dynamic Chinese |
---|
0:32:19 | restaurant game for people who come and go. |
---|
0:32:21 | And that is for example just like this Wi-Fi, okay. Many people come here and |
---|
0:32:25 | many people may leave and |
---|
0:32:27 | they decide which network they want to choose. And also we had this Indian Buffet |
---|
0:32:30 | Game. |
---|
0:32:31 | And that is |
---|
0:32:32 | interesting. We see all this in the US, okay: |
---|
0:32:36 | Chinese restaurants have all these round tables, and you can |
---|
0:32:39 | see Indian restaurants also have this buffet thing. |
---|
0:32:42 | Okay so |
---|
0:32:43 | we have multiple |
---|
0:32:45 | choices of what we can do, |
---|
0:32:47 | and this can be applied to many different scenarios. |
---|
0:32:50 | And we've applied this: for example, we have seen this on Yelp, okay, and |
---|
0:32:54 | on Amazon. |
---|
0:32:55 | And this is for these reviews, okay. It's a sequential decision. And question-and- |
---|
0:33:00 | answer sites. You can see |
---|
0:33:01 | all these question-and-answer sites; we also had a problem on this. |
---|
0:33:05 | Also many other sites: as long as you have this sequential decision process coming |
---|
0:33:10 | in, |
---|
0:33:10 | we can apply this to them, because social computing systems, in |
---|
0:33:16 | this case, are all structured in the same fashion: |
---|
0:33:20 | users arrive sequentially |
---|
0:33:21 | and make decisions on |
---|
0:33:23 | whether to do something or not, whether to forward something or not. |
---|
0:33:26 | And also there are externalities among users. |
---|
0:33:29 | Your decision, your actions will affect each other and also there is some uncertainty. So |
---|
0:33:34 | in this |
---|
0:33:35 | sense Chinese restaurant game will be a natural choice in modeling and analyzing these users' |
---|
0:33:40 | behavior, so |
---|
0:33:41 | that we can come up with the best |
---|
0:33:43 | model, as we have seen, and also devise some incentive mechanisms, |
---|
0:33:47 | okay. To attain some results that you want. |
---|
0:33:50 | And I believe that in the speech and language community |
---|
0:33:53 | you also have something that you may want to do. |
---|
0:33:56 | Speaking of mechanisms, |
---|
0:33:58 | we want to see this: I believe that you use a lot of |
---|
0:34:01 | labeling also in this community. |
---|
0:34:03 | Now you have seen all this; |
---|
0:34:05 | that's all from the users' point of view, so you may ask: can a |
---|
0:34:08 | system designer design something based on this? |
---|
0:34:10 | Yes. |
---|
0:34:11 | So we want to see if we can design some mechanism to |
---|
0:34:14 | attain something that we really want. |
---|
0:34:17 | In this case what I want to show you |
---|
0:34:19 | is how we can collect |
---|
0:34:21 | good |
---|
0:34:22 | data |
---|
0:34:23 | okay. |
---|
0:34:25 | Large scale labeled datasets |
---|
0:34:28 | are very important in many applications. I think you all know better than me, |
---|
0:34:31 | because more data leads to better accuracy. |
---|
0:34:34 | However, large scale in-house annotation is very expensive. |
---|
0:34:38 | You need to have deep pockets to do so, |
---|
0:34:41 | and often it becomes a bottleneck. |
---|
0:34:43 | And recently you have seen that microtask crowdsourcing is very popular. |
---|
0:34:49 | It claims to be a very promising way, because it can handle large volume, short |
---|
0:34:53 | time and low cost. |
---|
0:34:54 | All the benefits |
---|
0:34:56 | And the question is: is that so? |
---|
0:34:58 | We have many websites that can do so, |
---|
0:35:01 | okay. |
---|
0:35:03 | There's a problem |
---|
0:35:04 | you see all the |
---|
0:35:05 | positive sides, but there's something they didn't tell you: |
---|
0:35:08 | that |
---|
0:35:09 | due to limited budget, okay, you and I don't have deep pockets, okay, |
---|
0:35:13 | the collected data often has low quality. |
---|
0:35:18 | Why |
---|
0:35:19 | low quality? I will explain that. It's the incentive thing, okay. |
---|
0:35:22 | Now |
---|
0:35:24 | in machine learning and signal processing, okay, |
---|
0:35:27 | sometimes we say, okay, |
---|
0:35:29 | let's deal with it, let's cope with this: |
---|
0:35:31 | let's filter out low quality data, or cope with |
---|
0:35:34 | the noise with a modified algorithm that is more robust. |
---|
0:35:38 | But that is an after-the-fact fix, |
---|
0:35:40 | okay. This is our typical behavior up to now. |
---|
0:35:44 | Now, with this decision learning solution, |
---|
0:35:47 | maybe we can ask |
---|
0:35:49 | can we do a better job before that? |
---|
0:35:51 | Can we incentivize |
---|
0:35:54 | with a mechanism to ensure high quality data in the first place? |
---|
0:36:00 | Okay |
---|
0:36:02 | You want me to give you more money? No, I don't have money to give you. |
---|
0:36:05 | Let's see if we can design something |
---|
0:36:07 | that can make you happy |
---|
0:36:09 | without an |
---|
0:36:09 | increase in your budget, maybe even reducing your budget, okay. That's what this is, |
---|
0:36:13 | okay. |
---|
0:36:15 | This |
---|
0:36:16 | crowdsourcing |
---|
0:36:18 | has such characteristic |
---|
0:36:20 | repetitive and tedious small tasks: yes or no, correct or not correct. And you do |
---|
0:36:25 | that, |
---|
0:36:25 | and how much do they earn? |
---|
0:36:27 | USD 0.04 for one task. |
---|
0:36:29 | Many, |
---|
0:36:31 | many of them are housewives and they have nothing else to do; |
---|
0:36:34 | when the baby is sleeping they can make a little bit more money, so they can |
---|
0:36:38 | help pay some of the bills. |
---|
0:36:41 | So a worker receives a small amount of reward and there's no competition among workers. |
---|
0:36:45 | So what's the problem? |
---|
0:36:47 | The problem is that this may lack proper incentive. |
---|
0:36:51 | It is profitable for a worker to submit poor solutions. |
---|
0:36:55 | Yeah? |
---|
0:36:56 | Because |
---|
0:36:58 | the number of solutions is just too high, nobody is going to check them. |
---|
0:37:02 | It'd be too expensive for you to check on it, right? |
---|
0:37:04 | Think about it. That's why it's expensive. |
---|
0:37:07 | So |
---|
0:37:08 | however, you do want to check all this, just like I said. |
---|
0:37:11 | Providing incentives is challenging. Why? Requesters typically have low budgets for each task, |
---|
0:37:18 | and to offer better incentives you have to pay more, so that people will do a |
---|
0:37:22 | better job; you need more |
---|
0:37:23 | money. And verifying the solutions is also expensive. All these are |
---|
0:37:28 | problems. |
---|
0:37:29 | So what incentive mechanism should |
---|
0:37:31 | requesters employ to collect high quality solutions in a cost-effective way? |
---|
0:37:38 | So here we propose a mechanism |
---|
0:37:40 | that sounds interesting; we also have the analysis showing it can work, |
---|
0:37:44 | and we also did |
---|
0:37:45 | experiments, so what is that? I'll show you. |
---|
0:37:47 | Up to now there have been two very basic reward mechanisms |
---|
0:37:51 | with which we can evaluate the solutions. |
---|
0:37:54 | One is called the consensus mechanism. What does consensus mean? Okay, |
---|
0:37:58 | now I have all these solutions people submitted. |
---|
0:38:00 | I do a majority vote. I don't know the right answer, but |
---|
0:38:04 | I do a majority check. After the majority check, I take a solution as correct |
---|
0:38:10 | if it fits the majority. |
---|
0:38:12 | Or I can use the reward accuracy mechanism, meaning I can |
---|
0:38:17 | verify whether |
---|
0:38:19 | a solution is correct or not with some probability. |
---|
0:38:21 | Verifying everything is too expensive. I want to verify all of them, but I don't |
---|
0:38:25 | have time. |
---|
0:38:25 | If I had time I'd do it myself. |
---|
0:38:27 | Okay, now I have these people, so I have some sampling |
---|
0:38:32 | probability |
---|
0:38:32 | with which I'm going to take a sample |
---|
0:38:34 | and then check the solutions. |
---|
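Sketches of the two baseline mechanisms, with illustrative reward and sampling parameters; `truth_of` stands in for whatever costly verification the requester uses:

```python
import random
from collections import Counter

def consensus_reward(solutions, reward=1.0):
    """Pay each worker whose answer matches the majority vote."""
    majority, _ = Counter(solutions.values()).most_common(1)[0]
    return {w: reward if a == majority else 0.0 for w, a in solutions.items()}

def accuracy_reward(solutions, truth_of, p_sample=0.3, reward=1.0):
    """Verify each solution only with probability p_sample; unchecked
    answers are paid by default, which is what invites low effort."""
    pay = {}
    for w, a in solutions.items():
        if random.random() < p_sample:
            pay[w] = reward if a == truth_of(w) else 0.0  # costly check
        else:
            pay[w] = reward                               # not checked
    return pay
```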
0:38:37 | Now you can see these two have a cost, |
---|
0:38:41 | and this relates to a parameter that is not under the control of the requester. |
---|
0:38:46 | So there's a problem here. |
---|
0:38:48 | This cost, the one I mentioned, is a fundamental problem. |
---|
0:38:51 | For the traditional methods it's a problem. |
---|
0:38:54 | So |
---|
0:38:55 | now |
---|
0:38:56 | before we do so |
---|
0:38:58 | let's |
---|
0:38:59 | come up with this incentive mechanism first; it's called Mt, okay. |
---|
0:39:03 | This is called the Incentive Mechanism via Training. The idea is this: |
---|
0:39:06 | to employ quality-aware training to reduce mechanism cost. |
---|
0:39:11 | What does that mean? |
---|
0:39:13 | We will assign |
---|
0:39:15 | enforced unpaid training to workers when they are |
---|
0:39:18 | performing poorly. |
---|
0:39:20 | If they perform poorly, |
---|
0:39:22 | then they have to go through some training time, and in this training time they're |
---|
0:39:25 | not going to make any money, |
---|
0:39:27 | okay, and they have to gain their credential back before they can work again. |
---|
0:39:32 | So this is more like a punishment if they don't do well. |
---|
0:39:35 | So |
---|
0:39:36 | okay, now we have two states. Everybody has two states. |
---|
0:39:39 | What are they? |
---|
0:39:40 | Okay, this is the normal one: |
---|
0:39:42 | they produce results, we give them money, so they are happy. |
---|
0:39:46 | However |
---|
0:39:47 | if they don't do well |
---|
0:39:49 | they are going to go to the training state. |
---|
0:39:51 | And in this training state workers have to do a set of training tasks. |
---|
0:39:56 | Unpaid. |
---|
0:39:58 | Okay |
---|
0:39:58 | to gain qualification |
---|
0:40:00 | for the working state. |
---|
0:40:03 | Okay, this can be just a policy |
---|
0:40:05 | to deter bad work; |
---|
0:40:06 | it doesn't necessarily have to be invoked. You will see that. |
---|
0:40:09 | So everybody now |
---|
0:40:10 | has two states. |
---|
0:40:11 | Now you can see the two states. |
---|
0:40:13 | The key is the worker's action in the working state. |
---|
0:40:16 | Everybody's action in the working state |
---|
0:40:19 | will determine the probability that this worker continues in the working state |
---|
0:40:24 | or |
---|
0:40:24 | goes to the training state, |
---|
0:40:26 | and how long it takes to get from the training state back to the working state. |
---|
0:40:31 | A worker will prefer to be here, to make money. |
---|
0:40:34 | However, if the job is not good enough he is going to go here. Nobody |
---|
0:40:38 | wants to come over here. |
---|
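A sketch of this two-state mechanism, assuming a worker is checked with probability p and demoted to an unpaid training state of N tasks when a checked solution is wrong; all parameters are illustrative:

```python
import random

class Worker:
    def __init__(self, n_training: int):
        self.state = "working"
        self.training_left = 0
        self.n_training = n_training     # length of the unpaid training stint

    def submit(self, correct: bool, p_check: float, reward: float) -> float:
        if self.state == "training":
            self.training_left -= 1      # unpaid training task
            if self.training_left == 0:
                self.state = "working"   # credential regained
            return 0.0
        if random.random() < p_check and not correct:
            self.state = "training"      # demotion is the deterrent
            self.training_left = self.n_training
            return 0.0
        return reward                    # paid for an accepted solution
```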
0:40:41 | Okay, so worker's action at this moment |
---|
0:40:44 | may not only affect his immediate utility, but also his future utility or both |
---|
0:40:51 | together. |
---|
0:40:52 | There's a notion of this long-term expected utility here |
---|
0:40:56 | and basically, if you formulate this problem, |
---|
0:40:58 | it is |
---|
0:41:00 | like a |
---|
0:41:01 | Markov Decision Process (MDP). |
---|
0:41:02 | But this Markov Decision Process is not easy, because |
---|
0:41:07 | the |
---|
0:41:08 | MDP faced by each worker also depends on others' actions. |
---|
0:41:14 | Our actions, the quality of our work, depend on how others are doing |
---|
0:41:19 | their quality work, so it is indeed a |
---|
0:41:21 | Game-Theoretic Markov Decision Process. That means what you are doing affects what I am doing. |
---|
0:41:27 | What I'm doing will affect what you're doing. |
---|
0:41:29 | Okay |
---|
0:41:30 | so from here |
---|
0:41:31 | I will just summarize the results. Basically we show that |
---|
0:41:34 | with this Game-Theoretic MDP |
---|
0:41:37 | there is a guarantee of Nash Equilibrium |
---|
0:41:39 | as long as the number of training tasks N is large enough. |
---|
0:41:44 | If the number of training tasks, |
---|
0:41:45 | I mean the training problems to be done, N, is too small, |
---|
0:41:48 | nobody will worry, okay. I go to the training state and very soon I can get |
---|
0:41:52 | out just like somebody commits a crime, |
---|
0:41:54 | you put them in the jail and one day |
---|
0:41:56 | they come out. You know what? Many people walk out and in everyday. |
---|
0:41:59 | If you put them in for ten years, very few will; they will think |
---|
0:42:03 | twice before doing something again. |
---|
0:42:05 | Okay, now |
---|
0:42:06 | we also show that |
---|
0:42:08 | if we |
---|
0:42:08 | follow a certain design procedure |
---|
0:42:11 | we can achieve arbitrarily low mechanism cost. The cost can be |
---|
0:42:15 | very small, almost to zero, if we design it correctly. |
---|
0:42:21 | Okay, now let's see the experiment setup. |
---|
0:42:25 | Task, a very simple task |
---|
0:42:27 | this is to calculate the sum of two randomly generated double digit numbers. |
---|
0:42:33 | A second-year, |
---|
0:42:34 | second-grade student can do it, okay. And workers are given ten points for each |
---|
0:42:38 | accepted solution. |
---|
0:42:39 | And the goal is for |
---|
0:42:40 | these participants to maximize or to accumulate these points. This is not real money. We |
---|
0:42:45 | work |
---|
0:42:45 | with points, we are doing an experiment. |
---|
0:42:48 | And the tasks assigned to these participants come in three sets, and each one is |
---|
0:42:53 | limited to three minutes, okay. |
---|
0:42:54 | Only within this limit can they do so. |
---|
0:42:58 | And so we had these three sets. |
---|
0:43:00 | The first set uses the reward accuracy mechanism with sampling probability 1; that means |
---|
0:43:05 | in this |
---|
0:43:06 | in this set |
---|
0:43:08 | we tell all the participants |
---|
0:43:10 | that we are going to check your every solutions. |
---|
0:43:13 | And the set number two |
---|
0:43:15 | uses sampling probability 0.3. |
---|
0:43:17 | That is, we will only have a thirty percent probability, 0.3, of checking your solutions. |
---|
0:43:23 | That is set two. Set three uses a |
---|
0:43:26 | sampling probability of 0.3 only; however, we also tell them we have this incentive |
---|
0:43:31 | mechanism. |
---|
0:43:32 | That is if you don't do well enough |
---|
0:43:34 | you are going to do some training problems before you can come back to do |
---|
0:43:38 | that. |
---|
0:43:39 | Okay, so what are the results that we get? |
---|
0:43:42 | If we tell them that we are going to have this sampling probability 1, you |
---|
0:43:46 | can tell that |
---|
0:43:47 | the accuracy; this is the histogram of the number of workers and their accuracy, and the accuracy is |
---|
0:43:52 | quite good. |
---|
0:43:55 | With 0.3 probability you can see that most of them don't care |
---|
0:43:59 | okay, because only |
---|
0:44:02 | 0.3 probability will be checked. |
---|
0:44:04 | And this is the sampling probability. |
---|
0:44:08 | You can say: how about this guy? You know, there's always a |
---|
0:44:11 | nerd, okay. And they always do a good job. |
---|
0:44:15 | And now with this |
---|
0:44:17 | sampling probability 0.3, we tell them: hey, we have this mechanism; if you get caught, you |
---|
0:44:21 | are going to be put |
---|
0:44:22 | in there. Now you can see the result is basically as good as the first result. |
---|
0:44:29 | And you say: okay, who are these students? Can they do this easily? |
---|
0:44:33 | They are all |
---|
0:44:34 | engineering students, graduate students, okay, with forty-one of them participating. |
---|
0:44:40 | So this shows that, |
---|
0:44:41 | in fact, with such a mechanism we can indeed collect good data from the beginning. We don't |
---|
0:44:46 | have to |
---|
0:44:47 | accept that the data is not good enough and just take it from there. |
---|
0:44:52 | Let's |
---|
0:44:53 | do something, you know; |
---|
0:44:56 | we do better if we can collect much better data. |
---|
0:45:00 | Okay, the conclusion and final thought here is that we have seen three social media |
---|
0:45:06 | examples |
---|
0:45:08 | to illustrate the concept of decision learning. |
---|
0:45:12 | In this information diffusion problem |
---|
0:45:15 | we |
---|
0:45:16 | try to learn |
---|
0:45:18 | users' utility |
---|
0:45:20 | functions |
---|
0:45:21 | for understanding and modeling strategic decision making. |
---|
0:45:25 | And for the Groupon and Yelp example that we've seen |
---|
0:45:28 | we try to learn from each other's interactions. |
---|
0:45:32 | We have this negative network externality |
---|
0:45:35 | for better decision making. |
---|
0:45:37 | And for this crowdsourcing we design a mechanism to |
---|
0:45:41 | enforce users' strategic behaviour to attain better quality data for better learning. |
---|
0:45:47 | So |
---|
0:45:49 | by decision learning |
---|
0:45:51 | we mean that it is the learning with strategic decision making. |
---|
0:45:55 | And these two are something that has to be considered together. |
---|
0:45:59 | And we can analyze users' optimal behaviour from the users' perspective, |
---|
0:46:03 | and we can also design optimal mechanisms from the |
---|
0:46:05 | system designers' perspective. |
---|
0:46:08 | Now, for the |
---|
0:46:10 | coming big data tsunami, this is something that we envision. |
---|
0:46:14 | We are going to have big data here, |
---|
0:46:17 | and this big data |
---|
0:46:18 | is not |
---|
0:46:20 | a static data overlay; |
---|
0:46:22 | rather, |
---|
0:46:23 | Bob and Alice are going to |
---|
0:46:26 | learn |
---|
0:46:27 | from this data with whatever model here; I say Markov decision process, but it |
---|
0:46:30 | can be anything, to learn from here. |
---|
0:46:32 | Once you learn something from here, you are going to make decisions. |
---|
0:46:35 | When you make decisions you are going to take actions, and most of these |
---|
0:46:38 | are sequential actions. We are at |
---|
0:46:40 | different places and we are different people. |
---|
0:46:42 | And then you have the outcome and this outcome will come back to affect the |
---|
0:46:46 | data. |
---|
0:46:47 | Like I said the data is not static data |
---|
0:46:49 | and therefore |
---|
0:46:50 | Bob and Alice |
---|
0:46:52 | one can be in Singapore and one can be in the US; their actions and their |
---|
0:46:56 | decisions will eventually affect each other, |
---|
0:46:57 | because the data have been changed. |
---|
0:47:00 | And this is something that |
---|
0:47:02 | we believe is something that is |
---|
0:47:04 | both |
---|
0:47:05 | challenging and a potential future direction for research. And I believe that |
---|
0:47:12 | this is something that also can be |
---|
0:47:15 | very promising in the speech and language community. |
---|
0:47:20 | Okay, that's it. End of story. Thank you for your attention. |
---|
0:47:32 | So I am glad; I think we all made a good decision to come to |
---|
0:47:36 | professor Ray Liu's talk. |
---|
0:47:38 | We have time for a few questions. |
---|
0:47:47 | I think this was a wonderful talk. |
---|
0:47:49 | I've noticed that in your Chinese restaurant decision process you had this backward induction |
---|
0:47:54 | step, and |
---|
0:47:55 | you showed your backward induction for just one time step, from T+1 to T. |
---|
0:48:00 | The question is: is it computationally feasible to do backward induction from the end of |
---|
0:48:05 | time back to a different time? |
---|
0:48:07 | Oh yes. That's correct. |
---|
0:48:49 | I think it's an interesting issue, right? I mean, it's easy, |
---|
0:48:52 | okay, to make one cent per task; if the tasks are more difficult they make ten |
---|
0:48:55 | cents per task, and they have to think twice, |
---|
0:48:57 | right? Which one can make a better profit? |
---|
0:49:01 | So I think that, |
---|
0:49:03 | what I showed is just one dimension, meaning I just used one example. In that |
---|
0:49:08 | particular case we |
---|
0:49:08 | designed the mechanism in that way. |
---|
0:49:11 | In a different scenario |
---|
0:49:12 | you can design a different mechanism in a different way. I think you should be all |
---|
0:49:17 | fine. There's |
---|
0:49:17 | no single solution here. This is just a concept, meaning that we can do so. |
---|
0:49:22 | And I used an example to illustrate that. |
---|
0:49:27 | Do I have to push? Yes. |
---|
0:49:29 | I am a little intrigued by the experiments that you have done. |
---|
0:49:32 | I'm just wondering that if we look at social networking |
---|
0:49:36 | you know, the first time you are using it, maybe you're kind of getting |
---|
0:49:42 | bogged down by that information diffusion that happens. |
---|
0:49:46 | But you know, a second-time or third-time user will just ignore all the incentives: |
---|
0:49:50 | I'm just going to do what I normally do. |
---|
0:49:52 | So every time you do that experiment, are you going to get a new set |
---|
0:49:55 | of |
---|
0:49:56 | people who are |
---|
0:49:57 | using this and then making decisions based on that? |
---|
0:50:02 | Okay, because there are some microphone problems I didn't hear clearly. But let me |
---|
0:50:08 | guess that I think you are saying that |
---|
0:50:10 | if they do it many times they have different experiences, so they may game the system, right? |
---|
0:50:16 | So I think it would not happen, because we use game theory to formulate that; |
---|
0:50:20 | we found |
---|
0:50:20 | the ultimate equilibrium in that solution, so |
---|
0:50:23 | basically it doesn't matter how they game it: |
---|
0:50:25 | they will have the best strategy that can achieve their best equilibrium solution for that, |
---|
0:50:30 | so it doesn't matter what they play. They only have |
---|
0:50:33 | one best strategy they can follow, |
---|
0:50:35 | and whatever other strategy would not get them anything better, so it's okay. |
---|
0:50:44 | Anymore questions? |
---|
0:50:53 | So it seems like they want to make a decision to go for coffee |
---|
0:50:57 | early, but before that |
---|
0:50:59 | I would like to show our gratitude to professor Ray Liu, and our vice-president |
---|
0:51:05 | Haizhou will present |
---|
0:51:06 | a souvenir to professor Liu. |
---|