Episode Transcript
IM: Sadie, I know this podcast is all about the practical applications of mathematics and statistics, but do you ever feel like you’re still stuck in the ivory tower?
SW: You mean, do I feel limited by talking with academics about their research?
IM: Yeah, I think almost every single one of our episodes focuses on a PhD researcher's field of interest. And compared to the rest of the human population, that means PhD researchers are seriously over-represented in our sample.
SW: Woooooow, throw statistics back at me to make a point huh? But yeah, you’re right. Invariably we end up focusing on research done within universities or research institutions. But there are interesting, fun ways that non-mathematicians can get involved in mathematical research.
Kathryn Leonard 07:15
The only way that we can map, sort of, our automated processes onto human perception is to get some humans to tell us what they're perceiving. And so that's where this project came from.
IM: But before we get to this project, we should introduce ourselves and our guest.
SW: Guests, plural actually! But you’re totally right. I’m Sadie Witkowski
IM: and I’m Ian Martin
SW: And you’re listening to Carry the Two, a podcast from the Institute for Mathematical and Statistical Innovation, AKA IMSI.
IM: This is the podcast where Sadie and I talk about the real world applications of mathematical and statistical research.
SW: We might seem like an odd couple to tackle these topics. I’m a cognitive neuroscientist and Ian is a high school choir teacher!
IM: But today, we’re going to show you that you don’t need to have a degree in mathematics to help further research in the mathematical fields.
SW: So let’s meet our lead researchers who put together a study that relied on lots and lots of non-expert people participating. First up, we have Axel.
Axel Carlier 02:25
So crowdsourcing is, let's say, amorphous in terms of the motivation of the people who actually provide the annotations, insights about media, or something.
SW: Axel Carlier is an assistant professor at the University of Toulouse in France, and he conducted this particular study with Kathryn Leonard, a fellow professor at Occidental College in Los Angeles.
Kathryn Leonard 28:16
Perceptual studies are often a part of mathematical shape research, if you're building models to represent a thing that humans care about.
IM: So I’m hearing about shapes and crowdsourcing… how does this tie into non-academic research again?
SW: Right, we’re kind of throwing around three separate terms here: crowdsourcing, citizen science, and community science.
IM: Crowdsourcing I at least feel like I’ve heard of. That’s when you get a crowd of volunteers to help out on a project, collecting data or doing a specific task. It could be as simple as an online poll…
SW: Or it could be more complex, like responding to "captcha" requests where you have to identify all the images with a bicycle in them. By getting lots of people to answer that question across a set of images, you can build a big dataset that you can use to train a visual machine learning system.
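[For readers following along in the transcript: a minimal sketch, in Python, of how crowd responses like these become training labels. The images and answers here are hypothetical; the point is just the majority-vote aggregation.]

```python
from collections import Counter

# Hypothetical crowd responses: several people answer
# "is there a bicycle in this image?" for each image.
responses = {
    "img_001": ["bike", "bike", "no_bike"],
    "img_002": ["no_bike", "no_bike", "no_bike"],
    "img_003": ["bike", "no_bike", "bike", "bike"],
}

# Majority vote: the most common answer becomes the training label.
labels = {image: Counter(votes).most_common(1)[0][0]
          for image, votes in responses.items()}

print(labels)  # {'img_001': 'bike', 'img_002': 'no_bike', 'img_003': 'bike'}
```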
IM: Right, I’ve heard that before. So how does that differ from citizen or community science?
SW: My understanding is that citizen science is a slightly older and more common term, but is similar to community science. I like National Geographic’s definition of citizen science: the practice of public participation and collaboration in scientific research to increase scientific knowledge.
IM: Ok, so then how is that different from community science?
SW: Well, citizen science was originally meant to distinguish amateur data collection or analysis from that done by professional scientists, but it's taken on connotations around citizenship status. I.e., where you were born or how you came to live in the United States.
IM: So community science is a more inclusive term for the same idea.
SW: And using the term community science emphasizes the social nature of such data collection or analysis. That it's about collective action, undertaken by a group of people, not just individuals.
IM: Honestly, these all kind of sound like different ways of saying the same thing. A couch is a sofa is a place where I spend most of my free time.
SW: I could make an argument between the three, but when it comes to statistical research like what we’re discussing today, maybe crowdsourcing is the best term?
Axel Carlier 01:14
Because crowdsourcing is a term that has been created to designate these situations where you need to ask people that are not necessarily experts in the field their advice on something, which in our case was: how do you segment the shape? And so in crowdsourcing, people don't necessarily know that they are actually helping science in any way. Whereas in citizen science, they know, because it's been made clear. So I would at least make this difference. In our case, we may let them know, and we did that, but it's not necessarily always the case.
IM: Oooooo, so like MTurk pays people to distinguish between a corgi butt and a loaf of bread? You might not know how you're helping science, even though computer scientists are using your responses to teach AI to tell the difference between them.
SW: Yeah. Axel actually brought up the example of MTurk, or Mechanical Turk, as a really common way we see this kind of crowdsourcing for scientific research. Of course…
Kathryn Leonard 03:09
And Mechanical Turk, that Axel mentioned, often pays incredibly tiny amounts of money, from a US dollar perspective, to people in other countries. And again, that's kind of exploitative. Community science and citizen science, on the other hand, are, I would imagine, fully voluntary participation, a conscious decision to participate.
IM: That makes total sense. But let’s get to the meat of this episode! How are Kathryn & Axel using crowdsourcing for research?
SW: Let’s start with the ‘why.’
IM: [fake, annoyed sigh] WHY are Kathryn and Axel using crowdsourcing in their research?
SW: [annoying bright] I’m so glad you asked!
Kathryn Leonard 05:31
Human perception is super wriggly. So, a few things. One: in the '20s, a psychologist tried to write down a list of rules for perceptual similarity. And for every rule that he wrote down, he could find a bunch of examples that followed the rule and a bunch of examples that didn't follow the rule. And so he had to add another rule. And his conclusion was, there is no set of rules that can encompass human perception of similarity.
SW: Kathryn explains that their research on understanding shape starts from this primary conundrum. It’s really hard to parse complex shapes into their constituent parts.
Kathryn Leonard 04:36
So this project came out of a much larger project on shape understanding. So if we want to automate processes that can take a new shape and figure out what its parts are, figure out which parts are similar to each other, and maybe based on that similarity determine that this is an elephant and not a refrigerator, that requires kind of an understanding of what the important parts are to humans.
IM: So if we want to teach computers to recognize parts of an animal or distinguish between two refrigerators with different organization, we need to understand shape perception.
SW: AND we need to understand how humans do this weird, poorly defined process before we can even hope to teach computers to do something similar.
IM: Hence, the crowdsourcing?
SW: Exactly.
IM: How did they advertise this project anyways? I mean, I’m all for furthering science and whatnot, but I wouldn’t even know where to get involved with a project like this.
SW: You’re not wrong. Most community science projects are run by nature conservation organizations or science museums or other non-profit organizations. For this research… they took a different tack.
Kathryn Leonard 10:36
So Misha Collins is an actor who was on Supernatural
IM: WAIT WAIT WAIT. Isn’t Supernatural that show you’ve been obsessing over? With the two guys from Texas?
SW: Ok I wouldn’t say obsessing…
IM: [interrupting] I would. Did this episode just become a ploy to make me talk about Supernatural?
SW: No! I promise. We’re getting there.
Kathryn Leonard 11:02
He likes to get people together to do elaborate and pointless activities. And so he developed the GISHWHES scavenger hunts. GISHWHES stands for the Greatest International Scavenger Hunt the World Has Ever Seen. The first year was just kind of a small little by-mail thing. And the second year, he took it online, and he needed help coming up with items for the scavenger hunt list.
IM: Don’t tell me, they somehow got their study added as a scavenger hunt item?
SW: [laughs nervously] Eh-heh, you guessed it! Not that this was Kathryn’s first time contributing items to GISH, as it was later abbreviated.
Kathryn Leonard 11:38
Sometimes we would troll people. Like, there was one year that we put the Riemann hypothesis on; proving it was one of the items. So we had little inside jokes. But we also got into the fun.
IM: Riemann hypothesis?
SW: It’s a famous mathematical conjecture that many consider to be an important unsolved problem in pure mathematics.
IM: And they made THAT a scavenger hunt item?
SW: Well the item actually ended up being a sort of joke reference to it… but we’re getting off topic.
IM: Right, [joking] ‘cause you want to go back to talking about Supernatural.
SW: Because I should explain why crowdsourcing through GISH was particularly helpful for their study.
Kathryn Leonard 12:03
I simultaneously was working on this research project where we needed to crowdsource. And so I thought, aha, I will put this as an item on the scavenger hunt. And then people will be super excited to do it, and we'll get a lot of responses. It's international, so you won't have the usual Western bias that you often have in perceptual user studies. And the teams will get points for their participation, and so they'll want to do a good job. And so we did.
SW: Remember, the I in GISH is for international. A lot of crowdsourced research has issues with bias that comes from all the people looking at the images having the same cultural background.
IM: Ooooo, so that can skew the data. But in this case, you have people from all over participating.
SW: and you have highly motivated people!
IM: Right, you want to do a good job to earn as many points for your team as you can.
SW: And a third aspect that Kathryn and Axel didn’t mention was that by connecting their research to this event, they were able to collect a bunch of responses within a relatively short period of time.
IM: Because the scavenger hunt had a deadline.
SW: Although, that kind of turned out to almost be a problem.
Axel Carlier 14:44
We had to launch our task at a particular moment, and there were a few days and then it was over. So it had to work, because otherwise there was no second chance. And actually we had some difficulties, technical ones; we can just say bugs, basically. Which we managed to correct on time, but it was not so easy.
IM: Let’s come back to the difficulties, but what even were the images that people were segmenting?
SW: Well it’s a collection of a bunch of natural and man-made objects, called the MPEG-7.
Axel Carlier 09:27
The distinction between natural objects, like animals, for example, or humans (actually, I think there are humans in the dataset), in contrast with the artificial objects that have been made by humans, was important in the dataset, because clearly, there are semantics that come into play. And of course, the semantics can be different if the objects are natural or artificial, so this was important in our choice, I think.
IM: Ok, so like a picture of a dog but also a picture of a tractor?
SW: Yeah, but without any color or detail and just the outline.
IM: Ok, so what was the difficulty that Axel mentioned?
Axel Carlier 15:22
What I actually didn't anticipate was the number of participants that would do it at the same time.
15:38
When our task was posted, a ton of participants directly connected to the website, which was hosted on the university server, and not a very powerful one. And so the machine just crashed.
SW: [laughing] Basically, the university server wasn't ready for so many people to try to access it all at once! GISH accidentally DDoS-attacked the server! You know, a distributed denial of service: when a website gets flooded with requests and shuts down.
IM: But seeing as they still look back on this project fondly, I’m guessing they managed to make it work?
SW: Yeah, they were able to get it running smoothly enough to collect quite a few crowdsourced responses.
Axel Carlier 16:09
We collected quite a lot of data. I think in the end more than 1,000 participants annotated shapes, and on all our shapes, which was nice. Because since we know that people have different perceptions of the shapes, we wanted to have redundancy, so multiple people annotating the same shapes. So that we can actually infer that, yes, there are multiple different interpretations, and all are valid.
IM: Once they had all these images that multiple people had broken into parts, how did they analyze it?
SW: Let’s take a short break and I’ll explain when we come back.
[musical break]
SW: Now that we're back, let's talk about what Axel and Kathryn actually found: how learning the way humans break up the visual world can help us better teach machines to do the same thing.
IM: Right, so through GISH, Kathryn and Axel had over 1,000 people take these images and segment them into parts. But did they tell them to break up, like, an image of a cow into the hooves and the flanks and the snout? Or…?
SW: Actually, no. They couldn't give specific instructions, because that would bias how people segmented the images.
Axel Carlier 25:57
What is difficult in this tutorial is that you should try to avoid biasing the participants. So the goal was not to explain to them, "you should annotate like this." It was just to tell them, "this is the tool that you…" It's really to discover the tool, and then let them freely annotate.
IM: Honestly, it's not like people follow instructions anyway. I can write out very clear directions for a worksheet or test or something, and then also tell my students exactly what to do a million times, and I will still get at least three kids who have the audacity to ask, "Wait. What are we doing?" It's like they want you to come explain it to them individually.
SW: Funny you should say that…
Axel Carlier 25:23
One thing you learn when you do these kinds of experiments online is that people do not read the instructions. That's the rule of thumb. So you should always assume that people won't read the instructions, and you should try to at least make them do something to show that they understand, to some extent, what they are doing.
SW: [laughs] Turns out no one reads instructions apparently?! But that didn’t stop Axel from trying to write something clear and concise that participants would understand.
Axel Carlier 26:52
[You] should have lots of people read the instructions again to make sure they're understandable. Especially when, like me, you're writing in English and it's not your mother language.
Axel Carlier 27:24
And especially, it should be people who are not in math or computer science, because most of the people who participate in GISHWHES are probably not computer scientists or mathematicians. So yes, they will react differently than just the colleagues at the lab.
IM: Right, get an outside opinion on how you’re teaching people to do the task, and then discover no one reads said instructions. Sounds about right.
SW: [Laughs] The bane of every instructor's existence.
IM: Ok, so let’s skip to the part where they collected all the data, all the segmented images.
SW: Right! So the first step was to compare the annotations for each image to see how much agreement there was between GISHers. Er, I mean people. And they did this through spectral clustering.
IM: Spectral clustering?
SW: It’s a multivariate statistical technique…
IM: [interrupting] Jargon Sadie!
SW: Imagine a bullseye made out of a bunch of discrete points: an inner dot surrounded by an outer ring. If we use traditional clustering methods, we can't separate the inner dot from the outer ring. But spectral clustering lets you recognize those patterns in the data as separate groups.
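[A minimal sketch of Sadie's bullseye example, in Python with scikit-learn; the tooling is our assumption, since the episode doesn't name any. k-means, a traditional method, cuts straight through both rings, while spectral clustering separates the inner dot from the outer ring.]

```python
from sklearn.datasets import make_circles
from sklearn.cluster import KMeans, SpectralClustering

# A "bullseye": an inner cluster of points surrounded by an outer ring.
X, _ = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

# Traditional k-means splits the plane in half, mixing the two rings.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Spectral clustering on a nearest-neighbors graph recovers dot vs. ring.
spectral_labels = SpectralClustering(
    n_clusters=2, affinity="nearest_neighbors", n_neighbors=10, random_state=0
).fit_predict(X)
```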
Axel Carlier 19:40
So for a given shape, let's say you have 25 annotations. So you get a 25-by-25 matrix, which tells you, for each pair of annotations, how different they are. And once you did that, you can cluster the annotations using spectral clustering. What that means is that each cluster regroups annotations that look a lot like each other. And from there, we produced some kind of majority vote of the annotations within the same cluster to produce the final annotation, which is a summary of all these annotations from the same cluster.
IM: I thought I understood, but I think I’m still kind of lost.
SW: Ok, let's try a teacher-based example. You know those plastic transparencies we used to put up on projectors before PowerPoint slides?
IM: Ugh I always loved being called to write on them for the teacher...
SW: So imagine everyone got the same outline of a cat that they had to break into parts. You could then lay everyone's work on top of each other to see whether we all segmented the tail as one part and the paws as another part of the cat.
Kathryn Leonard 21:16
Spectral clustering is basically just a fancy way of looking at all the different annotations and seeing if users clump. So, are there a bunch of users who annotated the shape the same way, more or less, in a few different clumps?
IM: Makes sense. So we want to see the level of agreement between people and spectral clustering is the math-y way to do it.
SW: Or the statistical way, but basically, yeah. And they wanted to get a sense of the average area of agreement over the total area of the shape.
Kathryn Leonard 20:47
So if two shapes were annotated exactly the same, they would have a distance of zero. And if 1% of the area was annotated differently, then they would have a distance of point zero one. And if two shapes were annotated completely differently, they'd have a distance of one.
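[Putting the pieces Axel and Kathryn describe into one runnable sketch: each annotation assigns a part label to every pixel of a shape, the distance between two annotations is the fraction of pixels labeled differently, spectral clustering groups similar annotations, and a per-pixel majority vote summarizes each cluster. The data here is hypothetical, and this is our reconstruction in Python, not the researchers' actual code.]

```python
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
n_annotations, n_pixels = 25, 1000
# Hypothetical annotations: a part label (0, 1, or 2) for every pixel.
annotations = rng.integers(0, 3, size=(n_annotations, n_pixels))

# 25-by-25 distance matrix: the fraction of pixels where two
# annotations disagree (0 = identical, 1 = completely different).
distances = np.array([[np.mean(a != b) for b in annotations]
                      for a in annotations])

# Spectral clustering expects similarities, so flip the distances.
cluster_ids = SpectralClustering(
    n_clusters=2, affinity="precomputed", random_state=0
).fit_predict(1.0 - distances)

# A per-pixel majority vote within each cluster gives one
# representative annotation per cluster.
for c in np.unique(cluster_ids):
    members = annotations[cluster_ids == c]
    representative = np.array(
        [np.bincount(column).argmax() for column in members.T])
```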
SW: And now, putting it all together…
Kathryn Leonard 21:33
It really helped tease out the fact that for a lot of these shapes, one representative annotation was not actually appropriate. Going back to that: there's no right way to do this. There's no one correct human perceptual approach. And so I think, too, having two representatives did a good job of representing most of the shapes, but some of them clearly want three. And some of them want even more.
IM: So there’s no one way to skin a cat?
SW: [laughs] Well, segment a cat. But yeah! Sometimes people agreed on one or two ways to break an image into parts. And for some other images, it seemed like there was a lot more variation in how people defined the parts or sections.
IM: Seems like there’s still more research to be done.
SW: Isn’t that always the case?
Kathryn Leonard 22:07
One of the cool things about this project was we created this dataset in order to answer a research question that we had, to validate our own approach. But in fact, the dataset itself raises a bunch of other questions. And there are always more projects than there is time. And so there are a lot of really interesting questions still unanswered. Like, looking at sort of the geometric configuration of a shape, can you tell in advance how many clusters you should expect to have in the annotations? Or, some of the shapes were rotated versions of each other. Does rotation of a known shape like a cow, something that we sort of map onto a real object in the world, affect people's annotations? Or is it pretty consistent over rotation? As compared to an artificial shape, something that doesn't exist in the world, so it's just a geometric abstraction: does that get different annotations when rotated, or does it tend to be more consistent there? There are a ton of questions that I want to know the answers to, and I don't have time to go do the research.
[musical break]
IM: So going back to what I was saying at the beginning of the episode, I still think this falls under research conducted by academics.
SW: It does, but it also recognizes this really important role that non-experts have to play. We SPECIFICALLY need people who aren’t mathematicians or statisticians to participate in order to get studies like this to work.
IM: Is there a coda to this story? Do you have updates on either their study or GISH?
SW: I actually have updates on both! First, when I spoke with Kathryn & Axel, I found out that they’ve had requests to share both their data and how they ran this study with other labs who want to replicate it.
IM: That’s cool! More crowdsourcing research clearly needs to be done!
SW: But sadly, it won't be done through GISH.
IM: Why not?
SW: Well, the annual GISH competition did make it a habit to include community science and crowdsourcing research projects in later years. For example, in 2022, one of the challenges was to help NASA by processing an image from the JWST as part of the NASA data challenge.
IM: But you said did?
SW: Yeahhhh… full disclosure, I did participate in GISH last year. Shout out to my team, Off Grid GISHers! But afterwards, we found out it was the last year Misha Collins would be running GISH. I think it was an incredible amount of work, so the scavenger hunt has been put on indefinite hiatus.
IM: That’s too bad. You'll have to find some other ways to nurture your parasocial relationship with the actors on Supernatural.
SW: But GISH isn't the only way you can get involved with community science or crowdsourcing for science!
IM: Yeah, who needs a Supernatural actor to convince folks to get involved with research! [looks pointedly at Sadie]
SW: Ok, ok so I might like a show. Sheesh!
IM: We’ll include some links in the show notes to other cool community science projects for our listeners to check out and participate in as part of mathematical and statistical awareness month!
SW: Including some of Axel’s work.
Axel Carlier 29:06
I learned a lot from it in terms of good practice. And so nowadays I'm more focused on, let's say, applying deep learning techniques for image processing, which requires annotated data. And so for example, I have a project right now where I work with ecologists to recognize birds in images, so we need annotation, basically. And there is no public database that contains enough of these annotated images. So I have to design, again, interfaces, user studies, I mean, to gather data. So definitely, I would say that I benefit from the experience, that's for sure.
IM: Well Sadie, do you have any last thoughts - supernatural or otherwise - that we should share before we go?
SW: Actually, just one. And it’s not my thought, but from Kathryn.
Kathryn Leonard 31:01
Working with Misha, who has a massive fan base, was very helpful towards getting a really great crowd. So I highly recommend collaborating with celebrities when you're trying to do a user study. But the other thing that I will say is that because of this paper, Misha is one of the very few people in the world who has an Erdos-Bacon number. So the Bacon number is how many degrees of separation you have from starring in a movie with Kevin Bacon. And the Erdos number is how many degrees of separation you have from publishing with Paul Erdos. Very few people are both actors and, you know, people publishing papers in some chain with Erdos. But this paper gave him an Erdos-Bacon number, and it is six.
IM: WOW, had to lean into the nerdy-ness there at the end huh?
SW: For those who don’t know, Paul Erdos was a famous and prolific Hungarian mathematician who wrote around 1500 mathematical articles in his lifetime.
IM: So Misha Collins can say he’s both X degrees from actor Kevin Bacon and Y degrees from Paul Erdos for a combined number of 6? Wow that’s nerdy, even for us.
[outro music starts]
SW: Don’t forget to check out our show notes in the podcast description for more about how to get involved with community science in a field that interests you.
IM: And if you like the show, give us a review on Apple Podcasts or Spotify or wherever you listen. By rating and reviewing the show, you really help us spread the word about Carry the Two so that other listeners can discover us.
SW: And for more on the math research being shared at IMSI, be sure to check us out online at our homepage: IMSI dot institute. We're also on Twitter at IMSI underscore institute, as well as Instagram at IMSI dot institute! That's IMSI, spelled I M S I.
IM: And do you have a burning math question? Maybe you have an idea for a story on how mathematics and statistics connect with the world around us. Send us an email with your idea!
SW: You can send your feedback, ideas, and more to sadiewit AT IMSI dot institute. That’s S A D I E W I T at I M S I dot institute.
IM: We’d also like to thank our audio engineer, Tyler Damme for his production on the show. And music is from Blue Dot Sessions.
SW: Lastly, Carry the Two is made possible by the Institute for Mathematical and Statistical Innovation, located on the gorgeous campus of the University of Chicago. We are supported by the National Science Foundation and the University of Chicago.