The ThinkND Podcast

Our Universe Revealed, Part 8: When Numbers Lie

Think ND



Episode Topic: When Numbers Lie

Data is the new gold, mined and used (for better or worse) every single day. Algorithms and artificial intelligence (AI) are then used to suggest what movies we stream, or send us grocery store coupons in the mail. When our data is used mindfully and ethically, this can be beneficial. When it's not, we could be putting ourselves at risk. Victoria Woodard, Ph.D., associate teaching professor and director of undergraduate studies for the Department of Applied and Computational Mathematics and Statistics at the University of Notre Dame, will help you understand how AI works (what is ChatGPT actually doing?!) and things to be concerned about when providing your data to others.

Featured Speakers:

  • Victoria Woodard Ph.D., Associate Teaching Professor and Director of Undergraduate Studies, Department of Applied and Computational Mathematics and Statistics, University of Notre Dame

Read this episode's recap over on the University of Notre Dame's open online learning community platform, ThinkND: https://go.nd.edu/afd4c9.

This podcast is a part of the ThinkND Series titled Our Universe Revealed.

Thanks for listening! The ThinkND Podcast is brought to you by ThinkND, the University of Notre Dame's online learning community. We connect you with videos, podcasts, articles, courses, and other resources to inspire minds and spark conversations on topics that matter to you — everything from faith and politics, to science, technology, and your career.

  • Learn more about ThinkND and register for upcoming live events at think.nd.edu.
  • Join our LinkedIn community for updates, episode clips, and more.

Welcome And Series Intro

Speaker 10

Good evening. How is everyone this evening? Welcome; this is the Our Universe Revealed Lecture Series. My name's Deb Maher. I'm a professor of ecology at Indiana University South Bend, and I'm serving as the moderator for the Our Universe Revealed Lecture Series. This series includes talks in the areas of science, music, and the arts, so STEAM for everyone. We feature current research and creative work that's being done in our region, and it's an opportunity for us to be curious about ourselves, our world, and our universe. This is a partnership between IU South Bend, the University of Notre Dame, and the St. Joseph County Public Library, which graciously hosts us. I also just wanted to make a note: there are exits in the back, in case you need that. And tonight it is my pleasure to introduce Victoria Woodard. Victoria Woodard is a statistics and data science professor at Notre Dame. She's been teaching statistics and data science in many forms for over 17 years, at all levels, from community college to the graduate level, as a teaching assistant and professor at public and private colleges and so forth. Victoria's Ph.D. is in mathematics and statistics education from North Carolina State University. She also has degrees in math and actuarial science from Ohio State University. No, not State. Ohio University. The Ohio... no, no. Oh my goodness. Okay, just Ohio University. And then most recently she got a master's in computer science from IU South Bend, because her students were asking her a lot of questions about the computer science aspect, and so she dove into that area. One of her interests is finding ways to best integrate technology into the classroom, and AI is one of those ways. Without further ado, I'm gonna turn it over to you.

Speaker 2

Thank you. Hi everyone. Thank you so much for being here tonight. I'm gonna start out by saying that I am not an expert in AI. As Deb mentioned, I just got this master's degree in computer science, and I was doing it because I wanted to learn about this. But I can also say that there are probably not too many experts in AI right now anyway. It is a very new field, and it is constantly changing. So I'll do my best to answer questions that you have about this, but there's probably not a correct answer, or one that I have at my fingertips at this point in time. So, who has heard of AI? Go ahead, raise your hands. Excellent. Okay, good. You're in the right place, 'cause that's what we're about to talk about tonight. Who could describe how AI works? Excellent. That's why we're here tonight. That's what I wanna tell you guys. So I'm gonna go through a process that I think might help us understand a little bit more about how the computer does AI stuff, and I'm gonna do that by having us think about how we think. So, in general, artificial intelligence, AI (by the way, my parents are in the audience; they told me they were gonna ask what AI stands for: it stands for artificial intelligence), is a process where we are simulating human intelligence in machines. I'm simplifying this down to a three-step process, and I think we can keep track of these three steps throughout. The first one is gonna be data collection and pre-processing. The second one is going to be training on data and creating algorithms, and then we're gonna do some evaluation. So before we learn how the computer does this, I actually want us to do this. I want us to think about how we make decisions. Because if we're gonna simulate human intelligence, let's practice our human intelligence.
So let's suppose we have an individual that's just walked through a door, and it's backlit, so we cannot see anything else about this person. But for some strange reason, we have this very weird ability that we can tell the person's height: this person is five foot two inches. And we as human beings are typically curious about classifying individuals, and maybe something we wanna do is say, is this person male or female? So who in here thinks that this person is most likely male? Who thinks this person is most likely female? Okay. So I would argue that in our brains, we probably have seen a lot of adults that are female that are five foot two. I have not seen a lot of adult males that are five foot two. Not that they don't exist, but if we're thinking of the most likely scenario, we go back to this corpus of data that we have seen, we parse out the information that's not about height and not about whether they're male or female, and then we make a decision based on that information. So in our case, going back to our three steps, if we're thinking about data collection and pre-processing: where have we seen this in the past? What in our past has brought us to this point of making this decision? So we're thinking about all of the people we've seen. How many of them were five two? How many of those were male? How many of those were female? And then we can make that decision based on that data. What rules have we come up with? So maybe the rule is not, if this person is five foot two they're female and everybody else is male. Maybe it's: if they're tall, they're male; if they're short, they're female. So we have something, maybe, in our brains that would help us to make that decision. And then finally, when a new situation comes up and we see a person that fits this rule that we have, we then can continue to strengthen the rule and believe that our assumptions are true. Okay.
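That height-threshold thought experiment can be sketched in a few lines of Python. This is a toy illustration only; all heights and labels below are invented, and the midpoint rule is just one simple way a "rule" could be learned from examples:

```python
# Toy version of the "is this person male or female?" rule from the talk.
# Training data: heights in inches with labels (all values invented).
heights = [62, 64, 63, 66, 69, 71, 70, 73]
labels  = ["F", "F", "F", "F", "M", "M", "M", "M"]

def learn_threshold(heights, labels):
    """Pick a cutoff between the groups: here, the midpoint between the
    tallest 'F' and the shortest 'M' seen in the training data."""
    tallest_f  = max(h for h, l in zip(heights, labels) if l == "F")
    shortest_m = min(h for h, l in zip(heights, labels) if l == "M")
    return (tallest_f + shortest_m) / 2

def classify(height, threshold):
    # The learned "rule": short, probably female; tall, probably male.
    return "F" if height < threshold else "M"

threshold = learn_threshold(heights, labels)  # 67.5 with the data above
print(classify(62, threshold))  # 5'2"  -> "F"
print(classify(73, threshold))  # 6'1"  -> "M"
print(classify(69, threshold))  # 5'9"  -> "M", but it sits close to the cutoff
```

Note that the ambiguous 5'9" case lands right near the threshold, which is exactly why the audience asked for more information.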
So what if, through our magical door, a person who is six foot one walks in? How many of you would think that this person is female? How many of you would think they're male? Okay, so our rule is telling us: short, probably female; tall, probably male. What if they're five foot nine? Any thoughts? What'd you say back there?

Speaker 3

Need more information?

Speaker 2

Yes. And back, back beyond that?

Speaker 4

Most likely male, because the average male height in the US is five-eight.

AI Benefits Overview

Speaker 2

Okay. I would agree with both of those things. It might be the case that it's most likely male, but there are still a lot of females now that are kind of creeping into this realm of being five nine. So more information would absolutely be helpful for us. So, thinking about our process that we're working on, we're gonna update that last one down there. When we get a situation that doesn't fit our rule, or we get something that's ambiguous to us, we might say, can I have some more information? Can I have a hint as to what's going on here? So I'm gonna give you a hint. Let's say that this five nine person walks in, and they're holding hands with a person that's six one. Do we think this person is male or female? We probably are gonna think that they're female, because the added information tells us: six one, probably male; their companion that they're holding hands with, at five nine, is likely female. But what if their companion was five two? Do we think the five nine person is male or female? Okay. Now, there's some bias involved in this, if you didn't recognize it. We live in a heteronormative society, and if two people are holding hands, we typically believe that they're going to be of opposite sexes. The data that we train our models on, both as our own intelligence understands it and as the computer understands it, is going to be based on anything that we input into it. So in society we typically see a man with a woman. That's not always the case, I will admit that, but I think for many of us, that is the data that we have seen in the past. So we were drawn to this idea of: if the companion is taller, then the person that's five foot nine is likely female; if the companion is shorter, then the five foot two companion is probably female, and the five foot nine person is male. So as far as the computer goes, I want to point out here, and I put this in quotes, the computer is not making "decisions."
It is not thinking. It is doing what we tell it to do, based on the data that we've provided it. It's gonna come up with a rule, and it's gonna output that rule, but it is not a decision. It is up to us as users of AI to use that information to make our own decisions. So the first thing that we're going to do is provide the computer with a lot of information, and I mean a lot of information. We wanna have thousands and thousands, like the entirety of Wikipedia, thrown into training our model in order for this to be good. We're then going to have the computer go through and look at that information and make its own rules. It's gonna try to identify its own patterns from what it's seeing in the data that we have provided it. Again, the computer is not thinking. What it's doing is answering the question: what is the most likely thing to occur here? And then finally, what we'll do is provide the computer with new test cases. We'll give it some new data, which we'll know the answer to, and we'll say, what do you think? Is this going to be male or female? And then it might update its rule as it starts seeing new things that maybe go against its rule, or it might reinforce its learning. It might say, oh, yeah, you've given me more data that fits into my current rule; I'll keep with that. All right. Who has heard something scary about AI? I have definitely heard some very scary things about AI, and I wanna address those. But before I do, I wanna know if anybody's heard any of the benefits of AI. Oh good, I'm so glad to hear that. I'm gonna go through some of the ones that I know about, some that I think are really interesting, to give you a taste of what it is that AI can actually do. And then I'll go through and address some of those scary things, but also how you, as citizens who use AI, can hopefully stop those things from happening.
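The evaluation step she describes, handing the trained rule new labeled cases and checking how it does, can be sketched like this. Everything here is invented for illustration; the 67.5-inch cutoff is just an assumed pre-learned rule:

```python
# Step 3 of the loop: evaluate the learned rule on new, labeled test cases.
# The threshold and all heights below are invented for illustration.

def classify(height, threshold=67.5):
    # The rule learned earlier: short, probably female; tall, probably male.
    return "F" if height < threshold else "M"

# Held-out test set: new data we already know the answers to.
test_cases = [(61, "F"), (65, "F"), (69, "F"), (70, "M"), (74, "M")]

correct = sum(1 for height, true_label in test_cases
              if classify(height) == true_label)
accuracy = correct / len(test_cases)
print(f"accuracy: {accuracy:.0%}")  # 4 of 5 right: the 5'9" woman breaks the rule

# If accuracy drops too low, we go back and retrain: update the threshold,
# or, as the audience suggested, ask for more informative features.
```

The one miss, a 5'9" person labeled "F", is the kind of counterexample that would push the model to update its rule or ask for more information.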
So one of the first things that I learned about when I learned AI was image recognition, or something called computer vision. The idea here is that we can provide lots and lots of data. So again, remember our model: we're gonna give lots and lots of data to the computer, and we're gonna have it come up with patterns. So somebody had to sit down and box off the cars, the trucks, the lights, the pedestrians, and tell the computer that that's what these things are, for a whole bunch of images. So you have those Google cars that drive around, taking all the pictures. Then somebody goes in and says, this is a car, this is a truck, don't hit this, stop when you see red. And then the computer starts to pick up on that, and as it looks at more and more images, we can train it to do this. So the computer is boxing off those things now and telling us what those are. We call that computer vision: it's what the computer is seeing in the images that we give to it. This is really cool, because this is what leads us to self-driving vehicles. The more and more we train it, the better and better those things will become. And eventually, hopefully, one day we'll have such good self-driving vehicles that we won't need to touch the steering wheel, and accidents will be basically gone. So that would be excellent. That's not the case right now. You've probably heard of Tesla in the news, having issues with their vehicles. Their model is not where it needs to be; the rules they've created are not quite where they need to be. But it's growing, it's learning, it's trying to reinforce and become better. Another place that this is used is in medical imaging. So right here we've got images of, I think, lungs. I think those were lungs. And what happens with this is the doctors will have these images taken of people's lungs, they will make their own diagnosis and say if this is cancerous or not cancerous, and then all of these pictures are fed to the computer.
And it makes a rule to see if it can identify cancerous and non-cancerous images, whatever it might be provided. This is for lungs. If any of you are at IUSB, Dr. Wolfer over in the computer science department does this with breast cancer imaging, and there are lots of other fields in the medical world that are using this as well. About 20 years ago, I worked as a typist. I was typing up legal descriptions; if you've ever refinanced your mortgage, you'd know what that is. And I remember sitting there one day saying, oh, I wish I could just read this and have it just type on the screen. That would be fantastic. And then I later thought, I'd be out of a job if that happened. Which is happening. But 10 years ago, I was coming home from a conference, and I pulled out my cell phone, put it on the seat next to me, and started talking about all of the ideas I had at this conference, and my cell phone was able to capture basically a pretty good idea of what I was saying and type up notes for me. Today, I get in my car and I say, hey car, take me to the library in downtown South Bend, and here I am. It got me here. I'd never been to this branch before. So, another thing about AI: speech recognition. We've got lots of things now that can identify what you're saying, type it out, and use it to search on Google. If any of you have one of those Amazon Alexas in your house, I don't like to say it out loud, 'cause if there's one around, it'll start listening. It's listening, it's gonna pick up that data, it's gonna try to go do a Google search with that info. But another really neat thing about my example of driving here: it found the most optimal route to get here. It didn't take me, you know, down Portage Road and around and do all this stupid stuff. It got me here in the least amount of time. So AI is also being used to help us get places quicker and safer.
We can do it without ever having to pick up the phone and text, or even, heaven forbid, have a MapQuest printout on our laps. Oh gosh, that was the worst. Terrible, terrible. How many of you have ever received a text from your bank that says, we have seen some suspicious activity? Okay, that is AI. We have a bunch of data that has been collected by banks of what users typically do, we have a bunch of data of what you typically do, and if you bring those things together, there's a rule that's made that says, this transaction is unusual; I'm gonna send a text alert, I'm gonna make sure that this is okay, and let's hope that there's no fraud going on there. And if there is, we can hopefully catch it then. Anybody play chess online, or any other games online? Okay. A lot of games today, if you are doing single player versus the computer, use AI to figure out the optimum set of moves in order to beat you. So there's a whole bunch of data that's been fed into this, a rule has been created for the best sequence that the game should follow at that point in order to beat you, and then you gotta try and figure out how to beat the game. Now, all of these are kind of interesting examples, because it is: let's feed it lots of information, and then it'll spit back out something it's seen before as its own highest probability. What we have here, for generative AI, is starting to get into a different realm. With this, what we're doing is saying: you know those images we fed you of different things, like cars and people and mushrooms? Draw a new one for me. So it's actually taking what it's learned, and its rules about what a mushroom looks like, and it's creating an artificially generated image. One of those is from my yard. The other one is a generated image. Any guesses as to the real one, left or right?

Speaker 5

Right. The one on the right,

ChatGPT And Mad Libs

Speaker 2

The one on the right is in fact the real image. The one on the left is a generated image. I'll do one more for this. Do I have any Swifties in the room? One of these is real and one of them is AI-generated. The real one is on the left; the generated one is on the right. But how scary would it be if you were to all of a sudden see an image of yourself online that you knew you had not taken? That would be kind of scary. I think we gotta be careful about this. So, generative AI can be images, like we saw here; it can be audio; or it can be text. And a big one that you might have heard about recently is ChatGPT. ChatGPT is a website you can go to. You can type in prompts, you can ask it questions, you can even ask it to give you back information in a certain tone. So what we're doing here is interacting through this thing called natural language processing. We talk in our own natural voice, and then it comes back with a set of words, and it makes sense. We read it and we think, who typed that? Where did this come from? What's actually happening there is: the whole of Wikipedia, a whole bunch of things on the internet, went into training this, and there was a method used to figure out what is the most likely next word that should be said in this process. And then when you type things in, it's gonna give you back an answer based on its most likely scenario of what is correct. It can absolutely be wrong, but it's getting better and better by the day, and it's really scary what it can do. So let me give you an example of what's going on with this. And again, we're gonna go back to this idea of, let's look at our intelligence to see how the computer is doing its own thing. So, any of you ever play Mad Libs before? Okay. All right. So: "I really love my [noun]," said Bob. What noun could go in there?

Speaker 7

Rocks.

Speaker 2

I love my rocks.

Speaker 7

Partner,

Speaker 2

what is it? Partner

Speaker 7

job.

Speaker 2

Job,

Speaker 7

car,

Speaker 2

Dog, cars. Lots of different nouns could go in there, right? Right now we don't have any context, other than Bob looks very happy and he says that he loves something. So we could add some context to this, and as we do, some things become more likely to fill in that blank. So now the prompt tells us: Bob looked at the gifts that his daughters had made for him. "I really love my [blank]," said Bob. What makes sense now?

Speaker 8

Daughters,

Speaker 2

Daughters, or gifts, are probably the two most likely things that would be generated at this point. What if we add something after that that says, "Having them was the best decision I ever made"? Daughters. Yeah. I mentioned my parents are in the audience, right? So even if we've got text before and after what we're trying to fill in, ChatGPT can look at all of the surrounding context and fill in the blanks that go there. So, as an example of this, I'm not gonna go through all of it, but you can kind of get an idea of what's going on here. I asked ChatGPT to describe ChatGPT to an 8-year-old, and you'll notice that the language it's using here is, I don't wanna say dumbed down, but it would be something that an 8-year-old would understand. It's talking about having a buddy, and this is cool, and it's really trying to promote itself as being a tool that an 8-year-old would probably want to use. There was part of, I think it's this one, maybe it's the next one, where it says it's "thinking," but it's not actually thinking. So I'm gonna say that over and over again, 'cause it's not actually thinking. I then asked ChatGPT to describe ChatGPT to an adult that doesn't really know much about AI, and you'll get the same general gist out of its response, but you'll notice that the language used in there has been elevated to something a little bit higher up. So this is kind of dangerous for college professors who are assigning writing assignments to their students, because their students could go in and say, "Hey ChatGPT, describe the reasons for the War of 1812. Do it in the voice of a student who doesn't know anything about this." Yeah. And I have actually heard of some students doing this, and they submit it, and they get the grade they would've gotten had they just spent the hour, or however long, writing the assignment anyway. And it's terrible.
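The Mad Libs game is a good stand-in for what a language model actually computes: given the surrounding context, which word is most probable? A toy version, counting which word follows a given word in a tiny invented corpus, might look like this (the sentences are made up for illustration; real models condition on far more context with neural networks):

```python
from collections import Counter

# Toy "most likely next word" model: count which word follows a given
# context word in a tiny training corpus (sentences invented here).
corpus = [
    "i really love my daughters",
    "i really love my dog",
    "i really love my daughters",
    "bob loves his daughters",
]

def next_word_counts(context, corpus):
    """Count every word that immediately follows `context` in the corpus."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        for i in range(len(words) - 1):
            if words[i] == context:
                counts[words[i + 1]] += 1
    return counts

counts = next_word_counts("my", corpus)
print(counts.most_common())         # "daughters" seen twice, "dog" once
print(counts.most_common(1)[0][0])  # -> "daughters", the most likely fill-in
```

The model isn't thinking about Bob or his family; it is just reporting the highest-frequency continuation it has seen, which is her point about ChatGPT.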
So, as an educator, I have my own opinions of how ChatGPT should be used. That's a whole different lecture that I could give. But it can be a very dangerous thing for us. Are we on time? Okay. So, how can AI go wrong? We've talked about this process: we're gonna give it lots of data, it's gonna make a rule, and then it's gonna give us back something in return. So it can go wrong in the way that it's trained. Here's some more generative AI. I plugged into the program I'm using, let me remember what that is. Oh, I don't have it up here. I think it's Wayfair, if I recall correctly. And I said, I would like some photos of government IDs of middle-aged women. Does anybody see anything wrong with this?

Speaker 3

They're all white.

Verify Before Action

Speaker 2

They are all white. There's not a single person of color there. I was like, oh, maybe it's just the eight that I got here. No. I did this over and over and over. I looked at a hundred images, and there was not one single person of color in those images. So to me, what that says is that whatever program I used to make this was trained on data, on images, that were likely of a lot of white people. And that can be very detrimental, because when I created the image of Bob, I said, give me a thirty-something happy person, and they were all white people too. So there's basically no chance that people of color are going to be represented in this, and that is not acceptable. So whenever you're looking at the output of these things, you should absolutely be mindful of how it was trained, because if it was trained like this one was, you're going to get biased results. It can also go wrong in the way that it's used. So it is out there. I was talking to somebody just before this, and it seems like we're coming up with ways to use AI for the sake of using AI, as opposed to finding ways to use AI because we have a problem that needs to be solved. So one of the ways that I found that this is being used: researchers at Harrisburg University announced that they had built facial recognition software that could predict whether someone is likely to become a criminal, based on a picture of their face. Well, if this is the data that they're using, then I'm afraid, because I'm going to be labeled as a criminal if this is all that it's being trained on. But my guess, and I don't know this for sure, is that they probably took pictures of inmates, people that are criminals, and then they used that to help train it. Because remember, you need to have some sort of label to identify these different things. And right now, in our justice system, we are overly incarcerating individuals of color.
And then that means that people of color are now going to be identified as criminals more often, just because they fit into this picture. And so we wanna avoid things like that happening. We definitely wanna make sure, if we are going to do something like this, that it's trained properly. But I just think this is a terrible idea all around. Something else that happened: there's a company called Clearview AI, and they harvested 30 billion images from social media on the internet. Now, you might say, okay, why are people putting their pictures out on the internet if they don't want them to be used anyway? But when you sign up for Facebook or all those other social media platforms, if you actually read the fine print, people are not supposed to go through and harvest those images. They're supposed to be for you and your friends, and maybe your friends' friends. They're not supposed to be out there for everybody to get ahold of. So this company violated the rules of the social media platforms in order to get those images. All right, so they're out on the internet anyway. Why does it matter? Well, if you have somebody using this to figure out criminal activity, and they find your picture, and it looks very similar to somebody else that's a criminal, you might have people knocking on your door because your picture matches something that was seen in one of these databases. So I think in this audience, I'm seeing nods that yes, this makes sense. But when I talk to 20-year-olds about this, they're like, I don't care, it's just data, whatever. It breaks my heart. Your data is your identity. It is your gold. And I tell the 20-year-olds, you probably already know this: don't give it away for a t-shirt. It's not worth a t-shirt. Be very, very careful with your data. It's important that you keep it your own, so it doesn't end up doing some harm out there. Now, some guardrails that we can put into place.
I've got an example here from the Writers Guild of America, and what they did with their negotiations as they were coming out of their strike earlier this year. One of the things that they negotiated was that AI cannot write or rewrite literary content. What this meant was that the studios could not just use AI and not hire writers; they were required to hire writers to do the job of writers. They could not use AI on their own. A writer could choose to use AI, so if they wanted to use ChatGPT to help them tweak a joke or something like that, they could, but the guild made it so that writers could not be mandated to do this. They did not have to use ChatGPT. And then finally, any content that has been written is not allowed to be thrown into the training. It is not allowed to be pushed into the training data that they have there in order to create new content in the future. So if we can come up with our own guardrails in our own situations, I think that will be very helpful. If you are a leader in your field, you absolutely should be thinking about these things: how can I protect my people, so that AI is not going to replace their jobs or make them not needed anymore, but instead enhance their jobs? That's what they did here. They're enhancing what the writers already do, by using ChatGPT if they so choose. And then finally, we need informed citizens. So I'm so happy you're here to hear this, because that's exactly what we need. We need people to be in the loop. You need to know what can go wrong, so that you can fix it, or you can at least hold it back. Results need to be verified before action. So earlier I mentioned that we have these images that are taken of lungs or breasts or things like that. It's not like the computer is outputting, hey, this is cancerous, and that's the end of it. The doctor takes another look at it and then tries to identify: why did the AI think that this was cancerous?
So we need professionals that can go through and say, yes, this is correct, or no, this is not right, and help evaluate and update those results as needed, but keep that human element in there. And then finally, results need to be explainable and appealable. The computer should not be the one making the decision that is the end-all, be-all of it. We're not gonna have something that says, you know, that person is a criminal, take them away, and then just not have any sort of way to bypass that. We wanna make sure that there is a person in between who's saying, wait, are you sure? I think of that Tom Cruise movie that came out. There were, like, three people whose brains were trying to

Speaker 10

Minority Report.

Speaker 2

Thank you. Minority Report, thank you very much. Good movie. That's exactly what I think of here. They just assumed that whatever the three individuals in that tub said was right, was right, and there was no oversight to it. So we need to have that oversight. I wanna thank you guys so much for coming. I hope you learned something valuable from this, and I'd be happy to answer any questions that you might have, if I can. Yes, sir.

Speaker 11

You mentioned students using ChatGPT to write papers.

Speaker 2

Mm-hmm.

Speaker 11

And if they do, how do you recognize that? That it's not written by them?

Speaker 2

So there are AI tools that will help figure out if students are not writing how they would typically write, or, if you don't have a sample of how they would typically write, if they're writing things that match other things on the internet. So Turnitin is a tool that can be used for

Speaker 11

that. That means... is that the same thing with the Writers Guild? Is this how they're going to figure out if somebody has used AI to generate some kind of a...

Bias In Image Prompts

Speaker 2

I don't know the answer to that for sure, but I would assume a very similar tool, if not that one, will be used. Yeah. Yes, sir.

Speaker 12

When you asked for the faces of middle-aged women, did you then later ask, hey, can you show me a diverse group of middle-aged women, just to see what it would give you? Because you're asking as a white, middle-aged woman, or maybe not middle-aged, but

Speaker 7

that's a mom.

Speaker 2

So I

Speaker 12

just, you know, wonder: did it have something to do with how you asked, or

Speaker 2

That's a good question. I did not go back and check that. And to be honest, my husband and I worked on this project together, and he was the one that actually typed it in. So I don't know why it did that, if it was maybe modeling based on things he had asked, but I don't think there was. My camera wasn't on; it wasn't looking at me and saying, oh, this person's white, I'm only gonna give back images of white people. So I don't know. That's a really good question. If I were to go in and ask for somebody of color, would that make a difference? I think it would; I think it would follow the instruction. But the fact that it's not doing it on its own, given what a good mixture the population currently is... I think that's terrible.

Speaker 3

Are those actual pictures, or are those generated? They're generated. Okay.

Speaker 2

Yes.

Speaker 3

So those aren't real people that you... Well, that's Janet.

Speaker 2

Yes, exactly.

Speaker 3

Counting.

Speaker 13

Exactly. What additional insights did the master's program bring to you in terms of going into machine learning and learning about AI now?

Speaker 2

Oh, that's a good question. So, I took a neural networks course and an AI course, both with Dr. Wolfer, and both of them were just super insightful into how the computer is used to make those rules. A lot of what I learned here, in what I presented, might not have been from those classes; some of it was, but I think more of this was my own insights from doing my own personal research on how AI could go wrong. But yeah, I think the program itself really teaches you how to get into the nitty-gritty of how you tell the computer to make those rules. Did that answer your question? Okay.

Speaker 10

Go ahead ma'am.

Speaker 7

I'm not even sure how to ask this, but I'm trying to get a sense of how either a program or a system is designed for a computer to gather input. I mean, it's not people just standing there, thousands of people inputting. So is it trawling data that already exists, or how does that work?

Speaker 2

So, for example, with the image recognition data, you've got bits of color. If you've ever used Microsoft Paint, you can zoom way in and you can see the bits of color. And that'll get coded as zeros and ones, or maybe something between zero and 255 to get the grayscale coloring of it. And then the whole image itself is seen as a set of pixels, and each one of those you could think of as a variable in an equation. And then the Y, the output, the response to that would be: is this cancerous or is this not cancerous? And yes, somebody did have to go in and type in, this is cancerous or not cancerous. Yeah. Now I have a question for you. Are you, like, a plant person? Oh, God. Okay. I think I've seen you online with your plants. Okay. I, I think it
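The idea the speaker describes, pixels as input variables and a hand-typed label as the output, can be sketched in a few lines. This is a minimal illustration, not how a real medical system works: the 2x2 "images," pixel values, and labels are all made up, and the classifier is the simplest one possible (nearest neighbor on raw pixels).

```python
# Each pixel of a tiny grayscale image (0-255) becomes one input variable,
# and the label is the output a human had to type in. All data here is
# invented purely for illustration.

def flatten(image):
    """Turn a 2-D grid of pixel values into one flat feature vector."""
    return [pixel for row in image for pixel in row]

# Tiny hand-labeled training set: one bright image, one dark image.
training = [
    ([[200, 210], [190, 220]], "not cancerous"),
    ([[30, 40], [20, 50]], "cancerous"),
]

def classify(image):
    """Nearest neighbor on raw pixels: return the label of the closest example."""
    features = flatten(image)

    def distance(example):
        return sum((a - b) ** 2 for a, b in zip(features, flatten(example[0])))

    return min(training, key=distance)[1]

print(classify([[25, 35], [30, 45]]))  # closest to the dark example: "cancerous"
```

A real system works on the same principle, just with millions of pixels, thousands of labeled images, and a far more flexible model than nearest neighbor.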

Speaker 7

was a generated image of me.

Speaker 2

I, that could be, yeah. Yeah. I think your companion had a question too. Go. Yeah.

Speaker 14

Um, I understand AI is hugely dependent on mountains of data.

Speaker 2

Mm-hmm.

Speaker 14

Are there any guidelines or standards on the quantity or quality of data that is input to these programs?

Teaching Kids With ChatGPT

Speaker 2

So, as a statistician, I would say that you should go for quality over quantity. But as a computer scientist, I'd say you should go for quantity over quality and then just hope that the bad-quality ones wash out. So I don't know if there's a rule that says you need 50 or 50,000. I think it's just a matter of, each time you evaluate and you see, oh, this didn't work on this particular image, you can keep figuring out new features, new variables essentially, that you can use to maybe modify and get a better answer. So it's not necessarily quantity or quality; it's kind of a mixture of both. And no, there's not really a set number of rules. You just keep going until you get the classification accuracy you're looking for. Yeah. Did that answer your question? Kind of, maybe. Okay. Yes.

Speaker 7

Given that AI is changing the nature of programming as it is today, among a lot of other things, what would you, or some of your colleagues, recommend educators lead children towards, to help them be prepared for the future?

Speaker 2

Uh, I would absolutely have them use something like ChatGPT. They're going to use it anyway, so I may as well teach them how to use it properly. And since AI is becoming so integrated into our society, there's no reason for me to be like, oh, that doesn't exist. So, for example, I teach a lot of computer programming, and ChatGPT can actually do a very good job of giving output for computer programs. And so what I'll do is I will give it a prompt, and I will then show my students the output that it gives me, and I'll ask them to think about the output and say, why did it choose to do this? Is there anything wrong with it? So they're evaluating the response of the AI, because I think they're gonna use the AI anyway, and that's probably how they're gonna use it most often. Yeah. Did you have a question? Yeah. You.
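To make the exercise concrete: the talk doesn't give a specific example, but the classroom prompt might produce something like the snippet below, code that works on typical input yet hides a flaw students can be asked to find. The function and test values here are my own invention, not from the talk.

```python
# The kind of code an AI assistant might plausibly produce for the prompt
# "write a function that averages a list of numbers". It runs fine on
# ordinary input, but students can be asked: what happens on an empty list?

def average(numbers):
    total = 0
    for n in numbers:
        total += n
    return total / len(numbers)  # raises ZeroDivisionError when numbers == []

print(average([2, 4, 6]))  # 4.0
```

Asking students to critique output like this (does it handle the empty case? should it validate input?) is exactly the evaluate-don't-just-accept habit the speaker describes.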

Speaker 5

Given that it's here, it's not going away. And you had those images of those women, and somebody asked, are they real, or were they generated? Is any company at all, maybe Google or Microsoft, coming up with software that someday could be loaded onto your computer, so that you would know, is this really real? That it could tell me if I should be believing what I'm seeing?

Speaker 2

That's a really good question. The answer is yes. I can't remember which company it is, but they are definitely coming up with that. I'll give you an interesting side story about that. There were criminals that had heard about generative AI, and they thought, well, if they catch me on camera and I'm wearing what looks like a sixth finger, they'll think it's a generated image and I can get away. So they're now trying to counterbalance that and saying, oh wait, even though that looks like it has a sixth finger, it's actually that person wearing a fake finger, and that is real. Or the opposite: this is a fake; somebody's trying to frame this person. Again, I don't remember who it is that's doing that, but I know I did read about them trying to come up with this identification. Yes.

Speaker 6

So there have been lots of examples of AI, like ChatGPT, generating false references, references that don't exist.

Speaker 2

Yes.

Speaker 6

And then when you challenge them, they become combative and say, yes, indeed they do exist. What's going on there?

Speaker 2

I have not seen that myself, but I would assume that since it's generating the next most likely word, it's giving you this reference, it might not be real, and it's not necessarily looking back at that point and saying, this is right or wrong. I know when I tell my device at home that I can talk to, hey, that's not what I was looking for, she says, sorry, and then that's about it. Like, there's no fixing it. And I think the same thing is happening there. Maybe there's no fixing it. I really don't know.

Speaker 6

Does AI have the ability to lie?

Speaker 2

I don't think it does. I think it has the ability to give you the next most likely word, and if it feels like the next most likely word is gonna be to say, nope, you're wrong, then it will. But it's not lying. I think lying is, again, a form of thinking, and it's not doing that. I don't believe it's doing that. Not yet. Yes.

Speaker 12

I think the term that's used is hallucinate. Yes. For generative AI. And a few months ago I was playing, and I asked it to write an obituary on me, and it seemed to hallucinate, because there is stuff available about me on the internet, and it got some of that right. Some of it, it just made up. It's like that first thing.

Speaker 2

Yes, yes. If it can't fill in the gaps, if it doesn't have the next most likely thing based on your context, it'll come up with its next most likely thing based on other context. And so it's using other obituaries and... yeah. Yes, sir.
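The "next most likely word" mechanism the speaker keeps returning to can be sketched with a toy bigram model. This is a deliberate oversimplification (real models use neural networks over huge corpora, not a frequency table over three sentences), but it shows the key property: the model always continues with something statistically plausible and never checks whether the result is true.

```python
# A toy "next most likely word" generator: count which word follows which
# in a tiny corpus, then always emit the most frequent follower. Nothing
# in this loop ever verifies truth -- it only continues plausibly, which
# is the mechanism behind hallucinated references and obituaries.
from collections import Counter, defaultdict

corpus = ("the model predicts a word . "
          "the model never checks a fact . "
          "the model just continues").split()

followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def generate(start, length):
    word, out = start, [start]
    for _ in range(length):
        if not followers[word]:
            break
        word = followers[word].most_common(1)[0][0]  # the next most likely word
        out.append(word)
    return " ".join(out)

print(generate("the", 5))  # "the model predicts a word ."
```

Note that the output reads fluently even though the "model" has no idea what any of the words mean; scale that up and you get confident, fluent, sometimes fabricated prose.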

Speaker 15

Uh, so regarding combativeness, I will note that these AIs are all trained on the internet, which is 50% arguments.

Speaker 2

That is a good point. Yeah. This is trained on data from the internet, so Yes.

Speaker 16

I, I think, you know, like so much data: garbage in, garbage

Speaker 2

out. Absolutely.

Speaker 16

So when you look at ChatGPT and you're looking at prompts, writing prompts, how important that is. Have we seen a correlation? And I think, to the points over here, it does sometimes spit out wrong information, perhaps because it's not designed to say, I don't know; it's designed to give you an answer for whatever you ask. Has there been any research or study done that you're aware of? If you give it too little information versus too much information in a prompt, if you put too many parameters in your prompt, is it more or less likely to spit out false information to satisfy your inquiry? Is there a sweet spot somewhere in the middle?

Speaker 2

I don't know that, but Dana, you might try to get a graduate student to figure that out. I think that would be a good study. I don't know the answer, but I think that's a really interesting question. Yes.

Speaker 9

Yeah. I just wanna share a quote that kind of puts it in perspective: yes, they don't lie, but they can be wrong. Somebody told me that AI can't make tacos. It can make a recipe, but it's up to us to make the recipe and know whether the tacos are any good.

Speaker 2

That's correct. And that's this idea of the intermediary. We need to be that intermediary: we have this thing that tells us, this is what you should do, and then we go forward from there. Without that intermediary, I think that's where we get to a really dangerous place.

Speaker 17

The fellow's comment there about making up legal cases made me try to prompt my memory. Michael Cohen, who was Trump's former lawyer and went to prison, apparently recently filed a legal brief for which he used AI, and it in fact made up a whole bunch of cases. And only after he had filed the brief did this come out. So that's kind of what you were saying: you had a person in the middle, but they weren't competent and paying attention.

Speaker 5

Yes.

Speaker 17

Going on.

Speaker 5

Yes, that.

Speaker 17

But one thing I would ask is about something I've heard, and I don't know enough about this, maybe you do: that a lot of job applications nowadays are being processed by AI before any human being sees them. So a lot of people are not getting into the first cut, and some worries that have been expressed are that stereotypes are being applied to the kind of information being used. I wonder if you know about those kinds of things and what's going on with that.

Ethics Groups

Speaker 2

Yes. So there is a book called Weapons of Math Destruction. I cannot for the life of me remember the author at the moment; she's got blue hair, she's a very nice lady. But she wrote about this as part of the book. What ended up happening was the data that went into the model were all the resumes of individuals that got hired, and then they used that to figure out, well, who would be the best candidates? And what they started to realize was they were really only hiring men, because they were only using resumes from way back when, when only men were really being hired for positions. So I think it could work, but you would have to be very, very careful to make sure that you're representing men and women, people of color, and it's almost a little difficult to think of all the biases that the computer might then have. So if you are going to do that, maybe use it as a launching point, just like I encourage my students to use ChatGPT as a launching point. You want to have that person in the middle saying, is this really working? And for the first hundred resumes that it goes through, have them double-check to make sure the rule that's being used is good. Yeah. But it could definitely be dangerous if they were to do that. Yeah. Yes.
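The failure mode described here, a model trained on resumes of past hires learning the biases baked into that history, can be shown with a toy scorer. Everything below is invented for illustration: the "resumes" are word lists, and the "model" just rewards words that appeared among past hires, so words that merely correlate with gender (like a men's club affiliation) get rewarded alongside genuine skills.

```python
# Toy sketch of historical bias in resume screening: score new resumes by
# how often their words appeared among past hires. If past hires were
# mostly men, gender-correlated words get rewarded even when two
# candidates have identical skills. All resumes here are made up.
from collections import Counter

past_hires = [
    "python statistics mens rugby club captain",
    "python databases mens chess club president",
    "statistics databases mens rugby club",
]

# "Learn" a rule: count which words appeared among past hires.
hired_words = Counter(word for resume in past_hires for word in resume.split())

def score(resume):
    """Higher score = resume looks more like past (historically male) hires."""
    return sum(hired_words[word] for word in resume.split())

candidate_a = "python statistics databases mens rugby club"    # mirrors history
candidate_b = "python statistics databases womens rugby club"  # same skills

print(score(candidate_a), score(candidate_b))  # a outranks b on one word
```

The two candidates differ only in "mens" versus "womens," yet the learned rule ranks them differently, which is why the speaker insists on a person in the middle auditing the rule before trusting it.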

Speaker 13

Since you work with students, are there any local groups in the area that are getting together, as a collective or as a community, to start working on this stuff or teaching each other and experimenting with it? Or do you know of any resources in the area that are doing this?

Speaker 2

So the one thing I can say is, I know for sure that the master's in data science at Notre Dame has an entire ethics course built into the curriculum, to make sure that the students who come out of it are using data for good and thinking about all the ways this could go wrong. I don't know of any local public groups that are doing this. Has anybody heard of anything like that? Unfortunately, I don't know that that's the case. Yeah. Yes.

Speaker 9

Are you aware of any institution or organization regulating this field, or is it like the Wild West out there?

Speaker 2

Currently it's kind of the Wild West out there. It's something they are trying to work on, but there's so much. You know, if somebody puts their picture on Facebook, technically it is open to the public and it can be used, although Facebook's regulations say you're not supposed to use it. So yeah, there's a lot going on to be considered, and it's currently like the wild, wild west out there.

Speaker 11

So a follow-up on that question. Yes. Does that mean every company accumulates data on its own and then uses it to create an artificial intelligence like ChatGPT, or is it just one?

Speaker 2

Uh, so ChatGPT is created by a company called OpenAI, and they mined a lot of data from the internet, from sources they could get to legally. But Facebook could ask the Amazon device, hey, what have those people talked about in the past 24 hours? Oh, they've talked about toothpaste; I'm gonna put a toothpaste advertisement on their Facebook feed this week. So some companies will use it internally, but some companies will sell your data to other companies so that they can then use that to make decisions and sell you products and things like that. Yep.

Speaker 6

Yeah, sort of related to that: there's, I think, a case before the Supreme Court that OpenAI is using copyrighted material, and it may come out that they can't do that. If that is the case, how is that gonna change AI?

Speaker 2

Oh, that's a good question. I don't know if the court would then rule that they have to retrain their product; I doubt they would make them do that. I think it would probably be the case that they would say, you can't do that anymore, you can't use this anymore. But I will say, if you are going to use ChatGPT, whatever you type into it becomes part of their corpus of training information, unless you get the paid version and say, don't use this. So that's why they're always telling you, don't put your corporate information in there, because then it does become the property of ChatGPT. Yes.

Speaker 8

I was just gonna say, an interesting thing I saw this week that resonated with me was that AI might not take your place, but someone who knows how to use AI might.

Preparing Future Programmers

Speaker 2

Yes. And I think that's part of becoming an informed citizen. You need to get in the loop. You need to know what's going on and how to use this so that you don't get replaced; make yourself useful in this world that we're in.

Speaker 9

Yeah, going on to the question I asked you earlier. I'm curious, as a computer science developer, how are universities thinking about preparing students, specifically those going into writing software in the future, with the mindset that AI can't write code yet, but in the future it will? Yeah,

Speaker 5

it will advance. Yeah.

Speaker 9

How are universities thinking about, or is it still too early to, prepare students to go into this new field? Or not a new field, but this field that is changing, as a software engineer?

Speaker 2

So are you asking, like, how do we prepare students to write the code, or how do we prepare students to use something like ChatGPT to write the code?

Speaker 9

Yeah, more like, I watched that YouTube talk at Harvard about how developers could be replaced or become almost like code reviewers, right? And I'm just curious if that's kind of what universities are thinking

Speaker 2

about Yeah. Okay.

Speaker 9

better and more efficient.

Speaker 2

Um, so I'm not in the computer science department at Notre Dame, and I don't know what's currently going on. Dana, are you guys doing anything in that realm

Speaker 10

currently? We're trying to make sure that our students can code.

Speaker 2

Okay. So what I would say is, do what the Writers Guild did and prepare our students to start making these demands: if you want programmers, you're gonna need to pay us. If you want just ChatGPT, every programmer is gonna strike against you, and you're not gonna have anybody to write any of it anyway. So I think this is actually the method. We need to be telling our students who are coders: you need to band together and make sure you know what demands you have, that the code you write can no longer be used to train new code, and things like that. Yeah. Yeah.

Speaker 18

I must admit that when I first saw this, when they made the agreement, I thought there's a delay in it, because eventually AI will do this. AI will write code.

Speaker 10

Mm-hmm.

Speaker 18

And so we have to prepare for that eventuality, I think. So even if we have these guards in place, they're just temporary guardrails.

Speaker 2

That could be true. I don't know how long they guaranteed that AI was not allowed to write anything for them, but you're absolutely right.

Speaker 18

But even if they have that, someone else could be doing it outside of that. They would just run around it: we're using AI, and we have our own company, and the next thing you know, they're producing.

Speaker 2

Yes, yes. So it's temporary. This is not a permanent fix. And as I've said a couple times, it's scary. It's definitely scary where these things are going.

Speaker 10

One more question just

Speaker 6

Oh, thanks. Yeah. So the singularity may or may not be coming. Does that frighten you? I mean, the idea of AGI, of machines just saying, I don't care what you do, I'm gonna be doing this. Are you optimistic or pessimistic?

Closing And Next Lecture

Speaker 2

I'm an optimist. I think that things will be okay, but I've always been an optimist about everything, so I don't know if I'm the person to be asking that. I definitely think there could be an AI revolution at some point, but I think a really good thing we can do is step in the middle of it and make sure we're not giving the computer the power to actually implement its decisions. Let it make decisions; we need to be the ones that judge if they're good and then implement them. I think that's our safeguard against that happening.

Speaker 3

Thank you very much.

Speaker 10

All right. Thanks so much for coming. I think we have a lot to think about. Maybe we do need a public group that is thinking about this, getting this information out there and thinking collectively as a community about how to address it. So anyway, thanks so much, Victoria. That was great. Our last Our Universe Revealed lecture, at least for the spring, is gonna be April 2nd, 6:30 PM, in this room, and we're thinking about eclipses in outer space: how can you use eclipse information to learn about other planets and whether other planets are habitable? Lauren Weiss from the Department of Physics and Astronomy will be leading that talk. And with that, thanks so much. Have a great evening.