Ethics and integrity in data collection

From April MediaWiki


Title:

Speaker: Puneet Kishor

Venue: RMLL2015 - Beauvais

Date: July 2015

Duration:

Link to the video

Transcript

00' Draft transcription by MO, cqfd93

Rencontres Mondiales du Logiciel Libre. Beauvais 2015

Presenter: [in French] Well, we are going to start the next talk; Corinne, you are with us, all is well. I now hand over to the speaker. Ah, your microphone is here. [in English] Your microphone is there. I shall not translate.


Puneet Kishor: What's that?


Presenter: I shall not translate, because...


Puneet Kishor: That's OK, OK.


Presenter: [in French] Is English OK for everyone?


Puneet Kishor: I apologize, I am going to talk in English, but it will give you a chance to practice your English with me. My French is much worse than your English; you don't want me to be doing that anyway. This is going to be a very different presentation, I think, from most of the presentations you've been hearing. Most of them have been about software. This is about matter issues, bigger issues; not bigger, I don't mean more noble, but bigger in terms of more complicated issues about ethics and integrity and what we can or cannot, or should or should not, do.

So hopefully you will find this of interest, and I will want your reactions to it. It's very good that the conversation, if I understand correctly, ended with a little bit of talk about terms of service and licenses ???. It's all right. I could get that, you know; my French is not good and my Spanish is not good and I don't know any Portuguese, but I could get that little bit.

I actually used to work for an organisation called Creative Commons. How many people have heard of Creative Commons?

I am surprised that you ??? not heard of Creative Commons. Creative Commons is the organization that makes copyright licenses, one of which is actually used by Wikipedia for everything that is published on Wikipedia. And CC licenses, as they are called, are Creative Commons copyright licenses. I worked at Creative Commons for three years as the manager of Science and Data policy.

So my focus is more on science and the application of licensing information to scientific data and scientific software.

In this presentation I'm going to go in a slightly different but related direction.

How many people here understand what is a license?

No, no, it's easier than meaning of life. Can you tell me in very short what is a license?

Audience: inaudible

Puneet Kishor: Very good! A license is a permission. You can do something with my work or whatever that I have licensed. A license is a permission given in advance without knowing what you may do or not do. Think of a notice on a park: it says "You can come and sit here and enjoy the park"; that's a license to enjoy the park. The person who's put up the notice doesn't know who's going to enjoy the park, but it has been put there for anyone to enjoy the park in the future; that's a license. A license is based upon some kind of underlying law. There is something that gives me the right to give you the permission, right? This is ??? computer; he gave me the permission to use his computer. If it was not his computer, he couldn't have given me per- well, he could have given me permission, but it wouldn't have meant anything, right? Because he doesn't have the right to give it to me. So in order for me to license something, I have to have the rights ??? I can license. In the ??? intellectual property, there is a right called Copyright Law. How many of you understand what is Copyright Law? Even generally.

05' Transcribed by cqfd93

Puneet Kishor: Can you tell me what is copyright law, short?

Audience: inaudible

Puneet Kishor: Someone else: Can you tell me what is copyright law?

Audience: inaudible

Puneet Kishor: Try it! No?

Puneet Kishor: Copyright law is a law that gives me the first right in the benefits that I may get from things I create, OK? So if I write poetry or if I write a song or make a film or make a Wikipedia page, I immediately get rights in it and I get the benefit, or the first chance of benefiting, from those. And then, based on that, I can give those rights to others, and I do that using a license. If you go to any Wikipedia page, or any page, and you go to the very bottom of it, terms of use, somewhere it will be written that "Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply". The person who wrote this had rights; that person then gave away those rights using this license. This license was made by Creative Commons, where I used to work. OK, so that's the connection.

OK. Now, let's come back to my talk. So you all are now experts in copyright law and you all are now experts in licenses. But there are things which are not covered by copyright law, and if they are not covered by copyright law, I don't have rights in them that I can license away, and if I can't do that, then how does the world work? And that is the subject of my talk. As you can see, I've gone beyond Creative Commons, basically.

Conventional science projects, and I'm using the word "conventional" to mean the most commonplace science projects that happen in universities and higher research institutions, if they involve human subjects, have to be approved. I don't know the situation in France, but I'm a hundred percent sure it probably is the same as in the United States. There is some independent body that has to approve your project to ensure that you're going to treat your human subjects with respect. In the United States, these bodies are called Institutional Review Boards (IRBs). When I want to make a project ??? and I want to study behaviour, or I want to study people and their behaviour on anything, it could be a social sciences project, it could be a health project, it really doesn't matter. If humans are involved, I have to get the project approved, and the IRBs, which are independent bodies, will review my project and they will ask me a lot of questions and they'll make it very difficult for me. In fact they will make sure that I'm doing everything correctly and that I am not going to do anything that will in any way harm or disrespect the humans that I'm studying. If I am going to be getting any data from humans, I will inform them; say, if I am going to be studying you, I will inform you in advance as to what I'm getting from you, and you will have the option to leave the study if you want. Understood? OK. So that's a very, very basic step in all science projects. IRBs are like the ethical watchdogs. Typically, IRBs do their review at the beginning of a project, and they review the project and then they say "yes you can do it" or "no you can't do it" or "yes you can do it but you have to make these corrections ???". All ??? Yeah?

09'53 Transcribed by cqfd93

If I am going too fast, let me know. I mean, I know when you people talk really fast in French I can't understand. I'm learning French, I understand if you speak slowly, and I can understand it's the same thing with English.

But what about … Citizen Science? Have you heard the term Citizen Science? Has anyone here heard the term Citizen Science? George, you have… no? Nobody has heard the term Citizen Science besides George? George, can you tell me what is Citizen Science? You can tell in French.

Audience: [in French] It's science done by non-specialists ???

Puneet Kishor : Well, so there's several kinds of Citizen Science, typically Citizen Science involves, it does involve a specialist, say me, but then I employ, not employ as in payment, but I recruit a lot of common citizens who are not specialists to help me do the project.

Have you heard of a project called "Galaxy Zoo"? Galaxy Zoo is a very famous Citizen Science project. Zooniverse is the platform on which Galaxy Zoo is based.

There's a very famous project called the Cornell Birds Survey. Every year, Cornell University in the United States does this bird survey where citizens from all over the United States for a specified period go out and count birds. And it's been going on for more than a decade. It's a very rich project, yes.

Audience: inaudible

Puneet Kishor: I wouldn't call it Citizen Science, although it does involve getting permission from the person on whose computer you're running SETI@home. I wouldn't call it Citizen Science, I would just call it more like "distributed computing", you know, that's really what I'm doing here, OK.

Audience: inaudible

Puneet Kishor:

Transcription stopped at 12'05

13'36 - transcribed by Juu, checked against the audio by cqfd93

Three kinds of open projects.

How do we approve, evaluate and monitor some citizen science projects, that's the theme of my presentation.

There are three kinds of projects according to a paper that I found.

Projects where citizens contribute some information, projects where they actually not just contribute some information, but they also help collaborate and help design or even analyze some information. Galaxy Zoo dot dot (??) you actually see some information and you tell whether it's a star or a nebula or... You know, you actually do something, you think about something and you make a judgement call.

And then the very, sort of, top end of the Citizen Science project would be where scientists and citizens get together and try and figure out what to study.

There is actually another, fourth kind of citizen science project that's happening a lot: self-organized. How many here have heard the term quantified-self? Can you tell me what's quantified-self?

Audience: inaudible

Puneet Kishor: Well, kind of. For example, my phone has a motion sensor. Every time I walk, it counts the number of steps I walked. And it basically allows me to keep track of how many steps I walked, and if I go here and click on a button, it'll tell me that today I walked five thousand steps. Five thousand one hundred and five, which actually is not a lot; I should be walking twice as much. It also tells me that I've climbed two floors, so I haven't done much climbing today. But quantified-self is, I mean, it could be anything: it could be how much you walk, it could be getting your blood pressure on a daily basis, it could be measuring your heartbeat on a daily basis. And there are people, there is a very weird place in this world, I don't know if you've heard of it, it's called San Francisco, where people are obsessed with this kind of stuff, and they are constantly measuring everything about themselves. They've got, like, you know, ?? everywhere and they are just measuring everything, which is why I ran away from there and came to Paris, where nobody seems to be obsessed by it at all. But that's quantified-self.

But people are taking this quantification further into analysis, and people are grouping their data together and trying to figure out what's wrong with them, trying to cure diseases. People who have certain kinds of diseases are building websites where they can collaborate and talk to each other and say "hey, you know, this is happening to me, is it happening to you also? I get headaches when I drink red wine, do you get headaches when you drink red wine also?". Things like that they are doing, right? These are sort of self-organized scientific projects that are happening.

So then these projects are happening outside conventional academies, they are not happening at the universities, they are not happening at Université Marie Curie, they are not happening at Stanford University, they are just happening at, just people, meeting together and doing these things, right? Who monitors these projects?

17'16 - transcribed by Juu, checked against the audio by cqfd93

How do we approve non-conventional projects?

So, the thing that I want to ask about is, and actually I'm going to ask you a lot of questions, I'm not gonna provide any answers. The thing that I'm really asking about is: how do we approve non-conventional projects?

If you decide to do a study on yourself, maybe you are taking samples out of your body, and measuring them or something. Is that ethical? Is it ethical to harm yourself? I mean the society says no. It is illegal to commit suicide. In many societies at least, in many societies. So the issue really becomes how do we evaluate and monitor projects that lie outside things that are governed by law?

Citizen science, sensors, self-measurement, participant-led research: that's one of the big things that are very popular. As I mentioned, people have certain diseases and they make a website where people with the same disease can come together and share their experiences. You know, irritable bowel syndrome, Crohn's disease, different kinds of cancers; a lot of people want some sort of comfort in a community, right? And they are sometimes giving each other advice, and they are doing it outside the mechanism of medicine and health laws and the institutions.

So what is the substitute for IRBs in this question, that's something that I'm thinking about.

19'04 - transcribed by Juu

What about ongoing monitoring?

And what about ongoing monitoring? Even if you approve such a project, even if you set up a system where you can approve some kind of project that's going on, how do you monitor it on an ongoing basis? Where people are doing things, they may be collecting data on others. What if I'm collecting data on you and misreport it? I write something bad about you, or I tell something good about myself that doesn't exist. You know, what if I recruit all of you to measure water samples from your village wells, and you find out it's not very good, and you decide not to report it, right? So these are the issues. Or if you find that somebody else's well is not very good, and that person hasn't reported it. Should you tell on that person? You know, because that is the issue of privacy that comes in. So, invading the privacy of others. If there's a citizen science project, let's say I recruited all of you because I'm studying the nesting habits of certain kinds of birds, and you all are bird lovers and I've recruited all of you, and you are supposed to go to the nests of the birds and take photographs and bring them back to me. Turns out that you're also a collector of eggs, and you steal the eggs, right? That's the issue. So harming existing data or harming the natural environment or cultural property, these are the issues when there is no mechanism for ongoing monitoring that might exist in a more conventional academy.

20'49 - transcribed by Cpm

Legal tools are…

So the reality is that existing legal tools, such as copyright law and ???, are inadequate: either they don't exist or, if they exist, they are inadequate, they are inappropriate, they are expensive. Nobody likes lawyers; lawyers are expensive and they are confusing, and they really scare us. How many of you have ever been in a court? No one. And a lot of people will never go to a court in their normal lives. I mean, a normal life, does it involve lawyers? Does it involve courts? And yet our life is ruled by laws. Right? So it is an interesting thing that we have all these laws and yet laws don't really, you know, come into play in our life on a daily basis.

21'41 - transcribed by Cpm

Slide 14/10 Do no evil

So, one solution could be "do no evil". You are ??? with that, right? You know, "do no evil"? That hasn't gone down very well. There is a big company that has this "do no evil" thing. And even they have done evil. So, maybe, the thing that I am thinking quite a bit about is just mutual respect and social contract. So how many of you have heard the term social contract? "Contrat social", here we go, French, yeah, Rousseau, yeah. So this is the notion that we give up something to get something. Right? We, individuals, ??? become members of a society or a country, we give up some ??? in return for the safety and other things that society provides. That's the social contract, right? I'll be a citizen of France and France will look after me, and other things. Somebody is laughing.

Audience: Yes, because maybe too much.

Puneet Kishor: Yeah. But anyway, that's the notion of social contract. This notion that there is something that binds us to be grouped together.

23'03 - transcribed by Juu, checked against the audio by cqfd93

Good behaviour by another name

So, here are different names for good behaviour. You know, a lot of conferences nowadays have this thing called a "code of conduct". And of course social contract. Doctors have this thing called the Hippocratic oath, you know, the little Rx, you know, "I'll never harm anyone, blabla". We have something called an honor code at university, I don't know if you have that here? In the United States there is an honor code that you will not cheat. Like, we can get exams where you take the exam to your home and you bring it back two or three days later, but it's an honor code that you will not ask someone else, you know. Mutual respect… So what I'm saying is, interestingly, there are things, they may not always work, but there are things out there which are not based in law. And they are designed to make communities work, OK? So can something like this be used, or maybe a combination of these things be used?

24'16 - transcribed by Juu

Importance of data integrity

One issue that becomes very important that I'm really interested in is the notion of data integrity.

This thing is telling me that I walked five thousand one hundred and five steps today. What if it's over-reporting? What if it's under-reporting? I don't know. Should I just believe it? We go through life believing a lot of things, not questioning them, right? Until we get some other evidence to the contrary.

There is a lot of focus in this conference and in my life, I worked at Creative Commons as I said, on open licenses, right? First of all, I guarantee you ninety percent of the people don't know what an open license means when they say "open license". OK, fair enough. Like, people don't know what organic means, but they shop for organic food, right? Open is good, but it is not a substitute for good science, because in the end science is asking some questions, and that is more important than anything. What would you rather have? Open but crappy science, or closed but good science? If you're a scientist you would probably choose good science, because a scientist is motivated by answering questions. By finding insights into something. So the question, and this is particularly useful not so much in software but in hardware. Open hardware. What if the design is open but the data coming out of the hardware are bad? So let's say I make a hardware, I made some fantastic sensor, you know, like the Star Wars tricorder, it can measure everything, and I publish it under an open license, right? And you come in, you see that, you like it, you download it; you're a great guy, but not very honest. You take my open design and you make some changes to it, or maybe you cut some corners, and make something which has an open license but now is not producing the right data. And what if this thing was measuring something that was important for environmental health or public health, maybe reporting on air quality, maybe reporting on water quality? That could have serious consequences for public health.

So the issue of data integrity is very important which has nothing to do with licensing, but it's very important for open science and the quality of science.

26'55 - transcribed by Juu

Evaluating data integrity

So, there is a study that I found where they found many ways in which you can actually evaluate data integrity.

By the way, all my ?? talk is on my website and, no software's required, just a browser, just click you know, it's a program I wrote and so it's available to anyone. So you can see all the links are there.

So you can measure different... Think of these like vectors along which you can measure data integrity. Is the data accessible, believable, complete, consistent, relevant, secure, etc. There are many different things you can measure; you can add more to this or subtract from this. They are dimensions that you can measure.
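Editorial illustration (not part of the talk): one way to picture these "vectors" is as a small profile with one score per dimension. The sketch below is only an assumption about how that could look; the dimension names come from the talk, while the 0.0 to 1.0 scale, the 0.5 threshold, and the function names are hypothetical.

```python
from typing import Optional

# Dimension names as listed in the talk; the 0.0-1.0 scale is an assumption.
DIMENSIONS = ("accessible", "believable", "complete",
              "consistent", "relevant", "secure")


def integrity_profile(scores: dict[str, float]) -> dict[str, Optional[float]]:
    """Build a profile with one entry per dimension.

    Dimensions that were not assessed are kept as None rather than
    being silently treated as good or bad.
    """
    for name, value in scores.items():
        if name not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {name}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"score for {name} must be between 0.0 and 1.0")
    return {name: scores.get(name) for name in DIMENSIONS}


def weak_dimensions(profile: dict[str, Optional[float]],
                    threshold: float = 0.5) -> list[str]:
    """Return the assessed dimensions that score below the threshold."""
    return [name for name, value in profile.items()
            if value is not None and value < threshold]


def unmeasured_dimensions(profile: dict[str, Optional[float]]) -> list[str]:
    """Return the dimensions nobody has assessed yet."""
    return [name for name, value in profile.items() if value is None]
```

For example, a well-water dataset scored as complete (0.9) but not very believable (0.4) or secure (0.2) would report "believable" and "secure" as weak, and the remaining dimensions as not yet measured.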

Building ??? a reputation, or think of it like social capital. This is very common in web communities, right? How many likes, for example, or how many retweets: this is one example of some kind of trust. We have reputation scores in communities, particularly software communities, where, you know, there is someone who's answered a lot of questions. Have people used Stack Overflow? Stack Overflow has this reputation system, basically, and as your reputation grows you can do more things, etc. So that's sort of like trust across social networks, and what I call co-calibration, where you can take yourself and calibrate yourself against someone else, or take a piece of hardware and calibrate it against a known truth, maybe a reference hardware. So, that's another way of evaluating data integrity.
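Editorial illustration (not part of the talk): calibrating a device against a reference can be sketched as fitting a straight-line correction from paired readings. This is a minimal sketch under assumptions: the linear model, the function names, and the step counts in the usage note are all hypothetical, and a real sensor may need a different model.

```python
def fit_calibration(device: list[float], reference: list[float]) -> tuple[float, float]:
    """Least-squares fit of reference = gain * device + offset.

    Both lists hold readings taken at the same moments, one from the
    device under test and one from the trusted reference.
    """
    n = len(device)
    if n != len(reference) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_d = sum(device) / n
    mean_r = sum(reference) / n
    var_d = sum((d - mean_d) ** 2 for d in device)
    if var_d == 0:
        raise ValueError("device readings must not all be identical")
    cov = sum((d - mean_d) * (r - mean_r) for d, r in zip(device, reference))
    gain = cov / var_d
    offset = mean_r - gain * mean_d
    return gain, offset


def calibrate(reading: float, gain: float, offset: float) -> float:
    """Apply the fitted correction to a new device reading."""
    return gain * reading + offset
```

As a usage example with made-up numbers: if a phone counts 900, 1800 and 4500 steps on walks where a reference counter shows 1000, 2000 and 5000, the fit gives a gain of about 1.11 and an offset of about 0, which suggests the phone under-reports by roughly ten percent.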

The bottom line is that there are mechanisms out there for making our lives run in a community fashion, without involving law. What are some of those mechanisms that can be taken together or combined into something that can be used to evaluate and monitor open science projects? And this is the thing that I actually find more interesting right now, in sort of my post-license world of work.

29'37

That's all the talk I have. I think I have a lot of time yet; I really want people...

35'35

Audience: For me, thank you for the talk...

37'45

Come on...

40'30

Audience: [in French] Shall I try in English or ...

43'49

Ask me anything.

45'21

Maybe in my culture...

46'23

... I have to working a lot...

47'00

Thank you all.