Ethical questions surrounding the release of OKCupid data

#11
(2016-May-13, 10:31:55)Emil Wrote:
(2016-May-12, 22:02:03)MichaelZimmer Wrote: I also posted a set of questions here about the research ethics variables of the project (also emailed to the author), but that post has been removed without any communication to me. Addressing research ethics is central to peer-review.

[Edited to add] I also notice that the dataset is now password-protected.


You sent the same questions by email, to which you received an answer. However, because you were impatient and decided to duplicate the questions here as well, I removed the forum version.

I recommend some patience in receiving answers and not duplicating content.


This is active restructuring of the debate, and akin to censorship. The proper way to avoid duplication would have been to post your answer to Michael Zimmer on the public forum, not in a private email.
#12
(2016-May-13, 10:30:15)Emil Wrote: Note: non-scientific discussion of the dataset is moved to this thread. Peer review threads are for just that, actual scientific peer review. Yes, that does mean that your discussion posts about that topic go into that thread.


It appears that the "non-scientific" thread that is providing peer-review of the research methodology has been deleted.
#13
(2016-May-13, 14:38:44)MichaelZimmer Wrote:
(2016-May-13, 10:30:15)Emil Wrote: Note: non-scientific discussion of the dataset is moved to this thread. Peer review threads are for just that, actual scientific peer review. Yes, that does mean that your discussion posts about that topic go into that thread.


It appears that the "non-scientific" thread that is providing peer-review of the research methodology has been deleted.


While the link you have in "this thread" above doesn't go anywhere, I do now see the "ethical discussions" thread. Was that temporarily deleted?

More to the point, why do you not consider peer-review of the methodology used in this paper appropriate for this "open" peer-review process and forum?
#14
(2016-May-13, 10:31:55)Emil Wrote:
(2016-May-12, 22:02:03)MichaelZimmer Wrote: I also posted a set of questions here about the research ethics variables of the project (also emailed to the author), but that post has been removed without any communication to me. Addressing research ethics is central to peer-review.

[Edited to add] I also notice that the dataset is now password-protected.


You sent the same questions by email, to which you received an answer. However, because you were impatient and decided to duplicate the questions here as well, I removed the forum version.

I recommend some patience in receiving answers and not duplicating content.


You responded to my email, but you did not answer my questions.
#15
The lead author and some members of the editorial board are presently attending an international conference in London. Discussion of the matter will have to resume next week. I requested that any accessible copies of the dataset be removed from OSF.
#16
(2016-May-13, 15:33:57)MichaelZimmer Wrote: More to the point, why do you not consider peer-review of the methodology used in this paper appropriate for this "open" peer-review process and forum?


See here.
#17
FYI: https://www.wired.com/2016/05/okcupid-st...a-science/
#18
I note that I have been banned from posting on this site for a while and am again allowed to post.

(2016-May-13, 10:36:15)pdehaye Wrote:
Quote:Personal data means identifiable data. So the downloading and organizing of the data with the identifiers may have indeed been a violation. (Point 2.) Can someone clarify the matter? On the other hand, a reprocessing of the dataset in which the personal IDs were removed and the data was resaved would seem not to be.

This reasoning is based on:
  • recital 26 of the EU Data Protection Directive (emphasis mine):
    Quote: (26) Whereas the principles of protection must apply to any information concerning an identified or identifiable person; whereas, to determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller or by any other person to identify the said person; whereas the principles of protection shall not apply to data rendered anonymous in such a way that the data subject is no longer identifiable; whereas codes of conduct within the meaning of Article 27 may be a useful instrument for providing guidance as to the ways in which data may be rendered anonymous and retained in a form in which identification of the data subject is no longer possible;

  • the recent EU Court of Justice Rynes case, which concerns the use of CCTV to film a street (without consent of individuals walking in the street), for the purpose of protection of personal property (i.e. a stronger, more justifiable, motive than testing scientific hypotheses without ethical consent). My interpretation of the authors' actions in light of this judgement might however need to be tested in court.

I have limited myself in this particular post to the points 1 and 2 that you have listed, but am even more convinced now of the rest of the six points I had made in my previous post.


My reasoning above relied on Danish Data Protection Law faithfully transcribing the EU directive into Danish law. It turns out that the Danish version is even more restrictive. The collection of data was clearly illegal at the outset.

Of relevance are:
  • The Danish Data Protection Act;
  • The guidance of the Danish Data Protection Authority;
  • The standard terms for research projects, which apply for research projects successfully registered with the data protection authority, if no modification to those terms is granted.

Judging from the register of private research projects (click "Fortegnelse"=="list", then search for the authors' names in the form you land on, under "Dataansvarlig"=="data controllers"), the authors have NOT registered their study with the Data Protection Authority, which should have occurred prior to the collection of data. Quoting from the guidance above: "The Act on Processing of Personal Data states that it is punishable by law to refrain from notifying a project to the Danish Data Protection Agency, and that it is punishable by law to violate the conditions stipulated by the Danish Data Protection Agency. The maximum penalty is a fine or imprisonment for up to four months."

Even if they had notified the data protection authority, the authors would most likely have needed to ask for a change in the standard terms, as those explicitly forbid disclosure and transfer to third countries. Both conditions were violated by the original release of data, including usernames, and would still be violated after removal of usernames. While the first disclosure could be argued to be a naive mistake, it becomes harder to argue that in light of the responses the initial release has received, without consulting directly with the Danish Data Protection Authority (they are unfortunately closed until Tuesday).

Technically, the Danish Data Protection Act is also relevant, in that it could be argued that an approval could be obtained outside of the purview of the Danish Data Protection Authority. This is true, but is however reserved for truly exceptional circumstances. That possibility is, I think, merely envisioned when the government wants to use its executive power to circumvent existing laws transcribing European laws. In those cases, special notification is required to other EU Member States, for instance.

I would argue in consequence that:
  • The original data collection was illegal in Danish law;
  • The original data release was illegal in Danish law;
  • A new data release of non-aggregated data would be illegal in Danish law;
  • The original data collection was illegal in multiple other laws, because the Danish Data Protection Act includes this: "Any rules on the processing of personal data in other legislation which give the data subject a better legal protection shall take precedence over the rules laid down in this Act.". Technically this exposes the authors to liability in just about any jurisdiction with strong privacy laws, with the full recognition of Denmark, and extreme obligations of reciprocity towards other EU Member States.

I repeat my call for a thorough investigation by the board of this journal of the situation, the resignation of its Editor-in-Chief (and, should he refuse, the board of the journal).

I add a call for a thorough investigation of other papers by the same authors in this journal, as I spotted a few which failed according to the same legal standards.

Finally, I want to observe that the author has actively used his role as Editor-in-Chief/Forum Administrator/Lead Author to:
  • inject his opinions into the review of this paper, by rejecting criticism of the ethical aspects of his work as "non-scientific" (an argument that has also been criticised by a world expert on that exact topic)
  • ban me from the forums for posting in this thread and engaging with the rest of the Editorial Board (even if only temporarily).

Even if my call for permanent resignation by the Editor-in-Chief is not heard, it seems to me that he should recuse himself from his role as Forum Administrator while discussion of his paper is ongoing, if he can't take criticism and has to use censorship.
#19
(2016-May-15, 16:07:26)pdehaye Wrote: I note that I have been banned from posting on this site for a while and am again allowed to post.

(2016-May-13, 10:36:15)pdehaye Wrote:
Quote:Personal data means identifiable data. So the downloading and organizing of the data with the identifiers may have indeed been a violation. (Point 2.) Can someone clarify the matter? On the other hand, a reprocessing of the dataset in which the personal IDs were removed and the data was resaved would seem not to be.

This reasoning is based on:
  • recital 26 of the EU Data Protection Directive (emphasis mine):
    Quote: (26) Whereas the principles of protection must apply to any information concerning an identified or identifiable person; whereas, to determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller or by any other person to identify the said person; whereas the principles of protection shall not apply to data rendered anonymous in such a way that the data subject is no longer identifiable; whereas codes of conduct within the meaning of Article 27 may be a useful instrument for providing guidance as to the ways in which data may be rendered anonymous and retained in a form in which identification of the data subject is no longer possible;

  • the recent EU Court of Justice Rynes case, which concerns the use of CCTV to film a street (without consent of individuals walking in the street), for the purpose of protection of personal property (i.e. a stronger, more justifiable, motive than testing scientific hypotheses without ethical consent). My interpretation of the authors' actions in light of this judgement might however need to be tested in court.

I have limited myself in this particular post to the points 1 and 2 that you have listed, but am even more convinced now of the rest of the six points I had made in my previous post.


My reasoning above relied on Danish Data Protection Law faithfully transcribing the EU directive into Danish law. It turns out that the Danish version is even more restrictive. The collection of data was clearly illegal at the outset.

Of relevance are:
  • The Danish Data Protection Act;
  • The guidance of the Danish Data Protection Authority;
  • The standard terms for research projects, which apply for research projects successfully registered with the data protection authority, if no modification to those terms is granted.

Judging from the register of private research projects (click "Fortegnelse"=="list", then search for the authors' names in the form you land on, under "Dataansvarlig"=="data controllers"), the authors have NOT registered their study with the Data Protection Authority, which should have occurred prior to the collection of data. Quoting from the guidance above: "The Act on Processing of Personal Data states that it is punishable by law to refrain from notifying a project to the Danish Data Protection Agency, and that it is punishable by law to violate the conditions stipulated by the Danish Data Protection Agency. The maximum penalty is a fine or imprisonment for up to four months."

Even if they had notified the data protection authority, the authors would most likely have needed to ask for a change in the standard terms, as those explicitly forbid disclosure and transfer to third countries. Both conditions were violated by the original release of data, including usernames, and would still be violated after removal of usernames. While the first disclosure could be argued to be a naive mistake, it becomes harder to argue that in light of the responses the initial release has received, without consulting directly with the Danish Data Protection Authority (they are unfortunately closed until Tuesday).

Technically, the Danish Data Protection Act is also relevant, in that it could be argued that an approval could be obtained outside of the purview of the Danish Data Protection Authority. This is true, but is however reserved for truly exceptional circumstances. That possibility is, I think, merely envisioned when the government wants to use its executive power to circumvent existing laws transcribing European laws. In those cases, special notification is required to other EU Member States, for instance.

I would argue in consequence that:
  • The original data collection was illegal in Danish law;
  • The original data release was illegal in Danish law;
  • A new data release of non-aggregated data would be illegal in Danish law;
  • The original data collection was illegal in multiple other laws, because the Danish Data Protection Act includes this: "Any rules on the processing of personal data in other legislation which give the data subject a better legal protection shall take precedence over the rules laid down in this Act.". Technically this exposes the authors to liability in just about any jurisdiction with strong privacy laws, with the full recognition of Denmark, and extreme obligations of reciprocity towards other EU Member States.

I repeat my call for a thorough investigation by the board of this journal of the situation, the resignation of its Editor-in-Chief (and, should he refuse, the board of the journal).

I add a call for a thorough investigation of other papers by the same authors in this journal, as I spotted a few which failed according to the same legal standards.

Finally, I want to observe that the author has actively used his role as Editor-in-Chief/Forum Administrator/Lead Author to:
  • inject his opinions into the review of this paper, by rejecting criticism of the ethical aspects of his work as "non-scientific" (an argument that has also been criticised by a world expert on that exact topic)
  • ban me from the forums for posting in this thread and engaging with the rest of the Editorial Board (even if only temporarily).

Even if my call for permanent resignation by the Editor-in-Chief is not heard, it seems to me that he should recuse himself from his role as Forum Administrator while discussion of his paper is ongoing, if he can't take criticism and has to use censorship.



I invite users to remember that the privilege of posting direct, unfiltered comments is very unusual in academic journals. Comments are usually screened by an editor who has full executive power and can decide whether or not to publish any complaints. My complaints to editors regarding plagiarism or lack of transparency have often been ignored and received no answer, but because those were closed journals, nobody of course knows this. For someone who cares so much about ethics, some gratitude for being allowed to use this platform is warranted.
As co-founder of this journal and editor of a sub-journal (OBG), I can act as moderator of this thread and the OKC paper thread if both parties (Emil and pdehaye) agree. I am not a lawyer or a legal expert on ethics, so I cannot provide guidance, only act as a neutral moderator.
#20
Just information...

I was looking at Facebook a few minutes ago, when I got a notice that there was a new video from PC Magazine. I occasionally watch them for news about computers and electronics. Shortly into the video, they began to discuss this paper and called out Emil, with a number of nasty and false comments. Their description of the OKCupid news item (it has been popping up in various places) was inaccurate. Obviously, they have not read the paper.

Meanwhile, I have read the paper twice and wanted to post some comments, but I am unsure of the mechanics others are using. I think I can replicate the format though.
 