[Dev Chat Followup] Input on a new license for EteRNA content

This intro is primarily paraphrased from Rhiju’s comments in the developer chat on July 7, 2015. A condensed version of the points raised, with the names of those who raised them, follows. Read the original conversation in the dev chat log starting here: http://eternawiki.org/wiki/index.php5/2015.07.31_Dev_Chat#licensechat

EteRNA content is currently licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This will likely not be strong enough to prevent biotech companies from taking any designs we make through EteRNA-Medicine. We would like any clinically used products to benefit EteRNA, and to cite EteRNA players as inventors. One option is a different license for the EteRNA-Medicine labs; another is to start working from a more thought-out license, such as the one that Foldit has.

Please take a look at this and post any comments you may have.

Items raised in the dev chat were:

  • [Machinelves] “By definition anybody can take whatever they want [under the current license]”, [LFP6] so long as it is for a noncommercial purpose
  • [Jennifer Pearl] “My friends have been mentioning that with me developing tools”
  • [jandersonlee] “You might lose some players depending on the license”
  • [LFP6/jandersonlee/rhiju] Old content would remain under the existing CC license
  • [rhiju] “We are also beginning to talk to legal team and IP experts. It would be pretty astonishing if eterna could patent molecules with, say, 500 inventors listed on the patent. But that’s probably what would have to happen, since we will have a strong record of who proposed sequences, core ideas, and mods to get to these functional molecules. There are going to be crazy issues with getting that patent, and we are excited about figuring out how to make this happen!!”
  • [jandersonlee] “Might need to have a separate site/server to easily separate the content”
  • [Machinelves] “It would also be good to protect the data since it can be potentially exploited. At the same time, we don’t want to hand over keys or partnerships to anyone who will themselves exploit our inventions. So it will be critical to choose carefully who to do business with” [Rhiju] “We are trying to line up partners ahead of time who we trust – it’s going to be an interesting process.”
  • [Machinelves] “The whole goal of this research is to get medicines to market so at some point we need to make this leap”
2 Likes

Didn’t even know we were on this path so soon, excellent. One question: Why 500 inventors?

1 Like

I assume the thought is that there are more contributors than just the player that submitted the solution, via mods of mods, analysis, etc.

1 Like

(quick note from devs) – we are following this thread eagerly. thanks LFP6 for setting it up. players: everything in the agreement/license is up for discussion. if there are parts that creep you out let us know. if there are parts that you think might allow an external company to poach eterna players’ inventions, let us know.

and: i am actually talking to foldit devs in person now, and again in 8 hours – if you have questions about their license, i can ask.

1 Like

Something that definitely needs to be considered is the various types of content in EteRNA. Player puzzles, scripts, solutions, comments, profiles, etc. are all different and may require different protection. For example, should scripts be licensed under an OSI software license like the GNU GPL or MIT? Do we want them to be that open? Should script makers have a choice?

2 Likes

We have talked about having a general license that is superseded, on occasion, by a more specific license.  For example, if we have a medical partner who wants to preserve certain rights or the medical application requires a slightly different scope.

We could do something similar to address different kinds of content.  I just don’t want people to be drowning in licenses.  People don’t read EULAs as it is.  Perhaps a good approach would be to use general language like “scientific discoveries” (which is the language the Foldit license uses) in a general license, along with sparing use of more targeted language as needed in a more specific license.

One thing that we should focus on regardless of how we go about it is to make the language accessible to all players.  Creative commons does this, by having a “plain-language” license that gives users a (hopefully) accurate picture of how the license effectively works.  Foldit also does this.

Giving script makers a choice about how to license their scripts would add a layer of complexity to the process, but might be worth it to give them more autonomy.  It might be helpful to see if the current script makers have diverse feelings about how others should use their scripts.  If not, it might make sense to go with the consensus approach.  If we do want to give script makers explicit choices, I would encourage the use of a default license that applies unless a player switches to a different one (which would then become the default for that player).
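To make the default-license idea concrete, here is a rough sketch of how it might behave. This is only an illustration: the class name, the method names, and the license identifiers are all hypothetical and not part of any existing EteRNA code, and the site-wide default shown is just the current CC license used as a placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical identifier; the real site-wide default would be whatever
# license the community settles on.
SITE_DEFAULT_LICENSE = "CC-BY-NC-SA-4.0"


@dataclass
class ScriptLicensePolicy:
    """Tracks which license applies to a player's newly created scripts."""

    site_default: str = SITE_DEFAULT_LICENSE
    # Per-player overrides, keyed by player name (hypothetical storage).
    player_defaults: Dict[str, str] = field(default_factory=dict)

    def choose(self, player: str, license_id: str) -> None:
        """Record a player's choice; it becomes that player's new default."""
        self.player_defaults[player] = license_id

    def license_for_new_script(self, player: str) -> str:
        """Return the player's own default, or the site default if none set."""
        return self.player_defaults.get(player, self.site_default)
```

A player who never makes a choice stays on the site default, while a player who switches once keeps that choice for later scripts without being asked again.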

FYI - OSI allows people to “sell” copies of the software? http://opensource.org/licenses/MIT

2 Likes

AGPL doesn’t, and I think that’s still OSI.

I think that simplicity should be important, but the different ‘categories’ of content should be identified and appropriately protected. For example, CC actually is not a valid software license.

I’d think starting with a general license would be good, and adding specific clauses where needed. I didn’t know if a software license for software would be wanted, as opposed to just including it in the custom licensing. There should be a human-readable and legal version of the licensing terms as Foldit and CC have done, to both provide understandability and prevent abuse.

2 Likes

hello! I am just now able to take a look at this thread, sorry for the delay, and to have missed the opportunity to ask foldit devs about the license. Hopefully there will be another opportunity! 

Quick note on anything that would “allow an external company to poach eterna players’ inventions”.

I had forgotten that we specifically set up the creative commons to be protective and not strictly open free for all. So we actually don’t have to worry so much right now about other people taking our work and exploiting it as I had previously been concerned.

People currently cannot exploit our research monetarily, since this particular creative commons license is non-commercial.

  • NonCommercial  — You may not use the material for commercial purposes.
Additionally (I am not a lawyer, so I don’t know for sure), I would think we are also protected from people patenting and squirrelling away our research by virtue of the requirement that it is Share Alike:
  • ShareAlike  — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.

So under the current license, the only poaching that would be taking place would be noncommercial and bound to continue to be shared under the same terms. Therefore it is currently possible for others to use our research, but not to make money from it, and not to lock it down or bury it as a competitive strategy.

That being said, because our data is openly available, it is possible for people to collect it and, depending on laws on copyright, fair use, derivative works, modification, inspiration, etc., potentially patent slightly divergent flavors of our work.

And I have heard of, though cannot cite at the moment, cases where companies will even have the audacity to take research, modify it enough to make it patentable, and then come back and sue the originator of the research for it being too similar to their newly patented (and yet stolen) data. Hopefully that is urban legend; if I can dig up a real world example I will cite it.

So one thing to possibly address in the next license is specifically addressing and defining what constitutes our intellectual property.

There is an interesting question about the right to patent genes for example, which I think applies to all aspects of the natural world. I don’t personally care for patenting particular genes, chemicals, strands of RNA, etc. because I don’t think we can own the natural world, geometry, or physics. However, I do hear that in order to move forward with production, we need to have a license that is compatible with production relationships. I suppose the next step would be to clarify exactly what kind of relationships we are looking to form, what kind of profit and cost models are to be expected, and get a real outline of the intended production pipeline. That will inform us as to what kind of licensure is needed to protect our IP, and also manage the costs of production.

I find it very easy to have a knee jerk reaction against any kind of for profit model, but the whole point of this research is to get medicine into people’s hands. So I am happy to listen to the reality on the table and see if we can figure out something that keeps things moving while also protecting the research that we all hold so dear.

Okay those are my initial thoughts on our current license… now on to examining the foldit license. :slight_smile:

1 Like

Defining what content we have, the different aspects of said content we’re looking to license/protect/share, what we want people to be able to do with it, and what we want to restrict people from doing with it is definitely the important thing here.

1 Like

this is a really good point, and I wish I had read the whole thread before commenting above!

Right on LFP6, you are exactly right that there are many kinds of IP on our site, and each one has different nuances. I was thinking those nuances would be defined within one contract. It’s an interesting idea to license out each piece under different contracts.

One challenge there is clearly defining in some meta area / contract which pieces are covered by which licenses. I think this is not insurmountable, but any time a whole site is not covered by one unifying contract, it does introduce ambiguity for the legal standing of any piece of IP not explicitly labelled everywhere it is used with its pertinent license. 

However, it’s an interesting enough approach that I think it is worth considering. Maybe we could have a meta license that explicitly defines each category of IP and assigns an appropriate sub license to that category? The solidity of this approach is beyond my pay grade, but perhaps a little research or asking eterna’s legal team would clarify whether this is at all common practice.
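As a rough illustration of the meta license idea, the mapping from content category to sub license could be written down as a simple lookup that refuses to guess for anything unlabelled. The category names and license identifiers here are hypothetical placeholders, not a proposal for the actual assignments.

```python
# Hypothetical category-to-license table; the entries are placeholders only.
META_LICENSE = {
    "solution": "CC-BY-NC-SA-4.0",
    "puzzle": "CC-BY-NC-SA-4.0",
    "comment": "CC-BY-NC-SA-4.0",
    "script": "SOFTWARE-LICENSE-TBD",
}


def license_for(category: str) -> str:
    """Return the sub license assigned to a content category.

    Raises ValueError for a category the meta license does not list, so
    unlabelled content is flagged instead of silently getting a default.
    """
    try:
        return META_LICENSE[category]
    except KeyError:
        raise ValueError(
            f"no license assigned to content category {category!r}"
        ) from None
```

Failing loudly on an unknown category is one way to surface the ambiguity mentioned above, where a piece of IP is not explicitly tied to its pertinent license.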

I don’t think that having one license is necessarily a bad idea either, just that it would need to be written so that all the different content has correct coverage: everything that needs to be dealt with is, and each term either correctly applies to a given kind of content or is noted as not applicable (which is the reason why CC shouldn’t be used for software).

Hopefully that made sense? :slight_smile:

1 Like

I wholeheartedly appreciate making licenses readable. However, while it is true that the average person does not read a license, and will be unduly burdened to read through multiple licenses, the fact remains that the people who will seek to exploit our research are the ones we are writing the contract for, and they absolutely will read to the letter every word of every contract we use, if they want to exploit our data to the fullest extent they are legally allowed. So while it’s annoying for licenses to be wordy and specific, they are not written for the average joe who will never do anything with our data. They are written for the big pharma company with more lawyers on staff than we have active researchers and scientists put together, and they absolutely have the bandwidth to read and exploit our contracts.

I think LFP6’s suggestion for different licenses for different kinds of IP would actually preserve some readability for less critical path IP such as scripting, comments, puzzles, etc… though now that I am thinking about it, I wonder if we are too quick to consider those things of any less valuation than the raw data coming out of the lab. Because a particular script for processing the data could be just as pertinent a ‘discovery’, same with forum comments, or random puzzle designs. Working on fundamental research means we still don’t know what is and is not scientifically relevant. 

But ( and LFP6 please correct me if I am wrong ) I think his wish in general was to offer the general user base an alternative to the heavyweight license meant to protect core research, by allowing some kinds of user-generated IP to have lighter weight licenses. Whether we all agree on this remains to be seen, but I think it is worth considering.

As LFP6 says, regardless of the end result of the license, we can make a human readable version, even with complexity.

Now what Ben suggests about having an occasional specific license that supersedes the general license is interesting. One concern here is understanding how this is not a bait & switch… are we to have an open license until we make something worth going into production over, and then suddenly it gets locked down? Or is this to simply add additional protections and open up ability for profit models to be compatible with production and distribution costs?

While tedious, we will have to get into specifics of these production relationships to understand exactly what kind of licensure we need to both enable and protect next steps in the process.

Just to chime in here. My understanding is that you’re correct: companies aren’t interested in content that we’ve already created.  Take individual sequences, for example.  Even if they weren’t under a Creative Commons license, they are out there in the public, which means any company is going to have a hard time patenting them (and therefore making money off of them).  As far as individual sequences are concerned, the only way that a company would probably want to develop them would be if we entered into an agreement that let them do so.  This would probably mean entering an individual agreement with a specific company for a specific project, which would involve an individual license that is only tied to that project.

As far as software is concerned, however, Eterna can make money by giving companies licenses to use the software we develop.  Of course, we could give academic and non-profit institutions a license to use it for free (I think this model is pretty common).

Other content: comments, profiles, forum posts, etc. are automatically given copyright protection (for the literal phrasing), at least under US law (as far as I know).  If you guys want to read more about this stuff, Teresa Scassa is an IP attorney in Canada who has taken a big interest in citizen science.  She’s got several articles out on the topic and is writing more.

3 Likes

I wanted to thank everyone who has contributed to this thread!

I want to sum up a tentative plan moving forward – and we welcome further responses.

(1) We are committed to keeping eterna a fully open project where player sequences, ideas, and scripts are not sequestered but available for the public good, especially the advance of medical research. We think the Creative Commons license (Non-Commercial Share-Alike) remains a good way to move forward on publicly available player contributions.
We have discussed with academic and corporate partners what Eterna’s very public and open nature means for them. We have now confirmed with several potential partners they are unlikely to ‘steal’ this content for commercial use if it is in the public domain. Instead, any partners who fund eterna lab puzzles would create design challenges involving sequences that would not be commercialized. In terms of what they would gain, the partners would be more interested in the automated prediction/design algorithms generated during these design challenges to create their own (non-public) products – see next.

(2) To sustain Eterna in the longer term, we are taking steps now to ensure that algorithms like Eternabot can generate revenue as demand for RNA design increases. Eternabot code is in a private repository and does not include code directly written by players, except devs who have agreed to ‘sign over’ copyright to Eterna, currently administered by Stanford. We will make that code publicly available for non-commercial use, especially as we write papers describing improvements. But we will charge for-profit entities a yearly license for use of Eternabot and related design algorithms.  There is strong precedent for this model in other software communities, such as the Rosetta software that underlies Foldit.

(3) In addition to player in-game contributions and algorithms, the other major set of eterna ‘creations’ are peer-reviewed papers. These will continue to be published, following standard licensing of publications. However, we are additionally committed to making all eterna papers open-access, even if this requires additional journal fees.

Thoughts from the community?

1 Like

I’m still concerned about using CC to cover scripts. They specifically state that it is NOT a valid software license. I’d suggest taking a look at AGPL.

1 Like

Yes that makes sense, I think you are saying to be sure that we are specific as to what license applies to what piece of IP, particularly such that it is appropriate to the particular kind of IP. And you raise a really good point about CC not being for software itself, per CC’s own FAQ:

"Can I apply a Creative Commons license to software?

We recommend against using Creative Commons licenses for software. Instead, we strongly encourage you to use one of the very good software licenses which are already available. We recommend considering licenses made available by the Free Software Foundation or listed as “open source” by the Open Source Initiative.

Unlike software-specific licenses, CC licenses do not contain specific terms about the distribution of source code, which is often important to ensuring the free reuse and modifiability of software. Many software licenses also address patent rights, which are important to software but may not be applicable to other copyrightable works. Additionally, our licenses are currently not compatible with the major software licenses, so it would be difficult to integrate CC-licensed work with other free software. Existing software licenses were designed specifically for use with software and offer a similar set of rights to the Creative Commons licenses.

Our licenses are currently not compatible with the GPL, though the CC0 Public Domain Dedication is GPL-compatible and acceptable for software. For details, see the relevant CC0 FAQ entry. We are looking into compatibility of BY-SA with GPL in the future; see the license compatibility page for more information.

While we recommend against using a CC license on software itself, CC licenses may be used for software documentation, as well as for separate artistic elements such as game art or music."

https://wiki.creativecommons.org/wiki/Frequently_Asked_Questions#Can_I_apply_a_Creative_Commons_lice…

Bold and underlined my own addition. 

And also I wonder about the subtleties of having a general license that says non-commercial share alike, and then expecting to be able to apply a separate commercial license to anything produced under the originally non-commercial license umbrella. Would the first license not still stand? 

I’ll need to read Rhiju’s post below in more detail to comment further, but this is something we want to be clear on for sure, so that we don’t accidentally lock ourselves out of being able to go into production on our own research.

As for the point CC makes about lacking patent-specific clauses, here is an excerpt from a EULA on just that, which may be useful to consider:

SECTION x.x - You agree to a prohibition on Patent Action, and that You may not file a lawsuit in ANY court alleging that any of the following infringes any patent claims that are essential to own, use, copy, modify, publish, review, or otherwise engage with [this software]:

x.x.x - [This software] as a whole

x.x.x - Any part, component, element, or individual aspect of [this software]

x.x.x - Any of [this company’s] Intellectual Property

Since we are also publishing papers on our research, those act as prior art I think, though my understanding is rudimentary. However as you say I believe it is difficult to patent anything in public domain, particularly if there is official, dated documentation of its release ( such as being published ).

So citing algorithms, production / analysis methodologies, etc. in publication may offer some protection, without the pitfalls of giving away the full blueprints as is required in actual patent filing.

Thank you for the reference to Teresa Scassa’s research. I found one of the papers to which I think you are referring?
https://www.wilsoncenter.org/sites/default/files/Typology_of_Citizen_Science_IP_Rights_Scassa.pdf

She raises relevant points such as:
"Citizen science project coordinators should be concerned about the management of intellectual property rights because of their potential to lead to unanticipated consequences that may hinder the dissemination or use of the research output.

For example, when citizen scientists are invited to contribute content in which they have copyrights, such as photographs or written accounts, it would be difficult for a researcher to disseminate the datasets containing these contents or to reproduce the copyright-protected contributions without authorization."

I’m not sure I agree with this next statement as a whole, since contributing raw data is still a procurement and curation that would not have occurred in the same way by another individual’s hand, but it is also relevant since she thinks this is the case, and I do agree with the second half of the conclusion that detailed prose qualifies as original expression. Perhaps there is some precedent for defining procurement of raw data as not deserving of IP protection? I assume with all the emerging citizen science projects, if there is not already such a case, there will be one eventually. For example, where do we draw the line between what is data and what is design? Is a piece of DNA not a piece of data? Yet it is patented left and right by companies who invest significant effort and research into discovering those pieces of data. How is this qualitatively different from a citizen who has invested significant time and effort in finding just the right RNA sequence to solve a puzzle? It looks like, from the table on pages 10-11, Teresa defines activities at Eterna to be under the category of ‘problem-solving’ and not strictly ‘data gathering’.

“A contributor who provides only raw data to a project has no intellectual property rights in that data; by contrast, observations expressed in detailed prose or in a photograph may qualify as original expressions.”

“In terms of patents in the citizen science context, a key issue might be whether the contribution of any individual participant amounts to inventive activity such that they should be included as a co-inventor in a research project that leads to a patentable invention.”

So with regard to her particular concerns about feasibility of applying licenses that allow redistribution ( commercially or non commercially ) of citizen-science gathered research, I think the important thing is to clearly contextualize and disclose exactly what will and will not be considered distributable IP, and to get clear permission from users on rights for distribution. We have discussed on a few occasions having a EULA popup that new users ( and on introduction, existing users ) would agree to, to be sure among other things of this, and for example general disclaimers for users who are under 18, etc.

It would be good to include in such a EULA clarity and outline of what licenses we use, and which kinds of IP are under which license, and whether anything is exclusively licensed to the user or not. Many websites that accept user generated content of any kind require a carte blanche access to rights for that content, if not to directly exploit it, simply to avoid liability for exactly the issue you mention where US citizens are automatically granted copyright simply by the act of publishing their IP. ( if I understand correctly - on this and everything else I have said please anyone correct me where I am misunderstanding ). 

I see in the above paper, a photo of a tree from another citizen science project is included, with a credit to the citizen who took the photo, but I’m not clear on whether their actual permission was obtained. If not, it would be ironic, seeing as citing the author does not necessarily permit use, and certainly not publication. I guess I’ll assume explicit permission was obtained? :smiley:

Anyone wishing to see a nice comparison between Patent, Copyright, Trade Secrets, and Database Rights is advised to check out her Table I on page 7 of the above linked paper.

All in all an interesting paper and very relevant to what we are doing, especially during the exciting transition from purely foundational research into actual production pipelines. 

Also, yes, as you mention below, the OSI-approved MIT license permits sale of the software.

I’ll address applying an individual license for companies wishing to pursue application of specific aspects of our research in Rhiju’s comment below.

very good point. scripts need a distinct license; I missed that above.

@LFP6, AGPL looks good. My feeling is that devs or anyone else should not be able to make money off players’ eternaScripts. so AGPL would be quite appropriate. It would also allow players to reuse other players’ AGPL-covered scripts in their AGPL-covered scripts, if I understand correctly – please correct me if I am wrong.

The last question is: would AGPL on eternaScripts prevent adoption of eternaScript ideas into software like eternaBot that we could license for commercial users to help sustain eterna? If devs adapt ideas tested in eternaScripts into eternaBot, the ideas would minimally have to be rewritten (we use different languages) and reoptimized. Does that seem acceptable to players? If the ideas encoded in players’ eternaScripts are put into ‘competing’ (outside-Eterna) software packages, is that OK with players? My sense is that would be OK – sharing of ideas is what makes scientific research work. If some players want to hide their script ideas, they would see the AGPL license and could decide to not use eternaScripts but use separate bots (like Nando did with ViennaUTC).

LFP6 and Elves, would appreciate your thoughts.  I think we’re homing in on a reasonable solution, and can start drafting EULA & plain-English versions.

1 Like

Just read the Scassa paper. Funny that Eterna got highlighted in the ‘red’ part of Table 2! See thread below proposing different forms of IP – I think we have delineated the types and some working models for ownership & licenses – but please comment below if we are missing some type of IP and need to discuss.

1 Like

If any user content is licensed as noncommercial, whether under CC, AGPL, or any other license, no one can use it commercially. That was actually one concern I forgot to mention (doh). In order for it to be used commercially, the licensor/rights holder (in this case, the content creator, NOT EteRNA or Stanford) would need to give permission.

I personally do think that it’s reasonable to make such profit to sustain EteRNA, and would love to see that, but the license needs to reflect that you have permission to do so.

1 Like