Licenses for Data

Overview

Exercises: 25 min
Questions
  • What is a license?

  • What licenses are there?

  • How do I choose a license?

Objectives
  • Understand the main issues in licensing.

  • Know what license suits your data.

What is a license?

Roughly, in lay terms, when you put in work to create or curate something, you acquire rights over that thing. Your rights include things like the right to be identified as the author. When other people use your work, they may do things that infringe upon your rights. To help them understand what they can and can’t do, and to give them a legal basis on which to interact with your work, you can apply a waiver or a license to your work.

If you don’t include a waiver or a license, people will not know how they can legally and appropriately use your work. Include a license.

How do I choose a license?

What common data licenses are there to choose?

There are several families of licenses which offer suites of similar licenses that place different restrictions on the users. The families covered here are Creative Commons (CC-x), Open Data Commons (ODC-x), and (UK) Government License (xGL). Each subheading below addresses a different use case.

No restrictions

To make your data maximally reusable, at the cost of losing the (legal) right to attribution and any control over how it is used, apply a public domain license. Public domain licenses can consist of a waiver, giving up all your rights, and a permissive license in case the waiver is not acceptable. These licenses avoid the attribution stacking problem of attribution licenses, but can only be used where

Community norms

Just because you’re legally allowed to do something doesn’t mean you should. It’s within the license terms of a public domain release to include a work verbatim without attribution, but many communities have strong norms against plagiarism.

If the norms of your community cover your concerns, and you’re not especially worried about how your work is used outside your community, you can use a public domain license even if you would like attribution (you just don’t require it).

Attribution

You can require that people attribute your work to you when they use it. This is a common requirement, and modern licenses usually make it clear what constitutes attribution (e.g. a URL for a page with attribution information). A disadvantage of requiring attribution is that it can lead to attribution stacking, where attribution lists become unwieldy because each derivative of a work must attribute the creators of each previous incarnation.
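To make attribution stacking concrete, here is a minimal Python sketch. It assumes a hypothetical `Dataset` record that lists its own creators and the datasets it was derived from; the class, the function name, and the example names are illustrative only and do not come from any license text or tool.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical record: a dataset, its creators, and what it derives from."""
    name: str
    creators: list[str]
    sources: list["Dataset"] = field(default_factory=list)


def required_attributions(dataset: Dataset) -> list[str]:
    """Collect every creator to credit: the dataset's own and all its ancestors'."""
    credits = []
    seen = set()
    stack = [dataset]
    while stack:
        current = stack.pop()
        for creator in current.creators:
            if creator not in seen:  # credit each person only once
                seen.add(creator)
                credits.append(creator)
        stack.extend(current.sources)  # walk back through earlier incarnations
    return credits


# Illustrative chain: a merged dataset built from two earlier works.
survey = Dataset("survey", ["Ada"])
cleaned = Dataset("cleaned", ["Ben"], [survey])
census = Dataset("census", ["Dee", "Eli"])
merged = Dataset("merged", ["Cy"], [cleaned, census])

print(len(required_attributions(merged)))  # prints 5
```

Even in this small chain, the person who only created `merged` has to credit five people; with each further derivative the list can only grow.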

Share-alike (copyleft)

You can require that people share derivative works as openly as the original. This can be awkward in practice because many copyleft licenses are incompatible with one another, meaning that it can be very difficult to find a way to release resources that include multiple components with share-alike licenses.

No commercial use

You can require that people release derivatives of your work free of charge. It is quite common for work with a non-commercial license to include an alternative license for those who wish to use the work commercially (acquiring the work under the more permissive license usually costs money). The definition of commercial can vary: in some cases textbooks or academic articles may count as commercial (even if the person producing the derivative work doesn’t get the money).
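The use cases above can be combined, and the Creative Commons family names its licenses after the combination. The sketch below maps the three choices to an identifier. It assumes version 4.0 of the licenses and SPDX-style identifiers, and the function name `suggest_cc_license` is hypothetical. It is an illustration of the naming scheme, not legal advice or an official chooser.

```python
def suggest_cc_license(attribution: bool, share_alike: bool, non_commercial: bool) -> str:
    """Return an SPDX-style Creative Commons identifier for the chosen restrictions.

    Assumption: version 4.0 licenses, where every license except the
    CC0 public domain dedication requires attribution.
    """
    if not attribution:
        if share_alike or non_commercial:
            # No Creative Commons license imposes these terms without attribution.
            raise ValueError("share-alike and non-commercial terms require attribution")
        return "CC0-1.0"  # public domain dedication: no restrictions

    parts = ["CC", "BY"]
    if non_commercial:
        parts.append("NC")
    if share_alike:
        parts.append("SA")
    return "-".join(parts) + "-4.0"


print(suggest_cc_license(attribution=False, share_alike=False, non_commercial=False))
print(suggest_cc_license(attribution=True, share_alike=True, non_commercial=False))
```

The other families mentioned earlier use their own names for similar combinations, so a mapping like this would need to be written separately for each family.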

Discussion 15 min

Which licensing approach seems most relevant for your data?

Does that approach suit all your data from all your projects?


Key Points

  • Licenses let you control who can do what with your work.

  • Licenses are not a substitute for strong community norms.

  • Using a well-known license makes it easier for users and automated systems to understand and respect your terms.