• Share this article:

License Certification (Mostly) Just Happens

Wednesday, July 19, 2017 - 13:17 by Wayne Beaton

The Eclipse Intellectual Property Policy defines two types of intellectual property (IP) due diligence for third party content. The so-called Type A Due Diligence is concerned exclusively with license certification; and Type B Due Diligence is concerned with license certification, provenance checking and a deep dive scan of the content for various anomalies.

Regarding the analysis of Type A third party content, the IP Policy makes this statement:

It will be the responsibility of the Eclipse Project to run and analyze the results of a scan tool provided by the EMO, using parameters provided by the EMO, to obtain the terms and conditions under which such Content would be distributed by the Eclipse Foundation, and ensure that such terms are consistent with the Project Licenses. The Eclipse Project will certify that the terms and conditions of its Non-Eclipse Content conform to the then-current licensing guidelines as provided by the EMO. No further approvals will be required from the EMO prior to the Eclipse Project placing the Non-Eclipse Content into the Repository.

In short, it is the project team’s responsibility to run a tool, analyse the results, and certify the content as being consistent with the various conditions. It took us a while to identify the scan tool.

The Eclipse IP Team has been using open source Fossology for a while. While Fossology is a very comprehensive tool that’s great for IP Analysts, our experience suggests that the learning curve is too steep for it to be used by committers. What we wanted to provide was a tool that could generate a simple report (both human and machine readable) containing a manifest and corresponding licensing.

We found what we needed in ScanCode, which is produced by some old friends at NexB. ScanCode has the ability to generate the manifest and summarize findings in a number of different formats, including HTML, JSON, and SPDX RDF and Tag/Value. Using ScanCode is pretty easy. In fact, it’s so easy that the Webmaster integrated its use into an Eclipse Genie script. So… for a project team to run the tool, all they really need to do is create a CQ. The rest just happens automatically.

license-certification-workflow

In cases where the third party content has a single white listed license, committers only need to create the CQ and then add the content to their builds. If Eclipse IP Team review is required, the committer may need to participate in that review.

To leverage this automatic scan, a project committer creates a Type A contribution questionnaire (CQ) in the usual way and attaches the corresponding source code. The magic happens after the PMC gives their approval: the Eclipse Genie process identifies every Type A third party content CQs that has been approved by the PMC, runs ScanCode on the source code attachments, and then attaches the report directly to the CQ.

If a single license is identified for all files in the third party content, and that license is on our white list (see below), then the CQ is automatically marked license_certified, its license information is updated, and the CQ is marked resolved. If multiple licenses, blacklisted licenses, or otherwise problematic licenses are detected (i.e. anything other a single white listed license), then the CQ is sent to the Eclipse IP Team for further investigation.

When you see something like the following on your CQ, you’re good-to-go.

Screenshot from 2017-07-18 15-19-46

At this point, the content can be used in project builds and included in milestone builds. Once all project CQs are either marked as license_certified (type A) or approved (type B), the project can do an official release.

I’m really curious to see what sort of hit rate we get on automatic license certification, vs. how many requests will have to be reviewed by the Eclipse IP Team. My hope is that we will get to a point where we have an 80% automatic approval rate; but we don’t have enough data to make a call yet.

Our current implementation has the following licenses in the white list:

  • Apache License 2.0
  • Apache License 1.0
  • Apache License 1.1
  • BSD 2 Clause
  • BSD 3 Clause
  • BSD 4 Clause
  • Eclipse Public License 1.0
  • Eclipse Distribution License 1.0
  • MIT License
  • ISC License
  • NTP License
  • OpenSSL License
  • Public Domain
  • SIL OPEN FONT LICENSE
  • W3C Software and Notice License
  • zlib license

We need to find a home for this. This list will grow.

We’re using Bug 496959 to track our work to update our processes and documentation regarding the Eclipse IP Process.