By releasing the Open Source License Compendium and the Open Source Compliance Advisor, Deutsche Telekom (DT) has supported Open Source compliance. At BOSL‑3.0 I was one of the co-authors, on behalf of DT. But DT offers so many complex open-source-based products that it is too expensive to create the necessary Open Source compliance artifacts manually. Thus, DT needs a practically usable automated toolchain. This post discusses a new method (TDOSCA) and a new tool (OSCake) for automating FOSS compliance, both of which DT develops and contributes under the umbrella of the OpenChain Project.
3 simple questions for an Open Source Compliance tool
Without doubt, many Open Source compliance tools already exist. The OpenChain Reference Tooling Work Group has compiled a list of relevant information:
- Some of the tools can be grouped by the offering organizations like the Apache Foundation, SPDX, Eclipse, or the About Code Initiative.
- Some of the entries are on the sidelines because they have a specific focus or are not really tools.
- Others are services, not tools.
Deutsche Telekom has a simple point of view on FOSS compliance tools. Whenever DT comes across such a tool, it asks: Does this tool deliver the FOSS compliance artifacts DT really needs? If not:
- What part of them can it deliver?
- How much work does DT still have to do manually if it used the tool?
DT has a long tradition of evaluating FOSS compliance tools. Its employees have met excellent tools and brilliant experts, and they often thought these could substantially support DT. In the end, however, DT mostly felt that they did not really understand what DT needed (and still needs). To clarify this point: whoever delivers large lists of found FOSS items and says that a company now has to discuss each entry of the list with its legal department does not really help the company.
Nevertheless, DT has to deal with such large lists, today known as a ‘Software Bill of Materials’. Open Source compliance is not a question of pleasure or displeasure. Either one uses Open Source software and fulfills the respective requirements, or one does not use the software. Therefore, DT cannot wait any longer. The complexity of its products forces DT to actively advance the automation of open source compliance. To solve that issue, it does not want to start the next greenfield approach but to participate in existing projects, entirely in the spirit of the open-source idea.
Setting up the Test-Driven environment
DT’s first step was to improve its own communication: it wants to state more clearly what it really needs, from the point of view of a large company dealing with many complex software stacks. Thus, DT tried to apply the idea of ‘Test-Driven Software Development’ to the development of compliance tools by defining reference test cases:
- On the one hand, these test cases should contain really usable software, licensing statements, and dependency information, in the way that real projects use them.
- On the other hand, these test cases should contain those compliance artifacts that would allow the software to be distributed compliantly if they were added to the respective software package.
Additionally, DT thinks:
- Existing open-source projects are mostly too complex to be used as reference material.
- Artificially generated software could better focus on essential compliance issues.
- The reference software should functionally be a simple hello world program.
- And it should ‘implement’ sophisticated compliance issues in a way that real open-source projects use.
By using such test cases, DT wants to enable the community, the tools, and the companies to verify,
- with which compliance traps a tool can already successfully deal,
- which artifacts a tool already delivers (and which it does not),
- where there are still some open issues, and
- where deviating results are only a matter of interpretation.
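The four checks above can be sketched as a comparison between the reference artifacts of a test case and the artifacts a tool actually produced. This is only a minimal illustration: the function `compare_artifacts`, the class `ComparisonReport`, and all artifact names and contents are hypothetical and are not taken from the TDOSCA repositories.

```python
from dataclasses import dataclass, field


@dataclass
class ComparisonReport:
    """Result of comparing a tool's artifacts with a test case's reference."""
    delivered: set = field(default_factory=set)   # present and identical
    missing: set = field(default_factory=set)     # in the reference, not delivered
    deviating: set = field(default_factory=set)   # delivered, but content differs
    surplus: set = field(default_factory=set)     # delivered, not in the reference


def compare_artifacts(reference: dict, produced: dict) -> ComparisonReport:
    """Compare two mappings of artifact name -> artifact content."""
    report = ComparisonReport()
    for name, expected in reference.items():
        if name not in produced:
            report.missing.add(name)
        elif produced[name].strip() == expected.strip():
            report.delivered.add(name)
        else:
            # Needs human review: an open issue or a matter of interpretation.
            report.deviating.add(name)
    report.surplus = set(produced) - set(reference)
    return report


# Hypothetical artifact names and contents, for illustration only.
reference = {
    "bom": "hello-world 1.0",
    "preinstalled-packages": "java\nmaven",
    "oscf": "MIT license text + copyright line",
}
produced = {
    "bom": "hello-world 1.0",
    "oscf": "MIT license text",
    "all-copyright-headers": "...",
}

report = compare_artifacts(reference, produced)
print("delivered:", sorted(report.delivered))
print("missing:  ", sorted(report.missing))
print("deviating:", sorted(report.deviating))
print("surplus:  ", sorted(report.surplus))
```

A plain string comparison is a simplification; deciding whether a deviating artifact is an error or a legitimate interpretation still requires a human reviewer.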
The ‘Hello World’ Open Source Compliance Test Cases
All TDOSCA test cases are offered under the umbrella of the GitHub organization Open-Source-Compliance and grouped under the prefix tdosca. The README of the main repository tdosca describes the general approach; each test case follows the same structure. For example, take a look at tdosca-tc06-plainhw:
- On the top level, a test case-specific README describes its intention.
- In the directory input-sources, you find a compilable software package
  - that contains the licensing information just as real open source projects do
  - and can be installed by a standard technique (in this case: java + maven).
- On the top level, a compliance-trap file describes the challenges that are implemented in the source and should be managed by the tools.
- And in the directory reference-compliance-artifacts, one can find the compliance artifacts that a tool should deliver:
  - a BOM file listing the (sub) components of the package,
  - a list of the packages that must be preinstalled on the target host,
  - the Open Source Compliance File, which, added to the package, establishes a compliantly distributable open-source software package.
The test cases themselves are stored in the respective repositories tdosca-tc01 … tdosca-tc0n.
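Because every test case is expected to follow this layout, the structure can be checked mechanically. The following is a minimal sketch: the directory names input-sources and reference-compliance-artifacts come from the description above, while the file name patterns for the README and the compliance-trap file, as well as the helper `missing_parts`, are assumptions made for illustration.

```python
from pathlib import Path
import tempfile

# Directory names as described in the text.
REQUIRED_DIRS = ("input-sources", "reference-compliance-artifacts")
# Assumed file name patterns; the actual names in the repositories may differ.
REQUIRED_FILE_PATTERNS = ("README*", "*trap*")


def missing_parts(test_case_dir: Path) -> list:
    """Return the required directories and file patterns that are absent."""
    missing = [d for d in REQUIRED_DIRS if not (test_case_dir / d).is_dir()]
    for pattern in REQUIRED_FILE_PATTERNS:
        if not any(p.is_file() for p in test_case_dir.glob(pattern)):
            missing.append(pattern)
    return missing


# Build a throwaway directory instead of cloning a real repository.
with tempfile.TemporaryDirectory() as tmp:
    case = Path(tmp) / "tdosca-tc06-plainhw"
    (case / "input-sources").mkdir(parents=True)
    (case / "README.md").write_text("intention of the test case\n")
    incomplete = missing_parts(case)

    (case / "reference-compliance-artifacts").mkdir()
    (case / "compliance-traps.md").write_text("traps implemented in the source\n")
    complete = missing_parts(case)

print("before:", incomplete)
print("after: ", complete)
```

Such a check only verifies that the parts exist; it says nothing about whether the reference artifacts themselves are correct.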
The core reference entity of a test case is its Open Source Compliance File (OSCF): such a file shall contain all compliance artifacts, so that a package is compliantly distributable if it is bundled with the respective OSCF. This idea was inspired by the file that Cisco adds to its Jabber client: https://www.cisco.com/c/dam/en_us/about/doing_business/open_source/docs/CiscoJabberforWindows-128–1578365187.pdf. That file is not completely sufficient, but it gives a good idea of how to deal with this issue. In the TDOSCA context, the meaning of such an Open Source Compliance File can be explained by looking at the OSCF of the 6th test case.
A summary and an addendum:
In general each TDOSCA test-case implements the following structure:
The TDOSCA initiative — hosted under the umbrella of OpenChain and the OpenChain Reference Tooling Work Group — could be a good method for the community to evaluate its tools by such test cases.
But if DT followed this approach purely, DT would easily slip into the role of a police officer or a judge. That’s not what DT wants to be; it wants to be a supportive part of the community. For that purpose, DT has already evaluated existing tools on the basis of the TDOSCA test cases, has gained some experience, and has drawn some conclusions:
Applying the approach to ORT
First, DT decided to use ORT, the OSS Review Toolkit, to create a first end-to-end version of the toolchain that takes the test-case input and derives the compliance output:
In the picture you see
- the five components ORT mentions in its README,
- the data they generate, and
- how they use the output of their predecessors.
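The chaining described by these three points can be illustrated with a small pipeline in which each stage adds its result to the data it received from its predecessor. The stage names follow the components that ORT's README listed at the time (Analyzer, Downloader, Scanner, Evaluator, Reporter); the functions, keys, and values below are placeholders assumed for illustration and are not ORT's API or data formats.

```python
def analyzer(state):
    # Determine the dependencies declared by the project (placeholder data).
    return {**state, "dependencies": ["example-lib"]}


def downloader(state):
    # Fetch the sources of each dependency found by the analyzer.
    return {**state, "sources": [f"src/{d}" for d in state["dependencies"]]}


def scanner(state):
    # Record the licenses found in each downloaded source tree.
    return {**state, "findings": {s: ["MIT"] for s in state["sources"]}}


def evaluator(state):
    # Check the scanner findings against a (hypothetical) policy.
    allowed = {"MIT"}
    violations = [s for s, found in state["findings"].items()
                  if not set(found) <= allowed]
    return {**state, "violations": violations}


def reporter(state):
    # Render the collected findings into a human-readable report.
    lines = [f"{s}: {', '.join(found)}"
             for s, found in sorted(state["findings"].items())]
    return {**state, "report": "\n".join(lines)}


PIPELINE = [analyzer, downloader, scanner, evaluator, reporter]


def run(project):
    """Pass the project through all stages, each building on the previous one."""
    state = dict(project)
    for stage in PIPELINE:
        state = stage(state)
    return state


result = run({"name": "hello-world"})
print(result["report"])
print("violations:", result["violations"])
```

The point of the sketch is only the data flow: a gap in an early stage, such as an unsupported build system, propagates to every later stage.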
Using this outline, we can now exemplify some of …
… and gaining experiences with ORT
- First, DT noticed that it could not evaluate even the first and simplest test case, which uses the GNU Autotools.
- Second, DT had to learn that, in cooperation with Gradle, ORT cannot for the moment decide which of the found licenses is the default license.
- Third, DT noticed that the standard templates included in the ORT reporter follow the principle of over-fulfillment, that is, of over-fulfilling the license requirements.
What does the last point mean? If a software project is completely and exclusively licensed under the MIT license, then bundling the license text and its embedded copyright line with the package is sufficient to make it compliantly distributable. Tools that follow the principle of over-fulfillment would also add artifacts that only the GPL requires, such as ‘all copyright headers of all files’ and so on.
Many approaches apply the principle of over-fulfillment and thereby follow a problematic strategy:
- On the one hand, the distributors must correctly create the required compliance artifacts. If they create them incorrectly, they have to expect that someone will approach them about it.
- On the other hand, the surplus compliance artifacts could obscure or undermine the essential artifacts.
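The difference between fulfilling and over-fulfilling can be made concrete with a small lookup from licenses to required artifacts. This is a deliberately simplified sketch, not legal guidance: the table only mirrors the MIT and GPL examples given above, and the artifact names, the SPDX-style license keys, and both helper functions are assumptions made for illustration.

```python
# Deliberately simplified and incomplete: the entries only mirror the
# examples from the text and are not a statement of license obligations.
REQUIRED_ARTIFACTS = {
    "MIT": {"license-text-with-copyright-line"},
    "GPL-2.0-only": {"license-text", "all-copyright-headers"},
}


def required_artifacts(licenses):
    """Union of the artifacts required by the licenses actually involved."""
    required = set()
    for license_id in licenses:
        required |= REQUIRED_ARTIFACTS[license_id]
    return required


def surplus_artifacts(licenses, generated):
    """Artifacts a tool generated although no involved license requires them."""
    return set(generated) - required_artifacts(licenses)


# A package that is completely and exclusively licensed under MIT ...
package_licenses = ["MIT"]
# ... and the output of a hypothetical tool that over-fulfills.
generated_by_tool = {"license-text-with-copyright-line", "all-copyright-headers"}

needed = required_artifacts(package_licenses)
surplus = surplus_artifacts(package_licenses, generated_by_tool)
print("needed: ", sorted(needed))
print("surplus:", sorted(surplus))
```

Keeping the license knowledge in a declarative table rather than in templates makes it possible to emit exactly the required set and nothing more.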
Fortunately, ORT follows the design principle of making everything configurable and extendable, which allows DT to adapt ORT to its needs in three ways:
- Deutsche Telekom plans to implement a technique for evaluating Autotools scripts and to contribute it back to ORT.
- It will define, implement, and contribute upstream to ORT a generally usable strategy to determine the default license of a package.
Extending the case structure
- DT will define more test cases along three dimensions: complexity, programming language, and dependency manager.
Defining an Open Source Compliance artifact knowledge engine
- DT develops an intelligent component into which it embeds the Open Source License Compliance knowledge in a declarative manner by
  - adding respective writers to ORT,
  - adding a FOSS compliance domain-specific language built on Eclipse Xtext, and
  - adding a corresponding compliance artifact composer based on Xtend.
DT names this new component of and for Open Source Compliance Chains OSCake (the Open Source Compliance artifact knowledge engine) and develops it under the terms of the Eclipse Public License 2.0.
OSCake shall close the gaps left by Open Source scanning tools that follow the principle of compliance over-fulfillment. It will take Open Source Compliance collections and deliver Open Source Compliance Files that really fit the requirements of the involved Open Source licenses and their contexts. OSCake will become an agnostic compliance knowledge engine; it will not depend on a specific scanning tool but only on an error-tolerant input format. To offer these features, OSCake will have the following internal structure:
TDOSCA and OSCake establish a promising goal set for the company itself as well as for the community and other commercial approaches:
- DT indeed wants to set up a practically usable FOSS compliance toolchain that automatically generates the compliance artifacts it needs.
- DT wants to reduce the manual work as far as possible.
- And DT develops this chain (and its components) under the control of TDOSCA: the project to develop Test-Driven Open Source Compliance Artifact Gatherers and Compilers, including DT's own tool ‘OSCake’.
And in what way is this …
… part of the overarching topic FOSS Compliance? To fulfill the requirements of FOSS licenses, we have to consider specific individual cases as well as side effects, whether for software, pictures, or documents. We should uncover trends and write guidelines. Above all, however, we must drive forward the automation of license fulfillment, make our licensing knowledge freely available, cast it into smaller tools, and bring it into larger systems, because FOSS thrives on freedom through license fulfillment, large and small. That is also what this article is about.
- OSLiC sources: https://github.com/telekom/oslic
- OSLiC homepage: http://telekom.github.io/oslic/
- OSLiC version 1.0.2: https://telekom.github.io/oslic/releases/oslic.pdf
- OSCAd sources: https://github.com/telekom/oscad
- OSCAd homepage: https://telekom.github.io/oscad/
- OSCAd instance: http://oscad.fodina.de/
- OpenChain homepage: https://www.openchainproject.org/
- Respective Linux Foundation project page: https://www.linuxfoundation.org/projects/security-compliance/
- Introduction into the Open Chain Reference Tooling Work Group: https://www.openchainproject.org/news/2020/03/15/openchain-reference-tooling-work-group-in-2020
- Open Chain Reference Tooling Work Group homepage: http://oss-compliance-tooling.org/
- Existing Open Source license compliance tools: http://oss-compliance-tooling.org/Tooling-Landscape/OSS-Based-License-Compliance-Tools/
- OSS Review Toolkit (ORT): https://github.com/oss-review-toolkit/ort
- Test Driven Open Source Compliance Initiative: https://github.com/Open-Source-Compliance/tdosca
- Open Source Compliance artifact knowledge engine: https://github.com/Open-Source-Compliance/OSCake
- Open Compliance Summit 2020: https://events.linuxfoundation.org/open-compliance-summit/program/schedule/