Automating FOSS Compliance: TDOSCA & OSCake

By releasing the Open Source License Compendium and the Open Source Compliance Advisor, Deutsche Telekom (DT) has already supported Open Source compliance. At BOSL-3.0 I was one of the co-authors, on behalf of DT. But DT offers so many complex Open Source based products that creating the necessary Open Source compliance artifacts manually is too expensive. DT therefore needs an automated toolchain that is usable in practice. This post discusses a new method (TDOSCA) and a new tool (OSCake) for automating FOSS compliance, both of which DT develops and contributes under the umbrella of the OpenChain Project.

3 simple questions for an Open Source Compliance tool

Without doubt, many Open Source compliance tools already exist. The OpenChain Reference Tooling Work Group has compiled a list of relevant information:

  • Some of the tools can be grouped by the organizations offering them, such as the Apache Foundation, SPDX, Eclipse, or the AboutCode initiative.
  • Some stand on the sidelines because they have a narrow focus or are not really tools at all.
  • Others are services rather than tools.

[Figure: TDOSCA architecture]

Deutsche Telekom takes a simple view of FOSS compliance tools. Whenever DT comes across such a tool, it asks: Does this tool deliver the FOSS compliance artifacts DT really needs? If not:

  • What part of them can it deliv­er?
  • How much work does DT still have to do man­u­al­ly if it used the tool?

DT has a long tradition of evaluating FOSS compliance tools. Its employees have met excellent tools and brilliant experts, who often believed they could substantially support DT. In the end, however, DT mostly felt that they did not really understand what DT needed (and still needs). To be clear: whoever delivers large lists of found FOSS items and tells a company that it must now discuss each entry with its legal department does not really help that company.

Nevertheless, DT has to deal with such large lists, known today as a ‘Software Bill of Materials’. Open Source compliance is not a question of pleasure or displeasure: either one uses Open Source software and fulfills the respective requirements, or one does not use the software. Therefore DT cannot wait any longer. The complexity of its products forces DT to actively advance the automation of Open Source compliance. To solve this, DT does not want to start yet another greenfield approach but to participate in existing projects, entirely in the spirit of the open-source idea.

Setting up the Test-Driven environment

DT’s first step was to improve its own communication: it wants to state more clearly what it really needs, from the point of view of a large company dealing with many complex software stacks. Thus, DT applied the idea of ‘Test-Driven Software Development’ to the development of compliance tools by defining test cases:

  • On the one hand, these test cases should contain software that is actually usable, together with licensing statements and dependency information in the form real projects use.
  • On the other hand, these test cases should contain the compliance artifacts that, if added to the respective software package, would allow the software to be distributed compliantly.

Addi­tion­al­ly, DT thinks:

  • Existing open-source projects are mostly too complex to serve as reference material.
  • Arti­fi­cial­ly gen­er­at­ed soft­ware could bet­ter focus on essen­tial com­pli­ance issues.
  • The ref­er­ence soft­ware should func­tion­al­ly be a sim­ple hel­lo world pro­gram.
  • And it should ‘imple­ment’ sophis­ti­cat­ed com­pli­ance issues in a way that real open-source projects use.

By using such test cas­es, DT wants to enable the com­mu­ni­ty, the tools, and the com­pa­nies to ver­i­fy,

  • with which com­pli­ance traps a tool can already suc­cess­ful­ly deal,
  • which artifacts a tool already delivers (and which it does not),
  • where there are still some open issues, and
  • where devi­at­ing results are only a mat­ter of inter­pre­ta­tion.
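
To make this kind of verification more concrete, here is a minimal sketch of how a tool's output could be compared with the reference of a test case. The `Component` shape, the function `compare_bom`, and all sample data are assumptions made for this illustration; they are not the TDOSCA file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One BOM entry; this minimal shape is an assumption of the sketch."""
    name: str
    version: str
    license_id: str


def compare_bom(reference, produced):
    """Classify a tool's BOM against the reference BOM of a test case."""
    ref = {(c.name, c.version): c.license_id for c in reference}
    out = {(c.name, c.version): c.license_id for c in produced}
    common = ref.keys() & out.keys()
    return {
        # found by the tool with the expected license
        "matched": sorted(k for k in common if ref[k] == out[k]),
        # found by the tool, but with a deviating license
        "license_mismatch": sorted(k for k in common if ref[k] != out[k]),
        # listed in the reference, not delivered by the tool
        "missing": sorted(ref.keys() - out.keys()),
        # delivered by the tool, not listed in the reference
        "surplus": sorted(out.keys() - ref.keys()),
    }


# Hypothetical data, invented for the example
reference = [
    Component("hello-world", "1.0", "MIT"),
    Component("greeter-lib", "2.1", "Apache-2.0"),
]
produced = [
    Component("hello-world", "1.0", "MIT"),
    Component("greeter-lib", "2.1", "GPL-2.0-only"),
    Component("build-helper", "0.3", "BSD-3-Clause"),
]

report = compare_bom(reference, produced)
for category, keys in report.items():
    print(category, keys)
```

The four categories roughly mirror the questions above: what a tool already handles, what it does not deliver, and where deviations need a closer look, since a mismatch may turn out to be only a matter of interpretation.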

The ‘Hello World’ Open Source Compliance Test Cases

All TDOSCA test cases are offered under the umbrella of the GitHub organization Open-Source-Compliance and share the prefix tdosca. The README of the main repository tdosca describes the general approach; each test case can be expected to have the same structure. For example, take a look at tdosca-tc06-plainhw:

  • On the top lev­el, a test case-spe­cif­ic README describes its inten­tion.
  • In the direc­to­ry input-sources, you find a com­pi­l­able soft­ware pack­age
    • that con­tains the licens­ing infor­ma­tion just as real open source projects do
    • and can be installed by a standard technique (in this case: Java + Maven).
  • On the top lev­el, a com­pli­ance-trap file describes the chal­lenges that are imple­ment­ed in the source and should be man­aged by the tools.
  • And in the direc­to­ry ref­er­ence-com­pli­ance-arti­facts, one can find the com­pli­ance arti­facts that a tool should deliv­er:
    • a BOM file list­ing the (sub) com­po­nents of the pack­age
    • a list of the pack­ages that must be pre­in­stalled on the tar­get host
    • the Open Source Com­pli­ance File, which — added to the pack­age — estab­lish­es a com­pli­ant­ly dis­trib­utable open-source soft­ware pack­age.
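
A small script can check whether a checked-out test case follows this layout. The sketch below takes the two directory names from the description above; the file names `README.md` and `compliance-traps.md` are assumptions, since the actual names in the repositories may differ.

```python
from pathlib import Path
import tempfile

# Directory names as described above
REQUIRED_DIRS = ("input-sources", "reference-compliance-artifacts")
# File names are assumptions of this sketch
REQUIRED_FILES = ("README.md", "compliance-traps.md")


def missing_parts(test_case_dir):
    """Return the expected directories and files that are absent."""
    root = Path(test_case_dir)
    missing = [d + "/" for d in REQUIRED_DIRS if not (root / d).is_dir()]
    missing += [f for f in REQUIRED_FILES if not (root / f).is_file()]
    return missing


# Demonstration on a deliberately incomplete, temporary test case
with tempfile.TemporaryDirectory() as tmp:
    case = Path(tmp) / "tdosca-tc06-plainhw"
    (case / "input-sources").mkdir(parents=True)
    (case / "README.md").write_text("intention of the test case\n")
    result = missing_parts(case)
    print(result)
```

An empty list would mean that the test case offers all parts of the expected structure.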

The test cases themselves are stored in the respective repositories tdosca-tc01 through tdosca-tc0n.

The core reference entity of a test case is its Open Source Compliance File (OSCF). Such a file shall contain all compliance artifacts, so that a package is distributed compliantly if it is bundled with the respective OSCF. This idea was inspired by the file that Cisco adds to its Jabber client: https://www.cisco.com/c/dam/en_us/about/doing_business/open_source/docs/CiscoJabberforWindows-128-1578365187.pdf. That file is not completely sufficient, but it gives a good idea of how to deal with the issue. In the TDOSCA context, the meaning of such an Open Source Compliance File can be explained by looking at the OSCF of the sixth test case.

A summary and an addendum:

In general, each TDOSCA test case implements the structure described above.

The TDOSCA initiative, hosted under the umbrella of OpenChain and the OpenChain Reference Tooling Work Group, could be a good way for the community to evaluate its tools against such test cases.

But if DT followed this approach alone, DT would easily slip into the role of a police officer or a judge. That’s not what DT wants to be; it wants to be a supportive part of the community. For that purpose, DT has already evaluated existing tools on the basis of the TDOSCA test cases, has gained some experience, and has drawn some conclusions:

Applying the approach to ORT

First, DT decided to use ORT, the OSS Review Toolkit, to create a first end-to-end version of a toolchain that takes the test-case input and derives the compliance output:

In the pic­ture you see

  • the five components that ORT mentions in its README,
  • the data they gen­er­ate, and
  • how they use the out­put of their pre­de­ces­sors.

Using this out­line, we can now exem­pli­fy some of …

… and gaining experiences with ORT

  • First, DT noticed that it could not evaluate even the first and simplest test case, which uses the GNU Autotools, with ORT.
  • Second, DT had to learn that, in combination with Gradle, ORT cannot (for the moment) decide which of the licenses found is the default license.
  • Third, DT noticed that the standard templates included in the ORT reporter follow the principle of over-fulfillment, that is, of over-fulfilling the license requirements.

What does the last point mean? If you have a soft­ware project com­plete­ly and exclu­sive­ly licensed under the MIT license, then it is suf­fi­cient to bun­dle the license text and its embed­ded copy­right line with the pack­age for mak­ing it com­pli­ant­ly dis­trib­utable. Tools that fol­low the prin­ci­ple of over-ful­fill­ment would also add the arti­facts cre­at­ed based on the GPL require­ments, such as ‘all copy­right head­ers of all files’ and so on.
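
The difference between the two strategies can be sketched in a few lines. The table below is purely illustrative: the artifact names and the mapping from license to artifacts are simplified assumptions made for this example, not a statement of what the licenses actually require and not how ORT or any other tool works.

```python
# Illustrative table only: artifact names and the mapping are simplified
# assumptions, not an authoritative reading of the licenses.
REQUIRED_ARTIFACTS = {
    "MIT": {"license text", "copyright notice"},
    "GPL-2.0-only": {"license text", "copyright headers of all files"},
}


def required_for(licenses):
    """Artifacts needed for exactly the licenses that occur in a package."""
    needed = set()
    for license_id in licenses:
        needed |= REQUIRED_ARTIFACTS[license_id]
    return needed


def over_fulfilled():
    """Artifacts an over-fulfilling tool emits, whatever the package uses."""
    return set().union(*REQUIRED_ARTIFACTS.values())


package_licenses = ["MIT"]
precise = required_for(package_licenses)
surplus = over_fulfilled() - precise

print("needed: ", sorted(precise))
print("surplus:", sorted(surplus))
```

For the purely MIT-licensed package, the precise strategy yields only the license text and the copyright notice, while the over-fulfilling strategy additionally emits the artifact that this table associates with the GPL.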

Many approaches apply the principle of over-fulfillment and thereby use a problematic strategy:

  • On the one hand, the distributors must correctly create the required compliance artifacts. If they create them incorrectly, they have to expect that someone will approach them about it.
  • On the other hand, the surplus compliance artifacts could override or undermine the essential artifacts.

Fortunately, ORT follows the design principle of making everything configurable and extensible, which allows DT to adapt it to its needs in three ways:

Improving ORT

  • Deutsche Telekom plans to implement an evaluation technique for Autotools scripts and contribute it back to ORT.
  • It will define, implement, and contribute upstream to ORT a generally usable strategy for determining the default license of a package.

Extending the case structure

  • DT will define more test cases along the dimensions complexity, programming language, and dependency manager.

Defining an Open Source Compliance artifact knowledge engine

  • DT is developing an intelligent component into which it embeds Open Source license compliance knowledge in a declarative manner by
    • adding respective writers to ORT
    • adding a FOSS compliance domain-specific language realized on the basis of Eclipse Xtext
    • adding a respective compliance artifact composer based on Xtend.

DT names this new component of and for Open Source compliance chains OSCake (the Open Source Compliance artifact knowledge engine) and develops it under the terms of the Eclipse Public License 2.0.

OSCake shall close the gaps left by Open Source scanning tools that follow the principle of compliance over-fulfillment. It will take Open Source compliance collections and deliver Open Source Compliance Files that really fit the requirements of the involved Open Source licenses and their contexts. OSCake will become a tool-agnostic compliance knowledge engine: it will not depend on a specific scanning tool but only on an error-tolerant input format. To offer these features, OSCake will have its own internal structure.
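
What an error-tolerant input format could mean in practice is sketched below. This is not OSCake's actual format or code: the JSON shape, the field names `name`, `license`, and `copyrights`, and the function `read_collection` are all assumptions of the example. The idea shown is only that incomplete entries are reported as issues instead of aborting the whole run.

```python
import json


def read_collection(raw):
    """Parse a JSON list of component entries, collecting problems
    instead of failing on the first incomplete entry."""
    components, issues = [], []
    try:
        entries = json.loads(raw)
    except json.JSONDecodeError as err:
        return components, [f"unreadable input: {err.msg}"]
    if not isinstance(entries, list):
        return components, ["expected a list of component entries"]
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict) or not entry.get("name"):
            issues.append(f"entry {index}: skipped, no component name")
            continue
        license_id = entry.get("license")
        if not license_id:
            issues.append(f"entry {index}: license unknown for {entry['name']}")
        components.append({
            "name": entry["name"],
            "license": license_id or "UNKNOWN",
            "copyrights": list(entry.get("copyrights") or []),
        })
    return components, issues


# Hypothetical scanner output with two incomplete entries
raw = """[
  {"name": "hello-world", "license": "MIT",
   "copyrights": ["Copyright (c) 2020 Example Author"]},
  {"name": "greeter-lib"},
  {"license": "Apache-2.0"}
]"""

components, issues = read_collection(raw)
for component in components:
    print(component)
for issue in issues:
    print("issue:", issue)
```

Keeping the issues next to the usable entries would allow a later step to compose a compliance file from what is known, while making visible what still needs manual attention.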

Conclusion

TDOSCA and OSCake set promising goals for the company itself as well as for the community and other commercial approaches:

  • DT wants to set up a FOSS compliance toolchain, usable in practice, that automatically generates the compliance artifacts DT needs.
  • DT wants to reduce the manual work as far as possible.
  • And DT develops this chain (and its components) under the control of TDOSCA, the project to develop Test-Driven Open Source Compliance Artifact Gatherers and Compilers, including its own tool ‘OSCake’.

It is a notable aspect that DT is going to develop both parts under the umbrella of OpenChain and its OpenChain Reference Tooling Work Group.


And in what way is this …

… part of the overarching topic FOSS Compliance? To fulfill the requirements of FOSS licenses, we have to consider specific individual cases as well as side effects, for software, pictures, or documents. We should uncover trends and write guidelines. Above all, however, we must drive forward the automation of license fulfillment, make our licensing knowledge freely available, cast it into smaller tools, and bring it into larger systems: because FOSS thrives on freedom through license fulfillment, large and small. That is also what this article is about.

