Overview

The Dependency Checker is a tool to explore both dynamic and static linkage dependencies of binaries and libraries built with FOSS components. Once dependencies are identified, the GUI provides an easy-to-interpret visual indication of possible license issues, based on in-house license policies.

The system consists of two pieces, a command-line program "readelf.py" and a GUI frontend that runs in a web browser.

You can view the development source from git, or check it out using standard git commands:

git clone http://git.linuxfoundation.org/dep-checker.git

Bugs can be filed under the Compliance product.

There is also a mailing list for discussion of the tool.

Setup:

System Requirements:

The command-line program and the GUI require Python. They also run the OS commands file, ldd, objdump and readelf, so these should be present on your system. The GUI requires Django, along with sqlite support for the results database. A web browser is also needed to interact with the GUI. If your distribution does not provide Django, you can follow these installation instructions.
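
As a quick pre-flight check for the OS commands listed above, a minimal sketch along these lines could be used. It is not part of dep-checker; the function name missing_commands is made up for illustration, and it assumes Python 3 (for shutil.which).

```python
import shutil

# External commands the requirements paragraph lists.
REQUIRED_COMMANDS = ["file", "ldd", "objdump", "readelf"]


def missing_commands(commands=REQUIRED_COMMANDS):
    """Return the entries of `commands` that cannot be found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Missing commands:", ", ".join(missing))
    else:
        print("All required commands found.")
```

An empty list means every command was found; anything else names what still has to be installed.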

Installation:

From Packages:

The program is packaged as an rpm package, with dependencies on python-django. If your system does not provide django, or it's named differently, you may need to install using --nodeps:

rpm -Uvh dep-checker-0.0.5-1.noarch.rpm --nodeps

Note: If you had to use --nodeps, then you must make sure django is installed and functional on your system. Both the command line program and the gui depend on django.

The installation creates a "compliance" user/group and should create a desktop menu entry to launch the server and open the GUI in your web browser.

In the future we may bundle django with the package to make things simpler, as well as provide .deb packaging.

From Git:

You can also check out the project from git and run it in place:

git clone http://git.linuxfoundation.org/dep-checker.git
cd dep-checker
Alternatively, you can get the latest tarball from the git web page by clicking on the snapshot link in the upper right-hand part of the page.

Once you have the tarball, unpack it (example, the numbers of your download may differ):

tar -xf dep-checker-3af829ae0cc5aba33192c000ef0365ef6bced843.tar.gz
cd dep-checker
Create the application database and the documentation (you will need w3m to create README.txt).
make
If you don't have root permissions on the machine to install Django, you can install it in-place with the dep-checker install:
tar -xf Django-x.x.x.tar.gz
cp -ar Django-x.x.x/django dep-checker/compliance
cd dep-checker/bin
ln -s ../compliance/django .
Run the server and the gui should show up in a browser window:
./bin/dep-checker.py start
To kill the django server, you can run:
./bin/dep-checker.py stop

System Layout:

The application installs under the /opt/linuxfoundation namespace:

Under the compliance tree is a typical Django project layout:
Running the GUI server:

To run the gui/server (as user compliance for installed package), there is a script that su's to the compliance user, starts the server and attempts to open a browser page to the GUI:

/opt/linuxfoundation/bin/dep-checker.py start
To stop the server run:
/opt/linuxfoundation/bin/dep-checker.py stop

If for some reason this does not work, you can manually perform the steps to start the server:

su - compliance
cd /opt/linuxfoundation/compliance
python manage.py runserver
You can terminate the server from this console by pressing Ctrl-C.

Running the command line program:

The command line program is called readelf.py, and it resides in /opt/linuxfoundation/bin:

Usage: readelf.py [options] <file/dir tree to examine> [recursion depth]

Options:
  -c                   output in csv format
  -d                   write the output into the results database
  -s DIR               directory tree to search
  --comments=COMMENTS  test comments (when writing to database)
  --project=PROJECT    project name (when writing to database)
  --no-static          don't look for static dependencies
  --version            show program's version number and exit
  -h, --help           show this help message and exit
The -c option is primarily used to pass data to the GUI. The format without this argument is more human-readable if you are using the command line directly.

The -s option expects a directory as its argument. When it is given, the program searches that directory tree and analyses only the files whose name matches the next argument:

/opt/linuxfoundation/bin/readelf.py -s /foo bar
The program will search everything under /foo for ELF files named bar.

Specifying only a directory will search and report on every ELF file in that directory tree:

/opt/linuxfoundation/bin/readelf.py /foo
Specifying only a file will attempt to test only the specified file:
/opt/linuxfoundation/bin/readelf.py /foo/bar/baz
The recursion level is an optional argument. When it is given, the program reports not only the direct dependencies of the target file, but also the dependencies of each library it uses:
/opt/linuxfoundation/bin/readelf.py /foo/bar/baz 4
This would attempt to recurse down 4 levels from the target file, giving output something like this:
[1]/foo/bar/baz:
  libtermcap.so.2
  [2]/lib/libtermcap.so.2.0.8:
    libc.so.6
    [3]/lib/i686/libc-2.10.1.so:
      ld-linux.so.2
[1]/foo/bar/baz:
  libdl.so.2
  [2]/lib/libdl-2.10.1.so:
    libc.so.6
    [3]/lib/i686/libc-2.10.1.so:
      ld-linux.so.2
  [2]/lib/libdl-2.10.1.so:
    ld-linux.so.2
[1]/foo/bar/baz:
  libc.so.6
  [2]/lib/i686/libc-2.10.1.so:
    ld-linux.so.2
You will note that even though we asked for a recursion level of 4, the test stopped at level 3, as the program detects when no further recursion is possible.
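
If you want to post-process this human-readable output, the following is a minimal sketch of a parser for it. It assumes only the layout visible in the sample above: a header line of the form "[level]path:" followed by indented dependency names. The function name parse_report is made up for illustration and is not part of dep-checker.

```python
import re

# A header line looks like "[2]/lib/libdl-2.10.1.so:" with optional indentation.
HEADER = re.compile(r"^\s*\[(\d+)\](.+):\s*$")


def parse_report(text):
    """Return (level, file_path, dependency) tuples from the indented report."""
    edges = []
    current = None  # (level, path) of the most recent header line
    for line in text.splitlines():
        if not line.strip():
            continue
        match = HEADER.match(line)
        if match:
            current = (int(match.group(1)), match.group(2))
        elif current is not None:
            # Any other line is a dependency of the most recent header.
            edges.append((current[0], current[1], line.strip()))
    return edges


SAMPLE = """\
[1]/foo/bar/baz:
  libtermcap.so.2
  [2]/lib/libtermcap.so.2.0.8:
    libc.so.6
    [3]/lib/i686/libc-2.10.1.so:
      ld-linux.so.2
"""

edges = parse_report(SAMPLE)
deepest = max(level for level, _, _ in edges)
```

For the sample, deepest is 3, which matches the observation that recursion stopped before the requested level of 4. Dependency strings are kept verbatim, so any suffix the program appends to a name is preserved.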

Static library dependencies will appear with (static) appended to the SONAME:

libncurses.so.5 (static)

The --no-static option suppresses the attempt to resolve statically linked dependencies.

The -d, --project, and --comments options let the command line program feed results into the database used by the GUI. Setting -d forces -c and compiles the collected results into a list that is fed to the compliance database, where they will show up alongside the results of tests executed from the GUI. The --project and --comments options are optional, just as the corresponding fields are in the Check Dependencies tab. Multi-word strings should be enclosed in quotes. Here is an example:

/opt/linuxfoundation/bin/readelf.py -d --project=test --comments='this is a test' /usr/bin/foo

All the other options, such as searching, recursion, and disabling static checking are also available in this mode, and the program will still output error conditions and the data to stdout.
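
When driving the command line program from another script, the options above can be assembled as an argument list. The sketch below is an illustration only: build_command and its parameter names are made up, and the only flags used are the ones shown in the usage text. Passing the list to subprocess.run avoids the need for shell quoting of multi-word comments.

```python
READELF = "/opt/linuxfoundation/bin/readelf.py"


def build_command(target, search_dir=None, depth=None, to_database=False,
                  project=None, comments=None, no_static=False):
    """Build an argument list for readelf.py from the documented options."""
    cmd = [READELF]
    if to_database:
        cmd.append("-d")
    if project:
        cmd.append("--project=" + project)
    if comments:
        cmd.append("--comments=" + comments)
    if no_static:
        cmd.append("--no-static")
    if search_dir:
        cmd += ["-s", search_dir]
    cmd.append(target)
    if depth is not None:
        cmd.append(str(depth))  # optional recursion depth comes last
    return cmd


cmd = build_command("/usr/bin/foo", to_database=True,
                    project="test", comments="this is a test")
# To actually run it (requires an installed dep-checker):
#   import subprocess; subprocess.run(cmd, check=True)
```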

Accessing the GUI:

If a browser does not open when you launch the menu item, you can access the GUI (once the server is started) at http://127.0.0.1:8000/linkage.

Using the GUI:

The GUI is pretty straightforward, with tabs to access various aspects of the program:

The final page, which isn't visible in the tabs, is the test results detail page, brought up by either running a test, or clicking on the link in the results page.

Check Dependencies:

A test sequence would typically start at the Check Dependencies page, where you enter the test criteria. This setup parallels the operation of the command line program, where you select whether to search for a file under a directory, test a whole directory, or just a single file. There is also a drop-down to select the recursion level. You can disable checking for static dependencies via a checkbox.

The user field is pre-populated with the compliance user, but can be overridden. The project and comments fields are optional for your use in tracking tests.

Once you enter the test criteria, click on the Run Dependency Check button. After the test runs you will be presented with the detailed test results in tabular form. Depending on the number of files to be tested and the recursion level, the test can take a few minutes, so be patient.

Until there are licenses and bindings defined, the results detail will show TBD for both the target and dependency licenses. Now that there is data in the system, you can go back and define these relationships and update the test data.

There is a Print Results button on the detail page that should open the browser print dialog to print to a physical printer or to a file. Some parts of the GUI are hidden in the printed output so that only the test results show up in the printed report.

Review Results:

The test results should also be accessible from the Review Results page. This is a tabular list of all the test runs, sorted by test id/date. The far-right column has the information entered from the Check Dependencies tab. Clicking on the link for the target file or directory will open the detail tab. If you want to delete test results, you can select the checkboxes and delete them from here, using the Delete Selected Tests button.

Licenses:

The License tab lets you enter license/version info. You enter the license name (example: GPL) in the left-hand field and the version (example: 3.0) in the right-hand field. Like the Review Results tab, you can select and delete licenses using the checkboxes and the Delete Selected Licenses button. The license-version combination will be concatenated in the report to look like: GPL 3.0.

This tab also contains the entry form to map the license/version info used by the application to any possible string variations provided by imported data from other sources. You can select a license/version from the system and then provide up to 9 alternate names (aliases) that will be considered equivalent when examining test results for policy violations. Additional aliases can be added to an existing list by simply selecting the license, entering just the new alias and clicking "Add" again.
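
The relationship between licenses and aliases can be pictured with a small sketch. This is not the application's code: the function names and the in-memory dictionary are assumptions made for illustration, and the example names are taken from the import example later in this document.

```python
def display_name(name, version):
    """Join a license name and version the way the report shows them."""
    return f"{name} {version}"


def resolve_license(raw, known_licenses, aliases):
    """Map a raw license string to a known license, or None if unrecognised."""
    if raw in known_licenses:
        return raw
    return aliases.get(raw)


# Hypothetical data: two defined licenses and one alias for each.
known = {display_name("LGPL", "3.0"), display_name("BSD", "1.0")}
aliases = {"LGPLv3": "LGPL 3.0", "BSD1": "BSD 1.0"}

resolved = resolve_license("LGPLv3", known, aliases)
```

Here resolved is "LGPL 3.0", so imported data that says "LGPLv3" is treated as equivalent to the license defined in the tab.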

License Bindings:

The License Bindings tab lets you define the license binding for the target files, that is, the files that are being tested for dependencies. The same type of bindings can be done for the dependency libraries.

The drop-down under Target will show all files having test data. The drop-down under License will show all the licenses defined in the License tab. If there is no test data or no licenses, the drop-downs will be empty. If there is test data in the system, you can update the license information for current data using the Update Target Test Data button.

The drop-down under Library will show all libraries in the test data. The drop-down under License will show all the licenses defined in the License tab. The License selector does not differentiate between static and dynamic versions; both will be treated the same. If there is no test data or no licenses, the drop-downs will be empty. If there is test data in the system, you can update the license information for current data using the Update Library Test Data button.

License Policies:

The License Policies tab lets you define pairings of target/library licenses that could have potential issues. You select the Target License and Library License from the drop-downs and then select the relationship: Static, Dynamic, or both. You can also set the state to either Approve or Disapprove. When a test is run, violations of these policy settings will show up in the report detail, printed in red with a red flag after the license name. License pairings that are approved will have normal black text, and unknown/undefined pairings will be highlighted in orange with an orange flag. Like the other tabs, you can select and delete policies using the checkboxes and the Delete Selected Policies button.

In the screenshot below, you can see an example of a flagged policy violation. The application myapp has been compiled against libmylib.so. The licenses: L1 2.0 and L2 1.3 have been defined in the policy screen as being an issue. When the test data is displayed, this relationship is flagged as being problematic:

policy_flag.png
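
The colour coding described above can be sketched as a simple lookup. This is an illustration under assumptions, not the application's implementation: the Policy tuple, the flag_color function and the first-match rule are all made up here, and the L1 2.0 / L2 1.3 pairing is the one from the example.

```python
from typing import NamedTuple


class Policy(NamedTuple):
    target_license: str
    library_license: str
    relationship: str  # "Static", "Dynamic", or "Both"
    approved: bool


def flag_color(policies, target_license, library_license, is_static):
    """Return the highlight colour for one target/library license pairing."""
    linkage = "Static" if is_static else "Dynamic"
    for policy in policies:
        if (policy.target_license == target_license
                and policy.library_license == library_license
                and policy.relationship in (linkage, "Both")):
            # Approved pairings print normally, disapproved ones in red.
            return "black" if policy.approved else "red"
    # No policy covers this pairing: unknown/undefined.
    return "orange"


policies = [Policy("L1 2.0", "L2 1.3", "Both", approved=False)]
color = flag_color(policies, "L1 2.0", "L2 1.3", is_static=False)
```

With this data, color is "red", and any pairing without a matching policy comes back as "orange".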

If the target (file) or library is using a license naming convention that is not defined in the application licenses tab, but has a naming convention defined as equivalent in the aliases table, the license violation will look like:

alias name (real name) [graphical flag]
Settings:

The Settings tab lets you change the static data used to detect static libraries in use by the program being tested.

The symbol data used for static detection is based on the libraries currently installed on the test system. You can reload the data by activating the Reload Static Data button at the top of the page.

By default, system libraries from the normal system paths are loaded into the database for static symbol detection. The list of paths to search is provided in a large edit box, one per line; you can add or remove paths from this list, and activate the Save Changes button to save those search paths.

Admin Interface:

In the current configuration, the django admin interface is enabled. While you can use this interface to directly access the database records, one should take care not to alter existing records, except in the case of wishing to add license information to records.

admin interface: http://127.0.0.1:8000/admin (username compliance, password compliance)

Database Schema:

The database for the application is in the file compliance in the compliance directory. It is an sqlite3 database file. Tables used by the application are as follows (arranged more or less as they are integrated into the application tabs):

Importing License Data:

Because linkage_license, linkage_aliases, linkage_filelicense, linkage_liblicense, and linkage_policy are more or less independent of the test data, one could easily load these tables from other data sources, using sqlite3, as long as the existing table schemas are honored. To illustrate, let's walk through an example of importing a file of library/license mappings from another source.

Say we have a CSV file of library,license data like this:

libfoo.so.6,LGPLv3
libbar.so.2,BSD1
libbaz.so.4,APACHE 2
We can easily process this into SQL statements that we can load into dep-checker using whatever scripting language you're comfortable with. With the shell and awk, perhaps something like this:
awk -F,  '{print "INSERT INTO linkage_liblicense (library, license) VALUES (\"" $1 "\",\"" $2 "\");"}' < liblicenses.csv > liblicenses.sql
cd dep-checker/compliance
sqlite3 compliance < liblicenses.sql
And if we look at the database now:
sqlite3 compliance 
SQLite version 3.6.23.1
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select * from linkage_liblicense;
1|libfoo.so.6|LGPLv3
2|libbar.so.2|BSD1
3|libbaz.so.4|APACHE 2
So our data is loaded, but we have a slight problem in that the license naming conventions from our data file don't match the format used in dep-checker to feed into our license policies. In dep-checker, "LGPLv3" would be "LGPL 3.0", "BSD1" would be "BSD 1.0" and "APACHE 2" might be "Apache 2.0". We can correct this either by defining alias mappings in the Licenses tab or with some additional SQL (assuming we know the "correct" naming defined in dep-checker):
INSERT INTO linkage_aliases (license, alias) VALUES ('LGPL 3.0', 'LGPLv3');
INSERT INTO linkage_aliases (license, alias) VALUES ('BSD 1.0', 'BSD1');
INSERT INTO linkage_aliases (license, alias) VALUES ('Apache 2.0', 'APACHE 2');
If you have a large number of alias mappings to perform, SQL may be the way to go; otherwise they can be assigned under Aliases in the Licenses tab, where you'll be assured of the correct mappings, as only "known" licenses will be available in the drop-down.

A similar process could be used to load license associations for target files.
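
The same import can also be done from Python with the standard csv and sqlite3 modules, which sidesteps quoting problems by using bound parameters. The sketch below is an illustration only. It runs against an in-memory database, and the CREATE TABLE statement is an assumption inferred from the columns shown in the examples above (id, library, license); the real table is created by the application and may differ.

```python
import csv
import io
import sqlite3

# Sample data in the same library,license layout as the CSV example.
CSV_DATA = "libfoo.so.6,LGPLv3\nlibbar.so.2,BSD1\nlibbaz.so.4,APACHE 2\n"


def load_lib_licenses(conn, csv_file):
    """Insert library,license rows from a CSV file object; return the count."""
    rows = [tuple(row) for row in csv.reader(csv_file) if row]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO linkage_liblicense (library, license) VALUES (?, ?)",
            rows,
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
# Assumed schema, for the demonstration only.
conn.execute(
    "CREATE TABLE linkage_liblicense "
    "(id INTEGER PRIMARY KEY, library TEXT, license TEXT)"
)
count = load_lib_licenses(conn, io.StringIO(CSV_DATA))
rows = conn.execute(
    "SELECT id, library, license FROM linkage_liblicense ORDER BY id"
).fetchall()
```

To load real data, you would connect to the compliance database file instead of ":memory:" and skip the CREATE TABLE step.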

Once the license data is loaded, the functions update_file_bindings() and update_lib_bindings() in views.py would apply this information to existing test data, which can also be done from the gui Target Licenses and Library Licenses tabs by clicking the Update Test Data button.

How it Works:

As mentioned earlier, the command line program, readelf.py, does all the actual file search and analysis. For discovering dependencies, it uses readelf, ldd, and file, using the following methodology:

Limitations:

There are certain limitations in the analysis of binaries/libraries for static/dynamic dependencies.

In the static case, the symbol table is created either on the build server (packaged version), or the user's machine (run in-place from git). The content of the table will be largely driven by the libraries present on the system, and may not completely reflect the system where the target files have been built.

Also, the same symbol can come from more than one library, and the tool can only provide the possible sources of the symbol in question. Some deeper investigation of the actual build system may be required to identify the actual static linkage.
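
The ambiguity can be illustrated with a toy symbol index. This is a sketch of the general idea only, not the tool's data model: the library names, symbol names and function names below are all hypothetical.

```python
from collections import defaultdict


def build_symbol_index(library_symbols):
    """Map each symbol name to the set of libraries that define it."""
    index = defaultdict(set)
    for library, symbols in library_symbols.items():
        for symbol in symbols:
            index[symbol].add(library)
    return index


def candidate_sources(index, target_symbols):
    """For each known symbol in the target, list every possible source."""
    return {s: sorted(index[s]) for s in target_symbols if s in index}


# Hypothetical static libraries and the symbols they export.
index = build_symbol_index({
    "libfoo.a": ["foo_init", "shared_helper"],
    "libbar.a": ["bar_init", "shared_helper"],
})
sources = candidate_sources(index, ["foo_init", "shared_helper", "main"])
```

Here foo_init points to a single library, but shared_helper has two candidates, which is exactly the case where the build system has to be consulted.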

For the dynamic case, the level one dependencies are pretty clear-cut, but for the recursive case one can get a slightly different set of dependencies when drilling down into the system libraries, depending on how the system libraries themselves are built.
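
To make the level one case concrete: direct dynamic dependencies correspond to the NEEDED entries that readelf -d prints for an ELF file. The sketch below extracts them from such output. It illustrates the idea only and is not taken from readelf.py; the sample text is hand-written in the layout GNU readelf uses, and needed_libraries is a made-up name.

```python
import re

# Matches lines such as:
#  0x00000001 (NEEDED)   Shared library: [libc.so.6]
NEEDED = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[(.+?)\]")


def needed_libraries(readelf_output):
    """Return the SONAMEs listed as NEEDED in `readelf -d` output."""
    return NEEDED.findall(readelf_output)


SAMPLE = """\
Dynamic section at offset 0x1df8 contains 24 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libdl.so.2]
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x0000000c (INIT)                       0x8048d5c
"""

libs = needed_libraries(SAMPLE)
```

For the sample, libs is ["libdl.so.2", "libc.so.6"]. Repeating the same step on each resolved library gives the recursive case, which is where the results start to depend on how the system libraries were built.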

Authors:
Stew Benedict <stewb@linuxfoundation.org>
Jeff Licquia <licquia@linuxfoundation.org>
Changelog:
0.0.4:
    Change name to Dependency Checker Tool
    Drop Home, About tabs
    Drop copyright notice
    Add AUTHORS, Changelog
    Add bzr, bugzilla, mail list info to README.html
    Fix date/time display in detail, results
    Fix directory recursion
    New License file
    Add contribution requirements
    Rework dep recursion in command line app
    Fix test.html so it displays ok on most browsers
    Workaround weird display issue for results,detail in firefox 3.6.3
    Fix detail page printing, suppress unwanted elements with css
    Add new logo
    Detect/use local timezone settings
    Add directory/file browsing to test input form
0.0.5:
    Add license/license policy tabs
    Rework documentation
    Add dep-server-stop.sh to stop running server instances
    Add target/library license binding tabs and tables
    More GUI tweaks for more tabs
    Insert file/library bindings into test data after run
    Add policy violation flagging to test results
    Expose --no-static in the GUI
    Add license<-->alias mapping/reporting capability (bug 459)
    Add approved/disapproved policy highlighting (bug 475)
    Pre-load with license/alias data (bug 478)
    Enable data collection from the command line (bug 472)
Contributing:
Any contribution submitted for inclusion in the Dependency Checker Tool must 
be signed by its author following the Developer's Certificate of Origin 1.1.

By making a contribution to this project, I certify that:
a) The contribution was created in whole or in part by me and I have the right 
   to submit it under the open source license indicated in the file; or
b) The contribution is based upon previous work that, to the best of my 
   knowledge, is covered under an appropriate open source license and I have 
   the right under that license to submit that work with modifications, 
   whether created in whole or in part by me, under the same open source 
   license (unless I am permitted to submit under a different license), as 
   indicated in the file; or
c) The contribution was provided directly to me by some other person who 
   certified (a), (b) or (c) and I have not modified it.
d) I understand and agree that this project and the contribution are public 
   and that a record of the contribution (including all personal information 
   I submit with it, including my sign-off) is maintained indefinitely and 
   may be redistributed consistent with this project or the open source 
   license(s) involved.

Patches to the mailing list need to be signed as:

	Signed-off-by: <author name> <author email address>

Same thing applies for reviewers: 

	Signed-off-by: <reviewer name> <reviewer email address>

and committer:

	Signed-off-by: <committer name> <committer email address>

License:
Copyright (c) 2010 Linux Foundation

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.