How to track and secure open source in your enterprise

Your developers are using open source, even if you don't know about it. Here's how to take control, and why you need to.


Recently, SAS issued a plaintive call for enterprises to cap their use of open source projects at a somewhat arbitrary percentage. That looks like an obvious attempt to protest the rise of the open source R programming language for data science and analysis, a market where SAS has been dominant. But there is a good point hidden in the bluster: using open source responsibly means knowing what you’re using so you can track and maintain it.

Most enterprises aren’t aware of how much open source their developers use and what vulnerabilities that might expose them to. You can’t do security assessments or patch management on open source projects you don’t know you’re relying on.

Sonatype’s 2016 software supply chain study found that third-party components comprise eighty to ninety percent of the code in a typical enterprise Java application — and one in sixteen of those components that enterprises download has a security vulnerability. Older components have three times as many security flaws as newer versions, and over half of the components used in enterprise apps are over two years old. Two years after the Heartbleed bug was found, more than half of the OpenSSL versions Cisco Security Research tested in 2015 were still vulnerable.

In 2014, Veracode found that open source and third-party components used in enterprise web applications introduced an average of 24 known vulnerabilities into each of the 5,000 applications it scanned.

“Even software companies that already know they’re using open source code may need tools to manage that better. Enterprises rarely know just how much open source they’re using,” Rami Sass, CEO and co-founder of open source monitoring and management service WhiteSource, tells CIO.com. “Enterprises like banks, financial services companies, media companies have large software engineering departments these days. They’re often surprised to find out how extensive their use of open source is and how little of that their manual inventory processes have been tracking. On average, they find three times the number of components they thought they had. Sometimes it’s as high as 10 times.”

This isn’t to say you should stop your developers from taking advantage of open source, especially if you’re moving towards DevOps. There are many useful tools available in areas where writing your own code does nothing to differentiate you from your competitors. “Using open source in business makes sense, because you want your developers to stay focused on your core business,” Sass says. “So much of what you need has already been invented; you want to reuse something that’s already been tested and that’s maintained by the community, so that you don’t have to do all the heavy lifting. That’s why everyone loves open source, but unfortunately, open source has its own issues.”

Open source license liabilities

In the past, businesses were often most concerned with the licensing side of open source. “Open source is free but it does come with a lot of strings attached,” Sass points out. Open source licensing can be a minefield for commercial organizations. Although an increasing number of projects use permissive licenses like the MIT and Apache licenses, which have minimal requirements about how the code can be redistributed, other licenses have more onerous requirements. Google’s recent guidance on how it uses open source includes notes on which licenses, like AGPL, are banned internally because of the requirements to publish the code of derivative works.

Even software projects that claim to be public domain or “free for any use” need to be considered carefully, as it’s a non-trivial matter to place software in the public domain. If you’re a commercial business, you need to avoid software that’s free only for noncommercial use, a category that includes anything released under the noncommercial variants of the Creative Commons licenses.

That doesn’t mean you must avoid open source, but you need to understand the ramifications of the licenses you’re accepting by using an open source project. The interconnected nature of open source projects can make that more complicated, as many people using the npm package manager found out when, after a dispute over package names, a developer unpublished a number of packages that thousands of other projects depended on.

“One open source component can have dependencies on many other open source components. Whenever a developer takes on an open source component, they’re bringing in the whole tree of dependencies behind it and often you don’t have any visibility of that. You need to see what your inventory of open source components is, but most organizations are not on top of that,” says Sass.
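The point about dependency trees can be illustrated with a minimal sketch. It assumes you already have a map from each component to its direct dependencies (in practice, that would come from a package manager's lockfile or a composition analysis tool). The component names and the `DIRECT_DEPS` map below are invented for illustration.

```python
from collections import deque

# Hypothetical dependency map: each component lists its direct dependencies.
# All names are invented; a real inventory would be generated by tooling.
DIRECT_DEPS = {
    "my-app": ["web-framework", "json-parser"],
    "web-framework": ["http-client", "template-engine"],
    "http-client": ["tls-lib"],
    "template-engine": ["json-parser"],
    "json-parser": [],
    "tls-lib": [],
}


def transitive_dependencies(root, direct_deps):
    """Return every component reachable from root, excluding root itself."""
    seen = set()
    queue = deque(direct_deps.get(root, []))
    while queue:
        component = queue.popleft()
        # Skip components already visited (and the root, in case of a cycle).
        if component in seen or component == root:
            continue
        seen.add(component)
        queue.extend(direct_deps.get(component, []))
    return seen


inventory = transitive_dependencies("my-app", DIRECT_DEPS)
print(f"direct: {len(DIRECT_DEPS['my-app'])}, total: {len(inventory)}")
print(sorted(inventory))
```

In this toy example the application declares two dependencies but ends up relying on five components, which is the kind of gap between expected and actual inventory that Sass describes.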

Take the so-called “copyleft” licenses, like GPL, which generally require you to make the source code available, including any modifications you’ve made, when you distribute software built on their code. “The average enterprise will be using some open source components with a GPL license,” Sass says. “Out of 300 components, maybe one or two or three [will be GPL]. That’s almost always news to them.”

As well as knowing what open source you’re using, you need to track the open source projects to which your developers might be contributing code. One way to do that is with GitHub Business. Although most organizations think of GitHub Business as a cloud service that saves them the trouble of running GitHub Enterprise on their own servers, it also gives you control of the identities under which developers in your organization consume and contribute to GitHub repositories.

“Our customers want more of a direct connection to the developers and projects, the community, which is what GitHub is all about. So much of that open source code is valuable to our customers. They benefit from it, and they want to contribute to it. They want access to the wide range of platform tools that our partners contribute,” Connor Sears, senior director of Product Design at GitHub, tells CIO.com.

GitHub Business integrates with your existing identity management tools, whether that’s Azure Active Directory, Okta or other SAML and SCIM-compatible identity systems like OneLogin and Shibboleth. That means if developers from your business download open source code, contribute back to the project or fork it for an internal project, they’re doing it from official company accounts you will continue to control even if they leave, not from their personal GitHub logins. 

Open source security

The other key issue with using open source code is making sure you update it when security problems are found. “When a developer takes a vulnerable open source component and incorporates it in your software, you are vulnerable and you make your customers vulnerable,” Sass says.

The real issue, however, is not that there are vulnerabilities, because there will always be vulnerabilities, but that they go unpatched. “You will always find libraries that are out of date, you will always find vulnerable components, you will almost always find licenses that an enterprise didn’t intend to use.”

“The good thing about open source is that the issues are rather easy to fix once you know about them. Usually it doesn’t take a huge effort to go out and update the components, although sometimes there are compatibility issues. Usually, someone in the open source community has already gone to the effort of fixing the problem.”

Applying those fixes systematically means tracking and managing the open source you use like any other part of your supply chain, and doing that manually is inefficient. Software composition analysis tools like WhiteSource, Black Duck, Palamida (recently acquired by Flexera), Sonatype Nexus, Synopsys or Veracode help you automate that.

WhiteSource, for example, has plugins for popular source management tools and services like Visual Studio Team Services and Jenkins, and it is being built into Visual Studio 2017. That lets it automatically collect details of the open source components your developers are using and produce reports showing what security vulnerabilities have been found and what you need to do to mitigate them. You can also get reports on what licenses those components use, and even set policies that act on either licensing issues or security vulnerabilities.

“You can have both a blacklist and a whitelist of licenses; customers often have a blacklist for licenses like GPL and a whitelist for permissive licenses like MIT,” says Sass. “You can also have a policy around security vulnerabilities. If a developer introduces a new open source library with a known vulnerability, we can block it. We can also proactively send a push notification about a newly discovered vulnerability in a library you're using; we don’t wait for you to run a report.”
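As a rough illustration of the blacklist and whitelist idea (not WhiteSource's actual implementation, which the article does not describe), a license policy can be reduced to two sets and a lookup. The sketch below uses SPDX-style license identifiers; which licenses belong on which list is an assumption made for the example, and the component names are invented. Anything on neither list is flagged for manual review, which is also an assumption.

```python
# Hypothetical policy lists using SPDX-style license identifiers.
# Which licenses go where is a legal decision for your own organization.
DENIED_LICENSES = {"GPL-3.0-only", "AGPL-3.0-only"}
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}


def check_license(license_id):
    """Classify a license as 'deny', 'allow', or 'review' (on neither list)."""
    if license_id in DENIED_LICENSES:
        return "deny"
    if license_id in ALLOWED_LICENSES:
        return "allow"
    return "review"


# Invented component inventory for illustration.
components = {
    "json-parser": "MIT",
    "tls-lib": "Apache-2.0",
    "report-engine": "GPL-3.0-only",
    "image-codec": "Zlib",
}

for name, license_id in sorted(components.items()):
    print(f"{name}: {license_id} -> {check_license(license_id)}")
```

Checking the deny list first means a license that is accidentally placed on both lists is still blocked, which is the safer failure mode.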

That policy can tie directly into your existing developer workflow, via the build server or through integration with systems like Git, GitHub, JIRA and Artifactory. “If you’re doing continuous integration and a developer introduces a new library that uses a license like GPL or that has a high impact security vulnerability, the policy will take effect; the build will fail and the developer will get a notification about the problem library. When someone tries to add a new artifact to a repo, you can check the policy and block it then. If you add a library that has a medium or low impact security vulnerability, we can route the notification to the security officer, so there’s more than one person involved in the decision-making process.”
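The routing Sass describes can be sketched as a small decision function that a build step might run over newly added components. This is a simplified illustration under stated assumptions, not a real product's API: the `Component` type, the severity labels, the action names and the denied license set are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    name: str
    license_id: str
    worst_severity: Optional[str]  # "high", "medium", "low", or None


DENIED_LICENSES = {"GPL-3.0-only"}  # hypothetical policy


def evaluate(component):
    """Return (action, reason): 'fail_build', 'notify_security', or 'pass'."""
    if component.license_id in DENIED_LICENSES:
        return "fail_build", f"license {component.license_id} is denied"
    if component.worst_severity == "high":
        return "fail_build", "high-impact vulnerability"
    if component.worst_severity in ("medium", "low"):
        return "notify_security", f"{component.worst_severity}-impact vulnerability"
    return "pass", "no policy violations"


def gate(components):
    """Evaluate all components; return 1 if any should fail the build, else 0."""
    exit_code = 0
    for c in components:
        action, reason = evaluate(c)
        print(f"{c.name}: {action} ({reason})")
        if action == "fail_build":
            exit_code = 1
    return exit_code


# Invented examples of newly introduced libraries.
new_components = [
    Component("json-parser", "MIT", None),
    Component("tls-lib", "Apache-2.0", "medium"),
    Component("report-engine", "GPL-3.0-only", None),
]
exit_code = gate(new_components)
```

A continuous integration job could pass the returned value to `sys.exit()` so that a non-zero result fails the build, while `notify_security` results are forwarded to whoever reviews lower-impact findings.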

Think of that as shifting governance left so it comes earlier in the development process, as well as automating some of the work involved in compliance. “We can find a problem as early as possible; we can block a component from being able to enter your environment.”

WhiteSource also has a browser plugin designed to help developers make better choices about what open source components to adopt. When developers visit a web page that mentions open source components, the plugin matches them to WhiteSource’s database and shows a popup with details of the license, any known vulnerabilities and your company policy. “We test it against your organization’s policies and tell the developer if it’s going to get approved. We also list any places the component is already being used in your organization; you can see if there’s another team using the same version, or a different version of the component you’re looking at,” Sass says.

If you want to comply with regulated financial standards like PCI-DSS or FS-ISAC guidance, you need to have a policy governing your use of open source and third-party components. There aren’t yet standard policies for handling open source that have been widely adopted by enterprises in other industries, but Sass believes that’s coming, especially as the U.S. government has started producing such policies for itself. “The government is going into the field of open source. They’re now required to release some percentage of the code they write as open source, and they’re going to want to regulate that with some sort of policy. Once they do that, large organizations will follow their lead,” he suggests.

CIOs can’t afford to turn a blind eye to the number of open source components enterprise developers are using. Instead, you need to start tracking and managing them to make sure you’re on top of the issues around licensing and security. “The benefits considerably outweigh the faults,” Sass says. “You just need to manage this, so you're able to use open source and have it boost your productivity and not be worried about it.”

And as for the plea SAS made to simply limit the amount of open source used by your development team? “That ship has sailed.”

This story, "How to track and secure open source in your enterprise," was originally published by CIO.

Copyright © 2017 IDG Communications, Inc.