Category Archives: Data Privacy

Product Composition Risk Management

When I first heard the term Software Composition Analysis (SCA), I was excited to hear of a new vision for what was thought of as only an open source discovery tool. I knew the vendors in this new SCA space were thinking more deeply about the problems faced by product owners than just generating a bill of materials which detailed the open source code used by and distributed with their proprietary code.

However, after thinking about the broad spectrum of what SCA vendor are actually doing, I came to realize that the only word in that market categorization which is fully applicable is: composition. Both the words software and analysis are far too narrow for the work being done by SCA vendors.

Software, Firmware, and Webware

Even while they have been benefitting by this market category, SCA vendors have been processing not only their customers’ desktop and server software, but also their mobile application software, device firmware, and webware written with open web APIs.  Just being positioned as servicing “software” limits the perception of the wide variety of intellectual property delivery and deployment models SCA vendors process daily.

Risk Detection, Assessment, and Mitigation

Merriam-Webster defines analysis to be a “separation of a whole into its component parts”. Not only is this redundant with the word composition, but SCA vendors have gone beyond simply identifying open source components.

SCA users have consistently received more than a bill of open source materials. They have achieved well-defined business outcomes that have resulted in minimized risk around the security, data privacy, operations, license compliance, and terms of use compliance.

Product Composition Risk Management

Therefore, to represent the actual scope of benefits provided by SCA vendors, the category “Product Composition Risk Management” is more appropriate.

A modern digital product is composed of one’s own proprietary code, code from commercial and non-commercials providers, and web service providers. The word product is not limited to software, firmware, mobile, or web development; it encompasses all modes of digital product composition which use all types of intellectual property.

There is risk in composing one’s product only from one’s own proprietary code, which is why that code is measured against multiple non-functional requirements. However, composing one’s product from intellectual property owned by others creates an inherent risk that is much greater. You don’t know the care with which that IP was created and don’t know the resources available to maintain it.

SCA vendors not only identify open source risk, they assess the risk, and provide mitigation alternatives for their customers.

So, while the SCA market categorization served its purpose for a few years, it is time to acknowledge the greater benefits that SCA vendors bring to a customer’s entire supply chain.

Data Privacy Requires Data Security, Just Ask Equifax

The following post was originally published here by Black Duck Software…

The EU’s General Data Protection Regulation (GDPR) will be enforced starting May 25, 2018. One of its goals is to better align data privacy with data security, as depicted in this simple Venn diagram:

That is, you can have data security without data privacy, but you can’t have data privacy without data security.

Equifax painfully has come to this same conclusion, and well before the May 25, 2018 date.

A Little History on Data Privacy Principles

Many years ago, Equifax could have successfully argued that they have complied with data privacy requirements because they have not sold consumers data without those consumers’ permission. That was how low the bar was set when data privacy first became an issue.

Even as long ago as 1995, one of the data privacy principles in Directive 95/46/EC required appropriate security controls when handling private data. However, data privacy had focused only on issues of consumer consent and intentional disclosure of private data; that is, until Equifax clarified for uslast week that that is not enough.

Behind the Equifax Breach: A Deep Dive Into Apache Struts CVE-2017-5638

GDPR: New Requirements for Security Controls

Just like with Directive 95/46/EC, one of the data privacy principles of the GPDR requires similar security controls, but the important requirement that GDPR adds is that companies must provide evidence of those security controls.

Certainly, GDPR regulators will want to see evidence of security controls, but even companies that are not directly targets of regulators will be required to produce such evidence to their customers if any company downstream in their supply chain perceives themselves to be a target of regulators. Evidence of security controls will be a condition of doing business.

The Equifax breach makes clear in a visceral way what the GDPR will make clear through regulations: the consequences to the private individual are just as damaging, if not more, when their private data is breached compared to when it is sold to an unauthorized party, ask the 140 million individuals in Equifax’s database.

David-Znidarsic-Corporate-Photo-200x300.jpg

David Znidarsic is the founder and president of Stairstep Consulting, where he provides intellectual property consultation services ranging from IP forensics, M&A diligence, information security management, open source usage management, and license management. Learn more about David and Stairstep Consulting at www.stairstepconsulting.com

Compliant? Sure, But With What?

The following post was originally published here by Black Duck Software…

The term compliance is used more and more in business. Some job titles even include the term: VP of Compliance, Compliance Officer, Compliance Manager. Usually these roles have focused on the legal and operational requirements imposed by external groups like licensors and regulatory agencies.

While abiding by such external requirements is the cost of you doing business, you give up control of your business or product development by only following the requirements of others and not establishing your own policies and complying with them.

Limited Scope

Let’s look at how the term “compliance” has been used to limit the scope of open source governance

Open source compliance has been narrowly interpreted to mean that one must abide by the open source author’s license terms. Indeed, that will always be a requirement, but consider that an open source author’s work is replacing the work of one of your own software engineers.

If the only hurdle to cross before using open source is to be compliant with the author’s license terms, that is like saying you fully trust all the code developed by one of your software engineers if and only if your management meets its legal requirements during the hiring and employment of that engineer!

A Question of Trust?

While that seems preposterous, in practice, you probably impose many more requirements on the work product of your own engineers than on the work product of open source authors. Is it your intention to trust open source authors more than your own employees? The assumptions you might be making are:

(a) every open source project is staffed by many more development, testing, and maintenance engineers than your company can deploy to solve the same problem, and

(b) those engineers know and have fixed all security vulnerabilities.

However, www.openhub.com shows that might be true for some open source projects, but not all. Therefore, unless your product teams perform the appropriate due diligence, they won’t know whether their assumptions are valid.

Open source management best practices require organizations to know the open source in their code in order to reduce risks, tighten policies, and monitor and audit for compliance and policy violations. Automating identification of all open source in use allows development and license teams to quickly gain visibility into any known open source security vulnerabilities as well as compliance issues, define and enforce open source use and risk policies, and continuously monitor for newly disclosed vulnerabilities.

David Znidarsic is the founder and president of Stairstep Consulting, where he provides intellectual property consultation services ranging from IP forensics, M&A diligence, information security management, open source usage management, and license management. Learn more about David and Stairstep Consulting at www.stairstepconsulting.com

Best Technology Stack Transcends Language

In Entrepreneur.com, Rahul Varshneya observes that a technology stack is often chosen by your same software or firmware developer who will be responsible for writing code in that stack’s programming language.

Who would be brave or foolish enough to recommend themselves out of a job by choosing a stack which requires expertise in a language they do not understand? Mr. Varshneya warns you to use an evaluator unbiased towards programming language.

This is because the programming language should only be one of the criteria when choosing a technology stack. However, even if an unbiased evaluator chooses a stack that meets the current and future technical needs of your company and uses the correct programming language, they can still make a wrong choice if the technology stack supplier is not right for your company.

Often evaluators choose a technology stack containing non-commercial software components that have been developed by open source authors. The additional challenge is to choose these open source “suppliers” based on your non-functional requirements.

Does your evaluator consider the security vulnerabilities that have been disclosed for each component of the stack they choose? Do they know if anyone is working on that open source component? Even if enough people are working on the open source component, how active are they? Are they making fixes, making scalability improvements, and plugging security and data privacy holes that you would expect from your own developers, or are they only adding fun-to-develop features?

Make sure you and your evaluator choose your open source technology stack suppliers based on all the same criteria you would apply if you were to hire an employee or outsourcer to develop those components for you.

We are all now in a Regulated Industry

For many years, a small minority of companies were considered to be in a regulated industry: medical, financial, automotive, etc. Those of us not in one of those industries looked at those companies from afar with envy and pity: how are they able to produce what they produce under the weight of those regulations?

Starting May 25, 2018, we will all be in a regulated industry. Those companies who do business in the EU and UK (and thus process data identifying their citizens) will be required to comply with the General Data Protection Regulation.

The data privacy principles espoused by the GDPR are not much different than those in the Directive 95/46/EC from 1995. However, the EU has concluded that nicely asking companies for 22 years to abide by those directives has not achieved the data privacy they require for their citizens. Therefore, creating the GDPR has given teeth to regulators in the EU and UK to enforce their data privacy principles and thus brings us all into a regulated industry.

Web APIs are the New Open Source Software

If you are relaxing because you have your open source usage under control, beware. There is another increasingly common type of ungoverned third-party code that your engineers are using in your products: Web APIs.

There are many Web APIs published that, like open source software, are free of cost, readily available, provide great value, but are not free of obligations or risks. For example, https://www.programmableweb.com/api/keystroke-resolver is a Web API for mapping keystrokes from one type of keyboard to another. Perhaps useful, but what is this open source service doing with those keystrokes? Retaining them (if so, in what country)? Selling them? Marketing to your customers based on them?

Sometimes Web APIs are available to you as part of your license for a commercial software product or service. For example, you can build your own web applications using DocuSign’s published Web APIs. Use of those APIs is covered by your DocuSign license and access to them is only available to holders of an API key issued by DocuSign to paid licensees. However, even these commercial Web APIs have pitfalls for the products and services that use them.

Mistaken Assumptions About Web APIsNon-Commercial Web APIsCommercial Web APIs
API terms of use will remain sameMaybe NotProbably
API implementation will remain sameNoNo
API interface will remain sameMaybe NotProbably
API will process data locallyNoNo
API will be hosted in same legal jurisdictionMaybe NotMaybe Not
API will be available 100% of timeNoNo
API has an SLA
NoMaybe Not

The Web API author’s ability to instantaneously change it is good if they fix bugs and security vulnerabilities. But it is bad if they just as instantaneously introduce new bugs and vulnerabilities, and bad if they change the functionality or interface to break your application. You have no control over whether or not you use those daily changes because you’re always using their current implementation.

Even if the Web API uses strong encryption for data in transit between your application and their server, the fact that some of this data might be personally identifiable information means not only will it be sent over a public network, but it may even be sent to another country.

Here is an example of a Web API. The current weather at a particular latitude and longitude can be found using the following URL (visit it yourself to see the results):

https://api.weatherbit.io/v2.0/current?lat=48.8583701&lon=2.2922873&key=876daf42ac7f4488956caf9011a83212

If I were a French citizen and visiting a web page that uses the weatherbit.io Web API to find out the weather at my current location, my latitude and longitude would be sent to their server in New Jersey, USA. Certainly, a data privacy concern.

To take it a step further, what Web APIs hosted by yet other parties might weatherbit.io be calling to map the latitude and longitude to my time zone? to my city? to my state? to my country?

This is another example of the newest technology being adopted by organizations before management knows about it or can govern it. This is what happened with Shadow IT. Then Shadow Engineering emerged when software developers started using open source without permission from their management or procurement departments. Now, shadow web development via Web APIs is an increasingly common way for programmers to efficiently build web applications. Today, building web applications is a composition of proprietary code, outsourced code, open source code, and open source online services accessed via Web APIs. You must understand and manage the provenance of each of these components.

Assume Every Application is an On-Premises Application

We feel the need to label applications as either on-premises or cloud.

We try to assure ourselves that an application categorized as on-premises will not send or receive data over a public network, and an application categorized as cloud will not install client resources.

However, the reality is that most applications categorized as cloud require resources to be installed on the client, and sometimes install those resources silently.

This is usually because browsers and HTML aren’t powerful enough to drive the complexity required by those applications.

Therefore, applications categorized as cloud sometimes require native browser plugins, agents, or beacons. Sometimes they require native applications that supplement the browser client, like update utilities, upload utilities, etc. Sometimes the only client is a native application, like is the case with mobile apps.

Installing any of these requires explicit action on the part of IT or user, but are often overlooked as requirements because the application is categorized as “cloud”.

Cookies, web storage, and JavaScript are examples of client side resources installed without explicit IT or user action. Web storage is becoming more prevalent and harder to manage. It started with local shared objects (aka Flash cookies) and it continues to expand via standards like IndexedDB and proprietary client-side storage methods used by Internet service providers.

So if prevention or knowledge of an application’s required client-side installations is important to you, you need to do a technical analysis of what is and what is not installed; don’t rely on marketing materials and naïve categorizations. In the absence of such an analysis, assume every application you use requires some type of client-side installation or client-side storage.

Assume Every Application is a Cloud Application

We feel the need to label applications as either on-premises or cloud.

We try to assure ourselves that an application categorized as on-premises will not send or receive data over a public network, and an application categorized as cloud will not install client resources.

However, the reality is that most applications categorized as on-premises send data to and receive data from the Internet.

This is usually because most applications rely on highly dynamic content that must be installed and then frequently updated on the client device or computer.

Certainly most mobile applications are just thick native clients that access one or more on-line services. Just look at the apps on your phone and tablet and guess which features, if any, of each of those apps will work if you don’t have a data connection.

Desktop and server applications also often need cloud services to function: zip code to city lookups pass your location to an Internet service, desktop publishing templates, clip art, and help system content are now all accessed remotely, and some applications even “outsource” complex computations to cloud services, sending your data outside your organization.

So if prevention or knowledge of an application’s online access is important to you, you need to do a technical analysis of what is and what is not accessed; don’t rely on marketing materials and naïve categorizations. In the absence of such an analysis, assume every application you use is sending data to and receiving data from the Internet.

PII and Business Confidential Information (BCI)

Not all Business Confidential Information (BCI) supplied to you is PII.

Many customers blur the distinction between BCI and PII, in belief (or hope) that their BCI should be protected by the same laws that protect PII. However, data privacy laws only protect the identity of people.

BCI is protected, but by business-to-business agreements like NDAs, license agreements, partner agreements, etc.

So when handling information, make sure you know what information is governed by data privacy laws, business-to-business agreements, or both.

Practical guide to data privacy

The laws defining Personally Identifiable Information, and how PII can be controlled and processed are different in different countries, states, and provinces… and unfortunately, there is not much legal precedence for how these laws are to be interpreted.

When handling PII originating from a specific jurisdiction, the laws of that jurisdiction and the advice of legal counsel take precedence over what is written here, but since you continually handle PII and there is no single clear set of laws, here is a practical guide for how you should define, control, and process PII.

So what is PII?

Regular PII includes contact information, sensitive PII associates that contact information with attributes like medical condition, financial account information, sexual orientation, religious or political views, etc.

Responsibilities for data privacy are not because of business relationships. Even if one of your customer’s purchasing department or legal department doesn’t show concern over how you handle the PII of their employees, laws require you to accept the same data privacy responsibilities for their employees as companies who show greater concern. This is the case even if the customer says otherwise; for example, if a customer says it is OK to use their employees’ PII in a product demo, data privacy laws say it is NOT OK unless it is done legally.

So how can you legally process PII?

If you follow all these requirements, you will satisfy many jurisdictional requirements:

  • You must state to the individual:
    • what PII you intend to collect about the them
    • why your are collecting their PII
    • what you will do with their PII (this is also known as the stated purpose)
  • Before collecting the PII, you must receive explicit consent from the individual to do so; that is, you must assume the individual will not allow the collection, and only collect if they allow it; not the other way around
  • Once collected, you and all of your affiliates must process the PII only for the stated purpose
  • When the stated purpose has been completed, you and your affiliates must delete the PII
  • Upon request from the individual, you and your affiliates must correct or delete the PII
  • You must encrypt PII in transit (that is, when it is transferred)
  • If the PII is sensitive, you must also encrypt it at rest (that is, when it is stored)

Some jurisdictions like the EU, Canada, and the state of Massachusetts are particularly concerned about the PII of their citizens, but even more concerned when that PII is transferred outside of their jurisdiction. You usually have to provide citizens from these jurisdictions additional assurances that you will keep their information private after it has crossed their jurisdictional boundary.