Surveillance technologies and patron privacy: what can libraries do?

Commercial publisher practices of employing tracking technologies to collect and sell user data have been fairly widely addressed (see “tracking tools” post), and Emily Cukier recently summarized the issues for libraries in “What the Vendor Saw: Digital Surveillance in Libraries.” Commercial publishers such as Thomson Reuters, the RELX Group, Clarivate and Wiley are incentivized to make money, and they have expanded their revenue sources from the published content itself (subscription fees, article processing charges) to user data, which they monetize in different ways.

Aside from the financial implications of extracting more revenue from libraries and their users, libraries’ reliance on these publisher platforms to deliver content conflicts with a fundamental tenet of the American Library Association’s Code of Ethics:

We protect each library user’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.

ALA Code of Ethics, #3

Code that tracks both a specific item of content and its user has potential and real chilling effects on intellectual freedom. Aggregated data that informs policy and practices can also “bake in” existing biases and inequities that further disadvantage marginalized communities.

So what can libraries do to protect patron privacy? A first step is to ensure that providers have clear, accessible and easily findable privacy policies. Another is to draw attention to these policies and their implications. Libraries should also make provider policies and practices a part of their contracts. Cukier cites ALA’s privacy best practice guides, including one on vendors and privacy that offers checklists for what should (e.g. security standards, disclosure to outside parties, how data is encrypted and stored) and should not (e.g. vagueness, lack of definition, reserved rights to monitor users) be in contracts. The Library Freedom Project also offers privacy resources, including a Vendor Privacy Scorecard and Privacy Audit Worksheet.

Finally, Cukier references an interview with Felix Reda from the Society for Civil Rights and the four aspects he says are important for contracts with external service providers:

  • Put contracts out to bid so that different companies have to compete
  • Avoid “lock-in effects” such as proprietary platforms that leave libraries permanently dependent on a specific provider
  • Ensure that licenses allow unlimited further use on any platform, for any purpose
  • Prohibit search tracking at the level of individual researchers and run software in-house wherever possible

Librarians and researchers will recognize these publisher practices. Ultimately, Reda says, “Universities and libraries should preferably completely avoid these contracts and invest the money in their own infrastructure.” He advocates for open access and open science built on publicly-aligned infrastructure.

Keeping the researcher at the center of data control and quality: a review of the ORCID Trust Program

ORCID established its Trust Program in 2016, and this blog post celebrates its fifth anniversary. The ORCID organization, of which UMass Amherst is a member, has a mission “of enabling transparent and trustworthy connections between researchers, their contributions, and their affiliations” and “a vision of a world where all who participate in research, scholarship, and innovation are uniquely identified and connected to their contributions across disciplines, borders, and time.” These aspirations are made real through trust built on individual researcher control, accountability and strict tracking of organizational provenance. Researchers/scholars/contributors control who can read, write to and edit the data associated with their ORCID profile and for how long, and the source organizations are verified.

With ORCID’s growth have come attempts to misuse the connections and tools it provides. These include automated search engine optimization and spam generators that could potentially undermine trust. ORCID has put in place brakes that halt these schemes. Another, less common problem is academic fraud, in which people misrepresent their works and affiliations. This violates the terms of use, and such records can be challenged through the dispute procedures. ORCID is not an arbiter of what data is associated with a contributor profile, but it does provide authenticated workflows with registered data providers. Researchers can determine for themselves the authenticity of the data and the provenance of the data provider before deciding whether or not to grant permission for data exchange.

ORCID is a non-profit, member-governed organization that provides an open platform for disambiguated, unique and persistent researcher/scholar/contributor identifiers and profile information. An ORCID iD is free for individuals. More information about ORCID is available from this guide.

Tracking tools, data collection and surveillance in the library platforms we use

I attended an excellent SPARC OpenCon Librarian Community call on August 10th about publishers, including Elsevier, Thomson Reuters and Clarivate, that have shifted their revenue streams over the past 30 years from selling published content to selling data products built upon proprietary systems and surveillance technologies. “Data tracking in research: aggregation and use or sale of usage data by academic publishers,” a white paper from the DFG (German Research Foundation), describes the current situation: data is tracked, collected and analyzed to develop new businesses that sell data about knowledge and to develop new services on existing platforms. However, data profiles about individuals and institutions could be created as a result, potentially threatening data privacy and independent governance. Tracking tools are ubiquitous. The paper also details three types of data mining: third-party data through micro-targeting, bidstream data and port tracking, and publisher spyware. Data is aggregated from different sources (think ScienceDirect, Twitter, Google and Facebook) to create user profiles. There are many reasons to be alarmed by this, including the inaccuracy of the data assigned to people and the chilling effect on academic freedom.