About CatVis

CatVis is the informal, easy-to-remember name of the project Visual Analytics for the World’s Library Data, funded by the Netherlands Organisation for Scientific Research (NWO) in partnership with OCLC. OCLC and its member libraries cooperatively produce and maintain WorldCat, the largest online public access catalog (OPAC) in the world. OCLC hosts an extensive project description.

Project Objective

We aim to develop a cutting-edge visual analytics toolkit that brings the power of visualization to library data providers and the full range of its users, with particular attention to researchers in the humanities. Our private partner OCLC develops and maintains WorldCat, the world’s largest library metadata aggregation, which contains more than 321 million library records. In collaboration with OCLC’s research scientists, we plan to develop and test visualizations based on the latest library data modeling studies, including the hierarchical framework of FRBR (Functional Requirements for Bibliographic Records; IFLA 1998) and OCLC’s Schema.org linked data implementation, to answer both the pressing needs of humanities researchers and the concrete demands of the library industry. We will develop flexible, insightful, cost-effective, and user-friendly visualizations for all steps of the data pipeline:

  1. Data cleaning, clustering, and enrichment. Supports the internal workflow of OCLC. Requires, for example, efficient algorithms for set and network visualization and new visual metaphors for incomplete or uncertain data sets.
  2. Data analysis. For example, a work's editorial history at a glance (how many expressions, how are they related) or exploration of similar works. Supports the internal workflow of OCLC and text-based humanities researchers. Requires, for example, new visual metaphors for provenance visualization and the visualization of time-varying data.
  3. Intuitive and interactive representation of search results. Specifically, geographic representations such as interactive maps. Supports humanities researchers, librarians, and the general public. Requires, for example, fast hierarchical clustering and interaction techniques to scale to WorldCat's size.
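
To make the data analysis step more concrete, the following is a minimal sketch of the kind of grouping that could sit behind an "editorial history at a glance" view: flat records are nested along the FRBR levels (work, expression, manifestation) and the expressions per work are counted. The field names `work_id`, `expression_id`, and `manifestation_id`, the sample records, and both helper functions are assumptions made for illustration. They do not reflect the WorldCat schema or any OCLC interface.

```python
from collections import defaultdict

# Hypothetical flat records; field names and values are illustrative only
# and are not taken from WorldCat.
records = [
    {"work_id": "w1", "expression_id": "e1", "manifestation_id": "m1"},
    {"work_id": "w1", "expression_id": "e1", "manifestation_id": "m2"},
    {"work_id": "w1", "expression_id": "e2", "manifestation_id": "m3"},
    {"work_id": "w2", "expression_id": "e3", "manifestation_id": "m4"},
]


def group_by_frbr(records):
    """Nest records as work -> expression -> list of manifestation ids."""
    tree = defaultdict(lambda: defaultdict(list))
    for rec in records:
        tree[rec["work_id"]][rec["expression_id"]].append(rec["manifestation_id"])
    return tree


def expressions_per_work(tree):
    """Count the distinct expressions recorded under each work."""
    return {work: len(expressions) for work, expressions in tree.items()}


tree = group_by_frbr(records)
counts = expressions_per_work(tree)
print(counts)
```

A visualization layer could take the nested structure as input for a tree or timeline view. The sketch leaves out how expressions relate to one another, which would need relationship data that this example does not assume.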