About the Exhibit of Datasets

An exhibit with showcases

This Exhibit is a platform for introducing research datasets in the humanities and social sciences that have been deposited in a trusted repository and are thus accessible for the long term. Most of them are described in more detail in a corresponding data paper, published in the Research Data Journal for the Humanities and Social Sciences.

The showcase is a quick introduction to such a dataset, a bit longer than an abstract, with illustrations and other multimedia (if available). As a rule it also offers the option to get acquainted with the data itself, through an interactive online spreadsheet, a data sample or a link to the online database of a research project. Usually, access to these datasets requires several time-consuming actions, such as downloading data, installing the appropriate software and correctly loading the data into these programs. This makes it difficult for interested parties to quickly assess the possibilities for reuse in other projects.

The Exhibit aims to help visitors of the website to get the right information at a glance by:

  1. Attracting attention to (recently) acquired deposits: showing why data are interesting.
  2. Providing a concise overview of the dataset's scope and research background; more details are to be found, for example, in the associated data paper in the Research Data Journal (RDJ).
  3. Bringing together references to the location of the dataset and to more detailed information elsewhere, such as the project website of the data producers.
  4. Allowing visitors to explore (a sample of) the data without first downloading it and installing the associated software (see below).
  5. Publishing related multimedia content, such as videos, animated maps and slideshows, which are currently difficult to include in online journals such as RDJ.
  6. Making it easier to review the dataset. The Exhibit would also be the right place to publish these reviews, much as a webshop publishes consumer reviews of a product.

Exploring deposited data online

With the arrival of fast network connections, cloud computing is becoming increasingly feasible, including for research purposes. Google, Microsoft and Amazon have already invested large amounts of money in promoting this technological trend and also offer options for small-scale, non-business applications. Using this technology, we can share not only the data (as through data archives and project websites) but also their processing. Some examples:

1. Letting users play with data

One can quickly explore data stored in Excel once the spreadsheet has been published as a full web document. A link to such a spreadsheet somewhere in the cloud suffices to allow users to "play with the data". This may give a better impression of the dataset's potential than reading extensive documentation.

In the Exhibit we have realized this through Zoho. Zoho Corporation is an Indian software development company, founded in 1996. It focuses on web-based business tools and information technology solutions, including an office tools suite. Spreadsheets published through Zoho may be modified without changing the original document. The functionality of a Zoho spreadsheet is complete on a desktop computer but limited on mobile devices, although it is still much better than that of Google Docs. Unlike Microsoft's OneDrive, Zoho does not require a login.
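
The principle of modifying a published spreadsheet without changing the original document can be illustrated outside Zoho as well. The following is a minimal sketch in Python, not a description of how Zoho works: the sample values, the column names and the helper functions `load_rows` and `explore` are all hypothetical and serve only to show filtering and sorting on a working copy.

```python
import csv
import io

# Invented sample values for illustration only; a real showcase would
# use (a sample of) the deposited dataset instead.
SAMPLE_CSV = """id,place,year,population
1,A,1850,100
2,B,1850,50
3,A,1900,150
4,B,1900,90
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def explore(rows, year, sort_key):
    """Return a filtered, sorted copy; the rows passed in stay untouched."""
    selected = [dict(row) for row in rows if row["year"] == year]
    return sorted(selected, key=lambda row: int(row[sort_key]))


original = load_rows(SAMPLE_CSV)
view = explore(original, year="1900", sort_key="population")

for row in view:
    print(row["place"], row["population"])
```

Because `explore` builds new dictionaries, any later change to `view` leaves `original` as it was loaded, which mirrors the behaviour described above for the online spreadsheet.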

Explore Data: this button opens a spreadsheet online, in which the user may modify, sort and filter data; the changes made are not saved to the original document. If desired, the spreadsheet can be saved to a local computer. However, downloading the dataset from the repository is preferable, because the online demo does not necessarily comprise the entire dataset. The button is also used to link to the online query facilities of a project database, which serve a similar purpose.

2. Demonstrating data processing

Data structure and data processing are relatively simple in a spreadsheet. The next level is the demonstration of algorithms, for example using JSFiddle. In the context of RDJ, a JSFiddle code snippet can illustrate a crucial stage of a project's computing work.

Even more ambitious is exploratory computing as with the popular Jupyter Notebooks. The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code (e.g. Python), equations, visualizations and narrative text. Notebook documents are both human-readable and executable; they can be run to perform data analysis. However, installing a Jupyter Notebook on your own computer may be a tedious job and you will need to add extensive data resources. Fortunately, the notebook document can also be converted to HTML and published in that format (see example in the Exhibit).
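
To give an impression of what such a conversion involves, here is a simplified sketch in Python. In practice the conversion is normally done with Jupyter's own nbconvert tool; the code below is not that tool. It only assumes that a notebook file is a JSON document with a list of cells (the nbformat 4 layout), and it uses a tiny invented notebook and a hypothetical function `notebook_to_html`. Markdown cells are emitted as plain escaped text, not rendered.

```python
import html
import json

# A tiny invented notebook in the nbformat 4 JSON layout (illustration only).
NOTEBOOK_JSON = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["Counting records"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "source": ["print(1 + 2)"],
         "outputs": [
             {"output_type": "stream", "name": "stdout", "text": ["3\n"]}
         ]},
    ],
})


def as_text(value):
    """Notebook fields may hold a single string or a list of strings."""
    return value if isinstance(value, str) else "".join(value)


def notebook_to_html(notebook_json):
    """Build a static, non-interactive HTML page from notebook JSON."""
    notebook = json.loads(notebook_json)
    parts = ["<html><body>"]
    for cell in notebook.get("cells", []):
        source = html.escape(as_text(cell.get("source", "")))
        if cell.get("cell_type") == "code":
            parts.append(f'<pre class="code">{source}</pre>')
            # Only plain text (stream) outputs are handled in this sketch.
            for output in cell.get("outputs", []):
                if output.get("output_type") == "stream":
                    text = html.escape(as_text(output.get("text", "")))
                    parts.append(f'<pre class="output">{text}</pre>')
        else:
            parts.append(f"<p>{source}</p>")
    parts.append("</body></html>")
    return "\n".join(parts)


page = notebook_to_html(NOTEBOOK_JSON)
print(page)
```

The resulting page shows code and stored results side by side but cannot be executed, which is the trade-off of publishing a notebook in HTML form.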

Technical Details: this button opens a non-interactive demonstration of data processing, for example the HTML version of a Jupyter Notebook. It is also used for documentation of data structures (e.g. a database model).

Microsoft offers an interesting alternative in the form of Azure Notebooks, a 100% cloud implementation of Jupyter Notebooks that also allows data to be imported from the cloud (e.g. Google Drive, OneDrive). The Research Data Portal wants to incorporate these and other novel approaches, which will appear in the coming years, to demonstrate the full potential of deposited datasets in a user-friendly way. This implies a gradual transition from describing data processing to watching and experiencing it.

3. Data visualisation: examples

Data exploration will mostly be based on the text of the data paper. Text is the principal medium for explaining data structure and providing contextual information. Visual elements such as illustrations, tables and charts can make the text easier to understand. They attract and sustain the interest of readers, and efficiently present large amounts of complex information. Because the Exhibit is an online environment with options for executing program code, charts, maps, timelines, videos and audio recordings can also be made interactive. We have created a gallery of data visualisation demos to provide inspiration and to demonstrate what is available on our website.

4. Service for making data visualisations

We understand that most scholars have been trained to write for printed publications. The technical aspects of making interactive data visualisations can be a real burden for authors. Therefore, DANS (in cooperation with ScieMedia) offers a limited free service to produce multimedia content for data showcases. We only ask authors to cooperate in the delivery of suitable examples with associated data and images. The technical editor will be happy to answer your questions and help you make choices. Feel free to contact ScieMedia for this purpose directly: info@sciemedia.com.

The Exhibit of Datasets is an initiative of DANS
Data Archiving and Networked Services in the Netherlands