SOMAR Data Access Application Guide

Welcome! Thank you for your interest in applying for access to SOMAR’s data!

This guide will help you complete SOMAR's application forms and prepare the required documentation for upload. Please carefully review this guide, and if you haven’t already, we recommend viewing the publicly available resources listed in the sections below for specific datasets you are interested in accessing.

When ready, please return to the SOMAR Application Portal and select the dataset you wish to access to begin.

IMPORTANT NOTE: Please complete the application form in one sitting. SOMAR is working to enhance the forms so progress can be saved prior to submission.

This guide is a living document and will be updated as needed for easier navigation and more clarity in the information provided.

Feel free to reach out to somar-help@umich.edu if you have any questions about the applications!



SOMAR Data Access Methods and Requirements

Data at SOMAR that require applications are typically available via one of two access methods. Researchers interested in exploring and using SOMAR data are encouraged to review the access methods defined below to understand what is required to access a data file. Please reach out to somar-help@umich.edu with any questions.

Virtual Data Enclave (VDE) - data access only available in a secure virtual environment

Some SOMAR datasets have highly sensitive or identifiable information or usage restrictions. These datasets are only available in our Virtual Data Enclave (VDE). VDE datasets are noted in the “Dataset required for research” section. One example of datasets available in the VDE is the U.S. 2020 Facebook and Instagram Election Study Collection.

  • Accounts needed

    • Jira account (sign-up required; it's free)

  • Application requirements

    • Full application - All required items must be completed and provided, including IRB or Ethics Committee Review documentation, Restricted Data Use Agreement (RDUA), CVs or resumes, etc.

Controlled Download - data download available via a secure link

Some SOMAR datasets may cover sensitive topics or have usage restrictions. These datasets require an application and agreement to follow ICPSR’s Terms of Use. However, these datasets can be used in a researcher’s own computing environment. After your application is approved, we will send you a link to download the dataset. Controlled Download datasets are noted in the “Dataset required for research” section. One example of datasets available via Controlled Download is the State of Social Connections Study.

  • Accounts needed

    • Jira account (sign-up required; it's free)

  • Application requirements

    • Short application - Requirements vary by dataset and may include a combination of the following: RDUA, IRB or Ethics Committee review documentation, experience handling confidential or sensitive data, ethical guidelines, and coding or technical expertise.

  • Agreement to Terms of Use in the Application

    • Researchers applying to access Controlled Download data must agree to standard Terms of Use from ICPSR


Resources for Data from Meta Platforms

Meta partners with the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan to share public data from Meta’s platforms in a responsible, privacy-preserving way. This partnership is enabled through ICPSR’s industry-leading SOMAR initiative.

Before applying, we recommend reviewing the Meta Transparency Center and its documentation to better understand available data, features, and search capabilities.

Access Options

Researchers can access data from Meta through different pathways depending on the dataset:

Dataset | Where You Apply | Access Environment

Meta Content Library and API | Meta Research Tools Manager | Meta Secure Research Environment (SRE) OR SOMAR Virtual Data Enclave (VDE)

US 2020 Facebook & Instagram Election Study | SOMAR Application Portal | SOMAR Virtual Data Enclave (VDE) only

Please refer to the FAQs for more information, including a comparison of the two environment options to inform your decision during the application process.

For Meta Content Library, researchers apply through Meta Research Tools Manager and select their access environment (VDE or SRE) during the application process. All collaborators on a project must use the same environment. Projects using the VDE will require additional information from SOMAR after approval.

Datasets hosted directly by SOMAR, such as the US 2020 study, require a separate application through the SOMAR Application Portal and are only accessible within the VDE.

Support & Documentation


Application Forms

The SOMAR Application Portal collects key details about your research project and verifies your research team to determine eligibility for restricted data access.

To access restricted datasets at SOMAR, complete one of two forms:

  • SOMAR Data Access Application

    • Submit an access request for restricted datasets (completed by the Lead Researcher or Applicant)

  • Collaborator Request

    • Add collaborators who need access to a research project (submitted by the Lead Researcher, Applicant, or collaborators)

Note: Most fields are required. Required fields are marked with an asterisk (*).

SOMAR Data Access Application

Project Application Fields

The fields listed below capture information that helps SOMAR staff to record and track applications as well as easily identify a point person for each application.

Applicable Datasets

Fields/Questions

Notes for Applicants

All Datasets

Dataset required for research*

This is a cascading drop-down menu question.

Please let us know the names of the dataset(s) needed for your research project.

U.S. 2020 Facebook and Instagram Election Study

Since you selected the U.S. 2020 Facebook and Instagram Election Study, please select the files you need for your research project.

This is a multi-select dropdown question. Please select all of the datasets you need for your research.

Researcher Contact Details

Applicant Details

Individual who submits and manages the application as the primary contact. This role is administrative only; applicants do not receive data access. If access is needed, they must also apply as a collaborator. The applicant may also be the Lead Researcher or another research team member.

Applicable Datasets

Fields/Questions

Notes for Applicants

All Datasets

First name of Applicant*

 

Last name of Applicant*

 

Institutional email of Applicant*

The Applicant's email address must show their institutional affiliation.

 

Lead Researcher Details

The fields below collect high-level information about the Lead Researcher, their institution, and experience. Providing current and accurate information ensures efficient SOMAR application reviews.

The Lead Researcher leads the proposed research project or broader research agenda requiring dataset access. For team-based projects, the Lead Researcher is the Principal Investigator, Lead Investigator, Research Lead, or equivalent in your organization (e.g., Social Media Monitor). Sometimes, the applicant and Lead Researcher are the same person.

Applicable Datasets

Fields/Questions

Notes for Applicants

All Datasets

First name of Lead Researcher*

 

All Datasets

Last name of Lead Researcher*

 

All Datasets

Institutional email of Lead Researcher*

The Lead Researcher's email address must show their institutional affiliation. NOTE: Personal email addresses, such as @gmail.com, are not permitted.

All Datasets

Primary discipline or professional area of expertise*

 

  • Beyond the Hashtags: #Ferguson, #Blacklivesmatter, and the Online Struggle for Offline Justice

  • ChatGPT in education: A discourse analysis of worries and concerns on social media

  • Politweets: Tweets of politicians, celebrities, news media, and influencers from India and the United States

  • U.S. 2020 Facebook and Instagram Election Study

Highest degree earned*

Please name the highest degree earned by the Lead Researcher.

NOTE: Lead Researchers must hold a terminal degree (e.g., PhD, MD, DrPH, or JD) to access data in the VDE.

  • Replication data for "Emergent structures of attention on social media are driven by amplification and triad transitivity"

  • State of Social Connections Study

  • Candidata

Highest degree earned*

Please name the highest degree earned by the Lead Researcher (e.g., PhD, MD, DrPH, JD, MA, MS, MPH, BA, etc.).

Note: This information is collected for demographic purposes; no degree is required to access the data.

All Datasets

Affiliational profile URL (organization, research, or academia)*

The Lead Researcher must provide a personal profile showing their current affiliation. Include a link to one of the following: an active institutional profile or biography, an organizational website with a directory or staff list, an ORCID or ResearchGate profile, or a webpage for their research group, lab, or department. If links are unavailable, submit other documentation at the end of the application to verify affiliation.

Note: LinkedIn profiles, GitHub Pages, and personal websites are not allowed.

All Datasets

Institution name*

The Lead Researcher must be currently affiliated with the institution listed here.

Lead Researchers must be affiliated with an academic institution or a non-university organization, institute, or society that operates as a not-for-profit entity and focuses on scientific or public interest research. Researchers from diverse disciplinary and professional backgrounds are welcome to apply.

All Datasets

Type of Institution*

 

All Datasets

Country of institution (ISO Code)*

Please provide the 3-letter ISO country code for your institution (e.g., USA, CHN, AUS). To find the correct code, use the following link and search for your country’s 3-letter code: https://www.iso.org/obp/ui/#search/code/.

All Datasets

State/Province of institution*

Please provide the full name of the state or province of the affiliated institution.

All Datasets

City of institution*

Please provide the full name of the city of the affiliated institution.

All Datasets

Department name*

If your institution does not have a department, please add “N/A”.

All Datasets

Institute, center, or lab name*

If you do not have anything to report, please add "N/A".

All Datasets

Role at institution*

Applicants will choose one answer from the following options:

  • Assistant professor

  • Associate professor

  • Professor

  • PhD student

  • Master’s student

  • Lecturer

  • Postdoctoral fellow

  • Independent researcher

  • Staff researcher

  • Other (describe)

All Datasets

Lead Researcher's resume or curriculum vitae*

Please upload the Lead Researcher's resume or curriculum vitae in Adobe PDF or Microsoft Word format.

 

Collaborator Details

Anyone other than the Lead Researcher who accesses or handles the data must submit their own application to join the research project. This includes computing staff, data librarians, research assistants (including students), or general staff. Collaborators may need official affiliation with the Lead Researcher's institution, depending on institutional requirements. Contact somar-help@umich.edu with any questions.

Applicable Datasets

Fields/Questions

Notes for Applicants

All Datasets

Are there any collaborators that need access to the requested data for this research project?*

 

Select 'Yes' only if your collaborators will perform analysis or data management activities.

Only individuals who need to analyze or directly interact with the data can be added as collaborators.

Those who review output or results outside the Virtual Data Enclave or just before submission to a publisher or conference do not need to be added as collaborators for your data access request.

Collaborators can be added any time after the Lead Researcher’s application has been submitted.

Complete a separate form for each collaborator and provide the following information:

  • Lead Researcher’s application ID (SOMARAPPLY-####)

  • Research project title

  • Lead Researcher’s last name and email address

To add collaborators, use the Collaborator Addition Request Form. This form includes questions similar to those on the Project Application, such as contact information, project details, and contracts/acknowledgements.

To add a collaborator for Meta Content Library API access in SOMAR’s VDE, follow the instructions in the Meta documentation.

Note: If the Project Application cites a collaborator’s experience to meet eligibility or expertise requirements, that collaborator must complete the Collaborator Addition Request form for the application to proceed. Until the form is submitted, we cannot consider their qualifications in the review process of the Project Application, and the application will be placed on hold.

 

Research Project Information

These items collect information about the Lead Researcher’s project, including a summary and the purpose of the request. Applicants should provide clear, relevant details about their research agenda, with the option to upload supporting documents about the project or their experience later in the application.

Applicable Datasets

Fields/Questions

Notes for Applicants

All Datasets

Research Project Title*

Please provide your project title or a brief, 1-2 sentence overview of your research agenda.

*Examples could include “investigate public health discourse on social media,” “study social media content related to upcoming presidential elections,” etc.

All Datasets

Research project summary*

In 250 words or less (up to 1750 characters), describe your research agenda or plans for a non-specialist audience. Your research agenda can include longer-term or ongoing research that you and your team are conducting. Please be sure to address: your general area of focus, guiding research questions, methodologies, as well as any endpoints, data types, or data fields of interest. If you haven't already, please consult the publicly available documentation about the data before you begin your application.

All Datasets

Keywords associated with research*

List the keywords associated with the research project. Please separate the keywords with commas.

All Datasets

Funding source(s)*

List the funding sources that will support the Lead Researcher's data analysis. Please separate the funding items with commas.

You may enter "N/A" if there are none.

All Datasets

Justification for why requested data are required for your research activities.*

(250 words or less)

Please briefly explain why the data you are applying to access are required for your research activities.

All Datasets

Research outcomes*

Please select the intended outcome(s) of your research project.

 

One or more answers from the following options may be selected:

  • Peer-reviewed article

  • Conference presentation

  • White paper

  • Article (popular press)

  • Study reproduction (for journal peer reviewers)

  • Study reproduction (for researchers/data users)

  • Article review

  • Report

  • Other
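The research project summary above is limited to 250 words and 1750 characters. As a rough pre-submission check, a draft can be measured with a few lines of Python. This is only a sketch: the function name `check_summary` is hypothetical, and it assumes that words are separated by whitespace and that spaces count toward the character total, neither of which this guide confirms for the actual form.

```python
def check_summary(text: str, max_words: int = 250, max_chars: int = 1750) -> dict:
    """Count words and characters in a draft and compare them to both limits.

    Assumption: words are whitespace-separated and every character,
    including spaces, counts. The form itself may count differently.
    """
    word_count = len(text.split())
    char_count = len(text)
    return {
        "words": word_count,
        "characters": char_count,
        "within_limits": word_count <= max_words and char_count <= max_chars,
    }


if __name__ == "__main__":
    draft = "We study public health discourse on social media."
    print(check_summary(draft))
```

A draft that passes this check may still need trimming if the form counts characters differently, so leave some margin.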

 

Additional requirements for data access in the VDE

These items gather extra information about the Lead Researcher and research project for review to access data in the Virtual Data Enclave. Applicants may upload additional supporting documents about the project or researcher’s experience later in the application.

Applicable Datasets

Fields/Questions

Notes for Applicants

  • Beyond the Hashtags: #Ferguson, #Blacklivesmatter, and the Online Struggle for Offline Justice

  • ChatGPT in education: A discourse analysis of worries and concerns on social media

  • Meta Content Library and Content Library API (for API Access)

  • Politweets: Tweets of politicians, celebrities, news media, and influencers from India and the United States

  • U.S. 2020 Facebook and Instagram Election Study

Including any education, the total number of years of coding experience*

 

Information about the Lead Researcher's coding experience helps SOMAR determine how best to assist researchers with VDE access.

Coding experience can come from using Python, R, Java, C++, and so on.

 

Applicants will choose one answer from the following options:

  • No experience

  • Less than 1 year

  • 1 to 4 years

  • 5 to 9 years

  • 10 to 14 years

  • 15+ years

Preferred programming languages* 

Please select your preferred programming languages.

Preferred statistical software for quantitative data analysis*

Please select your preferred statistical software for quantitative data analysis.

Evidence of technical skills of Lead Researcher and/or collaborators*

Experience with Python, R, SQL, or another coding or querying language is recommended for VDE access. Explain your experience with a coding or querying language here, or provide evidence via a link to a GitHub repository or other location where code examples have been shared. Access to the VDE is not dependent on experience with coding. This field helps SOMAR know how best to support researchers.

Note: If the Project Application references a collaborator’s experience to meet eligibility or expertise requirements, that collaborator must complete the Collaborator Addition Request form for the application to proceed. Until the form is submitted, we cannot consider their qualifications in the review process of the Project Application, and the application will be placed on hold.

*The character limit is 5000.

Evidence of responsible experience or use of sensitive or restricted data*

 

Evidence of responsible experience or understanding of sensitive or restricted-use datasets is required. Please provide up to 3 citations or examples of your research that demonstrate your experience or understanding of using sensitive data.

 

*The character limit is 5000.

Ethical guidelines to be used to guide research agenda*

Please 1) indicate the ethical guidelines you will use to guide your research project, and 2) describe how you plan to apply specific principles in practice. For example, you may refer to the AOIR Ethics, NeurIPS Code of Ethics, or Williams, Burnap, and Sloan (2017), but you must also describe how you will adhere to the guidelines you selected.

I confirm that I will use these data solely for statistical analysis and reporting of aggregated information, and not to investigate specific individuals or organizations.*

 

I confirm that all research team members, including the Lead Researcher, have the skills needed to work with the data responsibly if access is granted.*

 

 

Required Questions for Virtual Enclave Onboarding

Each person using the VDE must complete the following ICPSR VDE training as part of the application and onboarding process. Please take the following steps:

  1. Watch the ICPSR VDE Training video (~7 minutes)

    • Please note that SOMAR’s VDE environment has some differences, but the VDE security information remains the same.

  2. Complete the VDE Training Quiz (~1-2 minutes)

    • Although instructions in the confirmation of submission message ask researchers to email ICPSR that they completed the quiz, SOMAR does not need these messages. We will onboard researchers to the VDE after their applications have been approved and all requirements for data access in the VDE are met.

  3. If you answered any quiz questions incorrectly, please rewatch the ICPSR VDE Training video.

  4. Complete the questions directly below.

Applicable Datasets

Fields/Questions

Notes for Applicants

  • Beyond the Hashtags: #Ferguson, #Blacklivesmatter, and the Online Struggle for Offline Justice

  • ChatGPT in education: A discourse analysis of worries and concerns on social media

  • Meta Content Library and Content Library API (for API Access)

  • Politweets: Tweets of politicians, celebrities, news media, and influencers from India and the United States

  • U.S. 2020 Facebook and Instagram Election Study

Do you currently have active credentials (example: uniqname) to access the SOMAR VDE?*

Please select “Yes” or “No”.

Active credentials mean you can access the SOMAR VDE.

Date of Birth (month/day/year)*

This field is required if you select “No” to the active VDE credentials question.

Phone Number (home/mobile)*

This field is required if you select “No” to the active VDE credentials question.

Please enter your phone number using numbers only (no symbols, spaces, or letters). Include your country and area codes. For example: 441234567890 (UK) or 12125551234 (US).

I confirm that the Lead Researcher has completed the VDE training video and successfully passed the VDE training quiz as outlined in Steps 1 and 2 above.*

This field is required if you select “No” to the active VDE credentials question.
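The phone number field above accepts digits only, with country and area codes included. The sketch below shows one way to strip symbols and spaces from a number before pasting it into the form. The function name `normalize_phone` is hypothetical, and the sketch assumes the number you start from already contains its country code; it does not add a missing one.

```python
import re


def normalize_phone(raw: str) -> str:
    """Remove every non-digit character from a phone number.

    Assumption: `raw` already includes the country and area codes.
    """
    digits = re.sub(r"\D", "", raw)
    if not digits:
        raise ValueError("phone number contains no digits")
    return digits


if __name__ == "__main__":
    # The two example numbers given in this guide, written with symbols.
    print(normalize_phone("+44 1234 567890"))    # 441234567890
    print(normalize_phone("+1 (212) 555-1234"))  # 12125551234
```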

 

Restricted Data Use Agreement with ICPSR

(also known as Data Use Agreement)

For access to data in SOMAR's Virtual Data Enclave, the Restricted Data Use Agreement (RDUA) must be reviewed, signed, and dated by the Lead Researcher and their institution’s legal representative or signatory. The RDUA is an agreement between the University of Michigan and the Lead Researcher’s institution that specifies the terms of use of the restricted data.

Please download the agreement document and upload the completed version in the form.

SOMAR can work with researchers whose institutions need to modify the RDUA. All proposed changes—whether new or previously accepted elsewhere—must go through a standard review and approval process with the University of Michigan research contracts department.

How to Request Modifications:

An institutional representative (contracts officer, legal signatory, etc.) should submit proposed changes as a redlined Word document via email to ICPSR-help@umich.edu, noting that the request is for SOMAR data.

Please keep in mind:

  • Requesting changes can significantly extend processing time, potentially delaying data access by several weeks, as all revisions must be reviewed and approved by the University of Michigan legal team.

  • Note: While most agreements are completed within six weeks, some may take up to three months, depending on the current queue.

For more details, including a full timeline, see the ICPSR Agreement Modification Process.

 

Applicable Datasets

Fields/Questions

Notes for Applicants

  • Beyond the Hashtags: #Ferguson, #Blacklivesmatter, and the Online Struggle for Offline Justice

  • ChatGPT in education: A discourse analysis of worries and concerns on social media

  • Politweets: Tweets of politicians, celebrities, news media, and influencers from India and the United States

  • U.S. 2020 Facebook and Instagram Election Study

Upload Signed Restricted Data Agreement*

Please upload the signed agreement in PDF format. The Restricted Data Use Agreement must be signed and dated by both the Lead Researcher and an institutional representative authorized to act on their institution’s behalf. This representative typically handles research compliance and agreements, serves as legal counsel, or serves as an executive such as a President, Vice President, Dean (at some non-U.S. institutions), or department head (at some non-U.S. institutions).

Institutional signatory† name*

Please provide the name of the legal authority at the affiliated institution who will sign the Restricted Data Use Agreement associated with this research project.

Institutional signatory† title*

Please provide the title of the legal authority identified for this application (e.g., Vice President, Contracts Specialist, Chair, Dean, etc.).

Institutional signatory† email address*

Please provide the email address for the legal authority identified for this application.

†An institutional signatory is a person authorized to enter into legal agreements and sign contracts on behalf of the Lead Researcher’s institution. At U.S. academic institutions, they often work in offices such as the Contracts Office, Office of Sponsored Research, or Research Contracts Management. For non-U.S. or non-academic institutions, this may be another individual with equivalent legal authority to review and sign agreements.

 

Ethics Committee or Institutional Review Board (IRB) documentation

An Ethics Committee ensures that research is conducted ethically and complies with national and international laws.