* UC Irvine access only

Business:  Datasets 

Your comprehensive station for UCI's business research content and beyond!
URL: https://guides.lib.uci.edu/business


Licensed Resources for Existing Datasets

 

ICPSR Data Archive
Access: UCI
Geography: Global

An international consortium of more than 700 academic institutions and research organizations.  ICPSR maintains a data archive of more than 500,000 files of research in the social sciences.   

 

Global Financial Data
Access: UCI
Geography: Global

Current and historic downloadable economic and financial time series for the U.S. and over 150 foreign countries.

  • After creating an account, make sure to CLICK the Login button (do not press the Enter/Return key) to get into the database.

 

OECD iLibrary
Access: UCI
Geography: Global

All publications and datasets released by OECD (Organisation for Economic Cooperation and Development), International Energy Agency (IEA), Nuclear Energy Agency (NEA), OECD Development Centre, PISA (Programme for International Student Assessment), and International Transport Forum (ITF) since 1998.

 

MarketLine Advantage
Access: UCI

Geography: Global

Country Statistics, Financial Deals, Company Prospector, Investment & Advisory Prospector, Company Report Generator, and Market Data Analytics.

  • Use the Databases tab in the top navigation menu.  

 

Roper iPoll
Access: UCI

Geography: US

Provided by the Roper Center for Public Opinion Research at Cornell University, Roper iPoll is the largest collection of poll data anywhere, covering 1935 to present. Contains data from U.S. and international polling firms with broad topical coverage of opinions and behavior on social issues, politics, pop culture, international affairs, and more. Questions, charts, demographic crosstabs, and dataset downloads are immediately available.

 

Wharton Research Data Services (WRDS)
Access: UCI - Authorized Affiliates
Geography: Global

For PhD and faculty-level research.  Important databases in finance, accounting, banking, economics, management, marketing, and public policy. Merage users log in through Catalyst.

 

 

Licensed Resources for Creating Datasets

 
S&P Capital IQ
Access: UCI - Authorized Affiliates
Geography: Global

A comprehensive company intelligence database, offering granular financial and capital structure information on millions of companies worldwide, including data and graphs for historic stock prices, searchable SEC filings, mergers & acquisitions activities, and venture capital/private equity data.  

  • Create custom tables of data; content is downloadable as Excel files.   
     

SimplyAnalytics
Access: UCI

Geography: US

Maximum 3 users at once! An easy-to-use interface offering thousands of social and economic data variables that can be used when building custom maps and reports (i.e. spreadsheets).  Register for a free account to save and return to your work. 

 

Ad$pender
Access: UCI - Authorized Affiliates
Geography: Global

Advertising expenditures and occurrence information for 3+ million brands across 18 media.  A PDF user manual is available.

  • Learn this search!
  • For help with searching in this database, call the Ad$pender New York office (8am – 8pm Eastern): 1-800-497-8450.  Tell Ad$pender that you are with The University of California, Irvine, and that you would like to get help walking through your query from the beginning.

 

RateWatch Scholar Historical Data
Access: UCI - Authorized Affiliates
Geography: US

Information for U.S. financial institutions for research and analysis.  Data covers over 96,000 branch locations, depending on time period and data type, all provided voluntarily.  Data is gathered from institutions of all types and sizes, including banks, credit unions, savings and loan associations, etc.

 

Wharton Research Data Services (WRDS)
Access: UCI - Authorized Affiliates
Geography: Global

For PhD and faculty-level research.  Important databases in finance, accounting, banking, economics, management, marketing, and public policy. Merage users log in through Catalyst.

 

Nielsen Datasets  
Access: UCI - Limited Affiliates

Geography: US

For PhD students and Tenure Track Faculty only!  The UCI Libraries' subscription includes the Consumer Panel dataset and the Retail Scanner dataset.  Data is from a partnership between Nielsen and the Kilts Center for Marketing at the Chicago Booth School of Business.


Other Resources

 

Tip  
Zillions of websites—academic, government, nonprofit, and corporate—offer datasets for research. 
Below are select academic sites with social science data. To find more datasets on the web, try
pasting these sample searches into Google.  Adding site:.domain to a keyword search returns results only from
websites with that domain (.com, .gov, .org, .edu, etc.).  For example:

  • renewable energy data site:.org 
  • sustainable development site:.data.gov
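
The sample searches above can also be generated programmatically. The following is a minimal sketch, not an official tool: the function name site_search_url is hypothetical, and it assumes Google's standard search URL with a q parameter. It only builds the URLs, which you can paste into a browser.

```python
from urllib.parse import urlencode


def site_search_url(keywords: str, domain: str) -> str:
    """Build a Google search URL restricted to one domain via the site: operator.

    Hypothetical helper for illustration; it only formats a URL.
    """
    query = f"{keywords} site:{domain}"
    # urlencode escapes spaces and the colon so the query is URL-safe.
    return "https://www.google.com/search?" + urlencode({"q": query})


# The two sample searches from the list above.
samples = [
    ("renewable energy data", ".org"),
    ("sustainable development", ".data.gov"),
]

for keywords, domain in samples:
    print(site_search_url(keywords, domain))
```

Swapping the domain argument (for example, .gov or .edu) changes which kind of website the results come from.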


UCI Machine Learning Repository
A collection of databases, domain theories, and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms. It is used by students, educators, and researchers all over the world as a primary source of machine learning data sets. 

 

California Policy Lab
A non-partisan research institute based at the University of California. Work is focused in six policy areas: education, criminal justice reform, poverty and the social safety net, labor and employment, health, and homelessness and high needs populations.  Offers and streamlines access to data for policy-related research.  (Note: some data may include fees or an approval process prior to use.)

 

IPUMS
Census and survey data from around the world integrated across time and space. IPUMS integration and documentation makes it easy to study change, conduct comparative research, merge information across data types, and analyze individuals within family and community context.  From the University of Minnesota.  (IPUMS was an acronym for Integrated Public Use Microdata Series, but in 2016 it became a standalone name, not an acronym).  

 

re3data.org
The Registry of Research Data Repositories.  Identifies and lists 1,500+ research data repositories, making it the largest and most comprehensive registry of data repositories available on the web.

 

Academic Torrents
A community-maintained distributed repository for datasets and scientific knowledge.  The service is designed to facilitate storage of all the data used in research, including datasets as well as publications.

 

SNAP
The SNAP library has been actively developed since 2004 and is growing organically as a result of Stanford University's research in the analysis of large social and information networks.

 

Billion Prices Project
An academic initiative that uses prices collected from hundreds of online retailers around the world on a daily basis to conduct research in macro and international economics.

 

DocNow
Documenting the Now, a collaborative effort among multiple universities, is building a variety of tools to help researchers work with Twitter data.  Twitter's terms of service don't allow tweet datasets to be published on the web, but they do allow tweet identifier datasets to be shared.   DocNow maintains a catalog of Twitter datasets that are publicly available on the web.
 


What's the jargon?



Database vs Dataset
A database is usually the container for datasets.  Our Licensed Databases are resources that either contain datasets or help you create custom datasets by using the data that's stored within that database.  

Database
A collection of information and/or data, usually covering a broad theme (e.g., company information, psychology, etc.).  Info/data can exist in any medium: text, images, spreadsheets, video, etc. 

The info/data in a database may come from one producer (Mintel produces its own market research reports), or it may come from several producers (Factiva compiles news articles from thousands of different publications worldwide). 

Dataset
A narrower collection of information and/or data, usually describing a specific topic or phenomenon (e.g., the stock price of a company, or consumers' responses to surveys, etc.). 

In business and economics, as academic disciplines, datasets are often presented as spreadsheets in .xlsx or .csv filetypes. 


Data

"Related items of (chiefly numerical) information considered collectively, typically obtained by scientific work and used for reference, analysis, or calculation." (OED Online)

Raw Data

This term is fuzzy because almost any observable or recorded phenomenon could theoretically be raw data, but people often use it when referring to a dataset in a file format (e.g., an .xlsx or .csv file) that's presented as a spreadsheet. 

  • Caveat: raw data is not always cleaned data.  


Cleaned Data
To clean data, people or computer programs assess the data and correct any errors, omissions, duplicates, inaccuracies, etc. so that the data is ready for analysis.  Determining whether data is clean enough for your needs can be tricky.  Read the supporting documentation that accompanies a dataset, and contact the data producer if you have additional questions.  
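
As a rough illustration of what a cleaning step can look like, here is a minimal sketch using only Python's standard library. The sample rows are invented for this example, and the function name clean_rows is hypothetical. It handles just three of the issues mentioned above (stray whitespace, omissions, and duplicates); real datasets usually need checks specific to their documentation.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
RAW_CSV = """company,year,revenue
Acme,2020,100
Acme,2020,100
Beta , 2021 ,250
Gamma,2021,
"""


def clean_rows(rows):
    """Trim whitespace, drop rows with missing values, and remove exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        trimmed = {key: (value or "").strip() for key, value in row.items()}
        if any(value == "" for value in trimmed.values()):
            continue  # omission: at least one field is empty
        fingerprint = tuple(trimmed.items())
        if fingerprint in seen:
            continue  # duplicate of a row already kept
        seen.add(fingerprint)
        cleaned.append(trimmed)
    return cleaned


raw = list(csv.DictReader(io.StringIO(RAW_CSV)))
cleaned = clean_rows(raw)
print(f"{len(raw)} raw rows, {len(cleaned)} rows after cleaning")
```

Dropping incomplete rows is only one possible choice; depending on the analysis, it may be better to keep them and flag the missing values instead.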

 

Licensed
If a database is licensed at UC Irvine, that means UCI pays the database producer for campus-wide access to that database; it's like someone buying a Netflix subscription for their family.  

 


So you want an Excel spreadsheet...


Licensed Resources for Existing Datasets
Use this page if you want to explore resources that will let you download a pre-made Excel file.  

This is generally an easier way to get a data file!

Licensed Resources for Creating Datasets
Use this page if you want to explore resources that will let you customize what's in the Excel file before you download it.  

This is generally a more challenging way to get a data file!

 


Access Answers


What can I access? 


Access is generally available to all users. Registration or account creation might be required to access.

For commercial websites, the UCI Libraries do not offer premium memberships or subscriptions.

 


Access is available to all users ON the UCI campus and at GML.

OFF campus access requires Authorized Affiliates to log into the VPN with their active UCInetID and password. Authorized Affiliates are users with an active UCInetID and password, i.e. current UCI students, faculty, and staff.

 


Access requires an active UCInetID and password.

Authorized Affiliates are users with an active UCInetID and password, i.e. current UCI students, faculty, and staff.

 


These resources are not licensed by the UCI Libraries, but librarians occasionally promote them when they are relevant for certain types of research.

Access is available only for Authorized Affiliates, who are also affiliated with the Paul Merage School of Business.

 

The resources are limited to select UCI populations, based on the user’s status, e.g. current UCI Faculty or PhD students. Please refer to the UCI Libraries for access instructions.

Reasons why content may be limited include: a vendor has set restrictions on who may access its information, or the information may be sensitive, identifying, or embargoed.

 


How do I access? 
-  Students
-  Employees
-  Alumni
-  Visitors


Am I on the UCI network?
Test my UCI connection.


Am I responsibly using what I access?

Typically acceptable vs. unacceptable use.

Want to learn how to prepare, analyze, and understand data? 

Check out the Software for Data Analysis guide, managed by the UCI Libraries' Computational Research Librarian.