The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms. The archive was created as an FTP archive in 1987 by David Aha and fellow graduate students at UC Irvine. Since that time, it has been widely used by students, educators, and researchers all over the world as a primary source of machine learning data sets. As an indication of the impact of the archive, it has been cited over 1000 times, making it one of the top 100 most cited "papers" in all of computer science. The current version of the web site was designed in 2007 by Arthur Asuncion and David Newman, and this project is in collaboration with Rexa.info at the University of Massachusetts Amherst. Funding support from the National Science Foundation is gratefully acknowledged.
Even under an existing moratorium, earmarks have received a lot of attention in the media, in Congress, and around the water cooler in recent years. But despite the interest there is a good deal of disagreement about the definition of earmark, the role of earmarking in the budget process, whether it is an appropriate use of Congress's time, and whether earmarks serve the interests of taxpayers.
Common Core of Data (CCD) - Public Elementary/Secondary School Universe Survey Data
Ashtracker.org - Tracking Groundwater Contamination at Coal Combustion Waste Disposal Sites
CDC’s WISQARS™ (Web-based Injury Statistics Query and Reporting System)
Academic Torrents - Enables Sharing Enormous Datasets; Runs Data Portal as well
Harvard Geospatial Library
The Harvard Geospatial Library is the University’s catalog and repository for geospatial data. It houses thousands of layers of digital geospatial data, in both vector and raster (scanned maps) forms. HGL uses traditional text searching combined with map/coordinate based searches. Data can be viewed online, or downloaded for use in a desktop GIS. See a PowerPoint presentation explaining HGL and its capabilities.
The Harvard Map Collection
The Harvard Map Collection maintains a large collection of geospatial data sets for use in Geographic Information Systems (GIS), Cartography, and Remote Sensing. The Map Collection's geospatial holdings include U.S. Census Bureau, Boston metropolitan area, and many worldwide and foreign data sets. These data can also be found in HOLLIS.
Graduate School of Design LAN
The GSD Frances Loeb Library and the GIS Specialist have gathered together a large amount of GIS data stored on the GSD Local Area Network for student access.
Public GIS Data Resource Listing
CGA maintains a list of organizations that provide geographic data, including government, educational or non-profit organizations that provide free data.
Collect your own data with CGA's GPS loaner systems.
U.S. population issues, trends, and statistics, in graphics and text, presented in an easy-to-use format. Includes education, race, age, migration, income and poverty, marriage and family. Graphs in PDF, data downloads in Excel or text.
http://www.huduser.org/bibliodb/pdrbibdb.html
The HUD (US Department of Housing and Urban Development) USER Database is the only bibliographic database exclusively dedicated to housing and community development issues. It contains more than 10,000 full-abstract citations to research reports, articles, books, monographs, and data sources in housing policy, building technology, economic development, urban planning, and a host of other relevant fields.
http://cms.hhs.gov/researchers/default.asp
Publications and downloadable data on various health and human services topics.
http://www.census.gov/epcd/cbp/view/cbpview.html
County Business Patterns is an annual series that provides subnational economic data by industry. The series is useful for studying the economic activity of small areas; analyzing economic changes over time; and as a benchmark for statistical series, surveys, and databases between economic censuses. Businesses use the data for analyzing market potential, measuring the effectiveness of sales and advertising programs, setting sales quotas, and developing budgets. Government agencies use the data for administration and planning. County Business Patterns covers most of the country's economic activity. The series excludes data on self-employed individuals, employees of private households, railroad employees, agricultural production employees, and most government employees.
http://www.chass.utoronto.ca/datalib/other/
A guide to data libraries, data archives, and related institutions about which information is available via the Internet, as well as to primary research data and related resources available for access or acquisition via TCP/IP-based tools. Also covers the literature available on data management.
http://www.econdata.net/
This is one of the best data search engines on the web. It allows for searches by subject (demographics, employment, occupation, income, output and trade, prices, economic assets, quality of life, industry sectors, and firm listings) and provider (Census, BLS, BEA, and other federal, state, local, and private sources). The site also gives a list of 'top tens' where you'll find the majority of the data you're likely to need and a list of data collections, including: access to tools of multiple data series, statistical compendia, indices, rankings, and comparisons, economic analyses and forecasts, guides to data on the web, data intermediaries, search engines, microdata, and mapping resources.
http://woodrow.mpls.frb.fed.us/research/data/
Data search engine and CPI calculator from the Federal Reserve Bank of Minneapolis.
http://www.upjohninst.org/erdc/index.htm
With the cooperation and assistance of the U.S. Department of Labor, the Upjohn Institute serves as the repository of many research and evaluation projects conducted by the Department of Labor. The site lists the projects and gives, for each, a summary and links to reports and other related information. Data from these projects must be purchased. Abstracts, executive summaries, and listings of the contents of the data CDs are available at the site.
http://www.icpsr.umich.edu:81/GSS/
The GSS (General Social Survey) is an almost annual "omnibus," personal interview survey of U.S. households conducted by the National Opinion Research Center (NORC).
Links to all federal statistical agencies, such as the Bureaus of the Census, Economic Analysis, Labor Statistics, Justice Statistics, and Transportation Statistics, as well as links to similar international agencies. http://members.aol.com/copafs/fedlinks.htm
Offers links to demographic profiles, money income in the US, earnings by education and attainment, historical tables, small area income and poverty, dynamics of well-being, and income and poverty related data and reports of the Census. US Census Bureau http://www.census.gov/hhes/www/income.html
http://www.ipums.umn.edu/usa/intro.html
This document describes the Integrated Public Use Microdata Series (IPUMS-98), created at the University of Minnesota in October 1997. The IPUMS consists of twenty-five high-precision samples of the American population drawn from thirteen federal censuses.
http://www.nber.org/data/
Links to macro, industry, and individual data sources, many data sets that may be difficult to find elsewhere, including: segregation data, school district data, manufacturing productivity, Federal Reserve Economic Data (FRED), vital statistics, child health, occupational wages worldwide, business cycle data, and more. Also provides help on how to read large data sets into Access or Excel.
An interactive data tool providing access to statistical data from the U.S. government. Provides easy access to useful government data not available elsewhere on the Internet. http://govinfo.kerr.orst.edu/index.html
The PSID is a longitudinal survey of a representative sample of US individuals and the families in which they reside. It has been ongoing since 1968. The data were collected annually through 1997, and biennially starting in 1999. The data files contain the full span of information collected over the course of the study. PSID data can be used for cross-sectional, longitudinal, and intergenerational analysis and for studying both individuals and families. http://www.isr.umich.edu/src/psid/
Census site on poverty statistics. Includes poverty definitions, thresholds and guidelines, current population survey (CPS), survey of income and program participation (SIPP), decennial census, and other poverty data. US Census Bureau http://www.census.gov/hhes/www/poverty.html
Annual income, employment, wage, and salary data. Bureau of Economic Analysis http://www.bea.doc.gov/bea/regional/reis/
Links to data extraction tools and data categorized by federal government, university, foundation, commercial, UN & World Bank, and foreign and international sites. http://www.trinity.edu/~mkearl/data.html
A data search engine designed to be used by persons using assistive technologies. It allows for searches for both the Web site and the data holdings simultaneously. Subjects include Census, community and urban, economic behavior, education, geography and environment, government, health care, and more. Inter-University Consortium for Political and Social Research http://www.icpsr.umich.edu/search-basic.html
Quicklinks to Census and other popular sites. World and national data on income dynamics, displaced workers, crime and judicial, political polls and government systems, political and social indicators, military expenditures, and other online data and metadata. University of California at San Diego http://ssdc.ucsd.edu/
Data search engine. http://ssdc.ucsd.edu/
Data search engine. University of Michigan Documents Center http://www.lib.umich.edu/govdocs/stats.html
A data search engine for national, state, and regional data on employment and earnings. Economic Policy Institute http://www.epinet.org/datazone/
Demographic media and data links as well as teaching modules for census data analysis in the classroom. http://www.ssdan.net/
Historical poverty data for the United States, text document. Green Book, Appendix H (Poverty) (1994) http://aspe.os.dhhs.gov/94gb/apenh.txt
Historical census data for the United States, provided by the University of Virginia. The data presented here describe the population and economy of US states and counties from 1790 to 1960. http://fisher.lib.virginia.edu/census/
Contains access to interactive Internet tools, downloadable software, and data from the Census Bureau. http://148.129.75.3/main/www/access.html
Occupation, employment, wages, benefits, inflation, demographics, consumer spending data. http://www.bls.gov/home.htm
All 2002 KIDS COUNT data is now available from an easy-to-use, powerful online database that allows you to generate custom graphs, maps, ranked lists, and state-by-state profiles. You can also download the entire KIDS COUNT data set as delimited text files. Kids Count Data Book Online http://www.aecf.org/kidscount/kc2002/
GeoData at Tufts
National Transportation Atlas Database
School Attendance Boundary Information System (SABINS)
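Several of the sources listed here, KIDS COUNT among them, offer downloads as delimited text files. The following is a minimal sketch of reading such a file in Python with the standard csv module. The column names, the tab delimiter, and the sample rows are all hypothetical placeholders, not the actual KIDS COUNT layout or figures; check the documentation that comes with each download for the real field names and separator.

```python
import csv
import io


def read_delimited(text, delimiter="\t"):
    """Parse delimited text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


# Hypothetical sample standing in for a downloaded file.
# The values are placeholders, not real statistics.
sample = (
    "state\tindicator\tvalue\n"
    "AA\texample_rate\t1.5\n"
    "BB\texample_rate\t2.5\n"
)

rows = read_delimited(sample)

# Fields arrive as strings, so convert numeric columns explicitly.
values = [float(row["value"]) for row in rows]
```

For a file on disk, the same reader can be pointed at a file opened with `open(path, newline="")` instead of the in-memory string used here.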
Check out these abundant sources of free and open data.