Comp 151 Project 5

A "let's work with Excel and real data" project.

Due: Wednesday Oct 31 Monday Oct 29th at 11:59pm (don't wait too long; your exam falls in this period as well.)


Summary: This project is intended to help you work with real data, like the kind found everywhere in MS Excel, and build a visualization from it.


Description:

Write a program that will count all of the institutions of higher education in each US state, then display these counts as a choropleth map of the US.


Use this US government generated file of higher ed institutions: CollegeData.xlsx 

First, go look at the data and understand what you have: a decent-sized, 6000-row Excel file.

Be aware: this file will take about 2 minutes to open and read in. If your program appears to hang, give it a couple of minutes.

Let's assume this data is reasonable. It often lists multiple campuses of the same institution as separate rows; to make life easier, we'll go with the assumption that each campus/row is its own institution.

Then count up how many institutions are in each state, and map those counts onto a US map.
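The counting step can be sketched roughly like this. It is only a sketch under assumptions: the column name "STABBR" is a guess (check the real header in CollegeData.xlsx), the helper names are made up for illustration, and reading .xlsx files with pandas usually needs the openpyxl package installed.

```python
from collections import Counter


def count_by_state(state_codes):
    """Count how many rows (campuses) carry each state abbreviation."""
    # Skip empty cells, and normalize case and stray spaces before counting.
    cleaned = (str(code).strip().upper() for code in state_codes if code)
    return Counter(cleaned)


def load_state_codes(path="CollegeData.xlsx", column="STABBR"):
    """Read one column of the spreadsheet. The column name is an assumption."""
    import pandas  # imported here so count_by_state works without pandas
    frame = pandas.read_excel(path, usecols=[column])
    return frame[column].dropna().tolist()


# Tiny made-up sample standing in for the real state column.
sample_codes = ["MA", "ma", "CA", "MA ", "TX", None, "CA"]
sample_counts = count_by_state(sample_codes)
print(sample_counts["MA"], sample_counts["CA"], sample_counts["TX"])
```

With the real file you would call count_by_state(load_state_codes()) instead of using the sample list.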


In case your laptop crashed, here is the display code from our in-class example. Change the bits you need to, according to the options we discussed in class.

import pandas
import plotly.offline

# state_names and change_in_income are the lists built earlier in the in-class example
state_names_Pandas = pandas.Series(state_names)
change_in_income_Pandas = pandas.Series(change_in_income)

# One choropleth trace: a state abbreviation per location, a number per state in z
data = [dict(
    type='choropleth',
    colorscale="BlueRed",
    autocolorscale=False,
    locations=state_names_Pandas,
    z=change_in_income_Pandas,
    locationmode='USA-states',
    colorbar=dict(
        title="Median Income Change in Dollars")
)]

# Restrict the map to the US
layout = dict(
    title='Change in median income for each state',
    geo=dict(
        scope='usa',
        projection=dict(type='albers usa'),
        showlakes=True,)
)
fig = dict(data=data, layout=layout)
# Writes the map to an HTML file
plotly.offline.plot(fig, filename='comp151-map.html')
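One way to adapt that display code to this project is to build the same kind of figure dictionary from your per-state counts. This is a sketch, not the required solution: the function name, the titles, and the sample counts are made up for illustration, and it assumes your counts live in a dictionary keyed by state abbreviation.

```python
def build_institution_map(counts):
    """Build a choropleth figure dictionary from {state abbreviation: count}."""
    states = sorted(counts)
    values = [counts[state] for state in states]
    data = [dict(
        type='choropleth',
        locations=states,
        z=values,
        locationmode='USA-states',
        colorbar=dict(title="Number of institutions"),
    )]
    layout = dict(
        title='Higher education institutions in each state',
        geo=dict(
            scope='usa',
            projection=dict(type='albers usa'),
            showlakes=True,
        ),
    )
    return dict(data=data, layout=layout)


# Made-up counts, only to show the shape of the result.
example_fig = build_institution_map({"MA": 3, "CA": 2, "TX": 1})

# To draw the map, pass the figure to plotly as in the class example:
# import plotly.offline
# plotly.offline.plot(example_fig, filename='comp151-map.html')
```

Plain Python lists work for locations and z here, so converting to a pandas Series first is optional.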


To save some typing, in addition to (or instead of) the dictionary that we used in class, you might find this list of state abbreviations useful:

states = ["AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DC", "DE", "FL", "GA",
          "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
          "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
          "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC",
          "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY"]

source: https://gist.github.com/JeffPaine/3083347
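A list like this is handy for lining your counts up with the map locations. The sketch below is one possible approach, with made-up names and numbers: it looks up each abbreviation in a counts dictionary and uses 0 when a state has no rows. Any code in your counts that is not in the list (a territory, say, if the file contains any) is simply left off the map.

```python
def counts_in_state_order(counts, states):
    """Return one count per state, in list order, using 0 for missing states."""
    return [counts.get(state, 0) for state in states]


# A short stand-in for the full states list, plus made-up counts.
some_states = ["AL", "AK", "AZ"]
partial_counts = {"AL": 5, "AZ": 2, "PR": 4}

z_values = counts_in_state_order(partial_counts, some_states)
print(z_values)
```

The resulting list has the same length and order as the states list, which is what the locations and z entries of the choropleth need.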


Submission:

Since you need to submit your text file as well as your program this time, be sure to zip up the entire project folder. Remember to include your name in the name of the project so I don't end up with a dozen people's work in one "Project 5" folder when I unzip everything.

Submit the lab via Blackboard as usual. Don't forget to make sure your file has your name in the file name.