<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>BABS Data Challenge</title>
<link rel="stylesheet" href="stylesheet.css">
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-82812853-4', 'auto');
ga('send', 'pageview');
</script>
</head>
<body>
<script src="js/d3.v3.js" charset="utf-8"></script>
<script src="js/jquery-1.11.0.js"></script>
<script src="js/colors.js"></script>
<header>
<h1>Bay Area Bike Share Data Challenge</h1>
<a href="explore.html">Explore</a> <a href="index.html">Home</a>
</header>
<div class="wrapper">
<h2 class="analysis">What is the Bay Area Bike Share?</h2>
<p class="analysis">The <a href="http://bayareabikeshare.com/">Bay Area Bike Share</a> system allows users to rent bicycles for short journeys between stations throughout the city. Users can be annual members or short term (1 or 3 days). The system is completely automated for users. </p>
<p class="analysis">There are <strong>69</strong> stations across <strong>5</strong> cities in the Bike Share system, with an average of <strong>17</strong> docks per station.</p>
<div id="pies" class="chart"></div>
<p class="analysis">About <strong>50%</strong> of the stations and docks are located in San Francisco, but the city accounts for <strong>90%</strong> of system use.</p>
<p class="analysis">The data we analyze here came from rides between August 29, 2013 and February 28, 2014.</p>
<p class="fact">In those 185 days there were <strong>144,015 rides</strong>, averaging about <strong>20 minutes</strong> per trip.</p>
<p class="fact">Total riding time: <br/><strong>49,241½ hours</strong>, or <strong>5 years, 32 weeks, 1 day, 12 hours, 27 minutes, and 4 seconds</strong>.
</p>
<hr class="clear"/>
<h2 class="analysis">How much is the Bay Area Bike Share used?</h2>
<input type="checkbox" id="weekendcheck">Highlight Weekends</input>
<div id="row2" class="chart"></div>
<p class="analysis">This chart makes it pretty clear that ridership drops on weekends. Check the box to highlight Saturdays and Sundays.</p>
<p class="fact"><strong>917 rides</strong> are made on average each weekday</p>
<p class="fact"><strong>442 rides</strong> are made on average each weekend day</p>
<table class="fact tabCenter">
<thead> <th>Busiest day</th> <th>Calmest day</th> </thead>
<tbody>
<tr> <td>2013-09-25</td> <td>2014-02-09</td> </tr>
<tr> <td>1,264 rides</td> <td>81 rides</td> </tr>
</tbody>
</table>
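<p class="analysis">As a rough sketch of how a date can be classified for the weekend highlight, the snippet below checks the two dates from the table. The helper name <code>isWeekend</code> is hypothetical and is not taken from this page's scripts. By this check the busiest day fell on a Wednesday and the calmest on a Sunday.</p>
<pre>
```javascript
// Sketch: classify a YYYY-MM-DD date string as weekend or weekday.
// isWeekend is a hypothetical helper, not taken from main.js.
function isWeekend(isoDate) {
  // Date-only ISO strings are parsed as UTC, so read the day in UTC too.
  var day = new Date(isoDate).getUTCDay(); // 0 = Sunday ... 6 = Saturday
  return day === 0 || day === 6;
}

// The busiest and calmest days from the table above.
var busiest = "2013-09-25";
var calmest = "2014-02-09";
console.log(busiest, isWeekend(busiest) ? "weekend" : "weekday");
console.log(calmest, isWeekend(calmest) ? "weekend" : "weekday");
```
</pre>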
<hr class="clear"/>
<h2 class="analysis">Who uses the Bay Area Bike Share?</h2>
<div id="row3" class="chart ILL"></div>
<p class="analysis">Almost <strong>80%</strong> of rides were taken by riders with an annual subscription. The remaining <strong>20%</strong> were taken by customers who purchased a 24-hour or 3-day pass.</p>
<hr class="clear"/>
<h2 class="analysis">When is Bay Area Bike Share used?</h2>
<p class="analysis clear">We saw already that users are likely to be annual subscribers, and system use drops off on the weekends. Let's group the rides by weekday and see how subscribers' use compares to customers':</p>
<div id="weekday" class="chart ILR"></div>
<p class="analysis">Weekday riders are overwhelmingly subscribers. Ridership among subscribers falls so far on weekends that rides by customers just manage to outnumber rides by subscribers.</p>
<p class="analysis clear">And now we'll group rides by time of day:</p>
<div id="hourly" class="chart ILL"></div>
<p class="analysis">Among subscribers we see spikes in use at 8am and 5pm with another small bump at 12 noon. These users must be riding a bike to get to work, to go to lunch, and to head home. </p>
<p class="analysis">Customers' hourly usage seems to fall along a bell-shaped distribution peaking at two in the afternoon. There doesn't seem to be a lunchtime rise in customer use. These users must be riding around throughout the daytime at their leisure.</p>
<p class="analysis">From these usage behaviors, it would be fair to characterize the two groups of subscribers and customers as <strong>commuters</strong> and <strong>tourists</strong>, respectively.</p>
<p class="analysis">The Bike Share system is intended to be used for short rides: trips under a half hour do not incur any additional charges. Do riders use the system in the intended way?</p>
<div id="duration" class="chart ILR"></div>
<p class="analysis">Yes they do. The chart on the right shows that the most common ride length is <strong>5 to 10 minutes</strong>. Subscribers are clearly savvy about the pricing structure: they take very few rides longer than a half hour. For the most part, customers are savvy as well. Their trips last a little longer on average, but mostly less than 30 minutes. There's a bump in rides that last longer than an hour, so let's examine those a little closer: </p>
<div id="durationH" class="chart"></div>
<p class="aside ILR">Under the current pricing structure, a customer purchasing a 24-hour pass and taking a ride that lasts 2 hours and 59 minutes would pay $41 in total. A ride of 2 hours and 29 minutes would cost only $34. There are several bike rental companies in San Francisco which offer 3-hour rentals for around $32. Customers taking trips of less than three hours likely did not understand the bike share system and incurred unwanted overtime fees, or they decided the convenience of the automated system was worth the premium paid over other rental offerings.</p>
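<p class="analysis clear">The totals in the aside can be reproduced with a small sketch. The rates below are assumptions, not figures stated on this page: a $9 24-hour pass, a $4 fee for minutes 31 to 60, and $7 for each further half hour or part of one. They were chosen because they give back the $41 and $34 totals.</p>
<pre>
```javascript
// Sketch of the overtime fee arithmetic. All rates are assumptions,
// picked because they reproduce the $41 and $34 totals quoted above.
var PASS_24H = 9;        // assumed price of a 24-hour pass, in dollars
var FEE_SECOND_HALF = 4; // assumed fee for minutes 31 to 60
var FEE_EXTRA_HALF = 7;  // assumed fee for each further 30 minutes (or part)

// customerCost is a hypothetical helper: total cost of one ride plus the pass.
function customerCost(minutes) {
  var cost = PASS_24H;
  if (minutes > 30) { cost += FEE_SECOND_HALF; }
  if (minutes > 60) { cost += FEE_EXTRA_HALF * Math.ceil((minutes - 60) / 30); }
  return cost;
}

console.log(customerCost(179)); // ride of 2 hours 59 minutes
console.log(customerCost(149)); // ride of 2 hours 29 minutes
```
</pre>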
<p class="analysis">Rides lasting longer than an hour are most commonly 1 to 2 hours long, and almost always taken by customers. This could be due to confusion over the nature of the "24-hour pass" or one of many other factors including theft, forgetfulness, or getting lost. </p>
<hr class="clear"/>
<h2 class="analysis">Where do people ride Bike Share?</h2>
<p class="analysis">We saw earlier that 90% of Bike Share rides took place in San Francisco. Let's examine the average numbers for each city on a day-to-day basis: </p>
<table id="avgday" class="tabCenter"><caption>Average Rides per Day</caption>
<thead> <th></th> <th>System Wide</th> <th>San Jose</th> <th>Redwood City</th> <th>Mountain View</th> <th>Palo Alto</th> <th>San Francisco</th></thead>
<tbody>
<tr> <td>Total</td> <td>778</td> <td>48</td> <td>4</td> <td>15</td> <td>9</td> <td>702</td></tr>
<tr> <td>Subscriber</td> <td>614</td> <td>39</td> <td>3</td> <td>12</td> <td>5</td> <td>555</td></tr>
<tr> <td>Customer</td> <td>164</td> <td> 9</td> <td>1</td> <td> 3</td> <td>4</td> <td>147</td></tr>
</tbody>
</table>
<p class="analysis">Hmm. Redwood City and Palo Alto don't see much daily use, but not much interesting here. <span class="strike">Let's look at a chart.</span> Let's look at an interactive chart.</p>
<p class="analysis">Here we see total daily rides for each category of user plotted across the entire timespan, similar to the chart at the top of the page. Use the radio buttons on top to examine different cities and the checkboxes below to highlight days with a factor that might have influenced riders.</p>
<div id="cityRadio">
<input type="radio" name="city" value="all" checked="true">All</input>
<input type="radio" name="city" value="sj">San Jose</input>
<input type="radio" name="city" value="rc">Redwood City</input>
<input type="radio" name="city" value="mv">Mountain View</input>
<input type="radio" name="city" value="pa">Palo Alto</input>
<input type="radio" name="city" value="sf">San Francisco</input>
</div>
<div id="focuschart"></div>
<ul class="ILL">
<li><input type="checkbox" id="avgCB">Show Average Rides</input></li>
<li><span id="reset" >Reset</span></li>
</ul>
<ul class="ILL">
<li><input type="checkbox" id="weekendCB">Weekends</input></li>
<li><input type="checkbox" id="holidayCB">Holidays</input></li>
</ul>
<ul class="ILL">
<li><input type="checkbox" id="rainCB">Rain</input></li>
<li><input type="checkbox" id="tempCB">Temp</input></li>
</ul>
<ul class="ILL">
<li><input type="checkbox" id="49ersCB">49ers Game</input></li>
<li><input type="checkbox" id="giantsCB">Giants Game</input></li>
<li><input type="checkbox" id="sharksCB">Sharks Game</input></li>
</ul>
<ul class="ILL">
<li><input type="checkbox" id="americasCB">America's Cup</input></li>
<li><input type="checkbox" id="bartCB">BART Strike</input></li>
<li><input type="checkbox" id="govCB">Government Shutdown</input></li>
</ul>
<span class="aside">Explore these factors closer on <a href="explore.html">the next page →</a></span>
<p class="analysis">If our generalization of users as commuters and tourists were true, we would predict use among subscribers to drop off when commuters don't go to work, and use among customers to increase when tourists are in town. With the overlays we can see this correlation holds true. Rides by subscribers in all cities decreased on weekends, over Thanksgiving, and between Christmas and New Year's. Rides by customers increased noticeably on weekends, holidays, and during the America's Cup finals.</p>
<p class="analysis">Rain seems to be a large deterrent to bike use; on rainy days system use falls almost as predictably as on weekends or holidays. Temperature, however, does not seem to have a major influence on riders. Bay Area temperatures being notoriously moderate, the lack of correlation is not surprising. </p>
<p class="analysis">We can't draw any conclusions about the influence of home games by the 49ers, Giants, or Sharks. Heading to a Giants game would seem to be an ideal use of the Bike Share, but during this period the only Giants home games overlapped with America's Cup races. There are a few dates where a Sharks game correlated with a slight rise above average usage, but such deviation from average is normal throughout this time period.</p>
<p class="analysis">Interestingly, neither the federal government shutdown of October nor the two BART worker strikes of October and December seemed to have a major influence. Rider numbers during the strikes remained similar to numbers from the weeks before and after the strikes. This poses an interesting question: do BART commuters use the Bike Share system? To begin looking for an answer, let's look at the numbers for rides to and from individual stations.</p>
<table class="tabRight ILL">
<caption>Most Popular Starting Stations</caption>
<thead>
<th>Station</th> <th>Rides</th>
</thead>
<tbody>
<tr> <td>San Francisco Caltrain (Townsend at 4th)</td> <td>9,838</td> </tr>
<tr> <td>Harry Bridges Plaza (Ferry Building)</td> <td>7,343</td> </tr>
<tr> <td>Embarcadero at Sansome</td> <td>6,545</td> </tr>
<tr> <td>Market at Sansome</td> <td>5,922</td> </tr>
<tr> <td>Temp. Transbay Term. (Howard at Beale)</td> <td>5,113</td> </tr>
<tr> <td>Market at 4th</td> <td>5,030</td> </tr>
<tr> <td>2nd at Townsend</td> <td>4,987</td> </tr>
<tr> <td>San Francisco Caltrain 2 (330 Townsend)</td> <td>4,976</td> </tr>
<tr> <td>Steuart at Market</td> <td>4,913</td> </tr>
<tr> <td>Townsend at 7th</td> <td>4,493</td> </tr>
</tbody>
</table>
<table class="tabRight ILR">
<caption>Most Popular Destinations</caption>
<thead>
<th>Station</th> <th>Rides</th>
</thead>
<tbody>
<tr> <td>San Francisco Caltrain (Townsend at 4th)</td> <td>11,637</td> </tr>
<tr> <td>Embarcadero at Sansome</td> <td>7,590</td> </tr>
<tr> <td>Harry Bridges Plaza (Ferry Building)</td> <td>7,475</td> </tr>
<tr> <td>Market at Sansome</td> <td>6,238</td> </tr>
<tr> <td>2nd at Townsend</td> <td>5,655</td> </tr>
<tr> <td>San Francisco Caltrain 2 (330 Townsend)</td> <td>5,112</td> </tr>
<tr> <td>Market at 4th</td> <td>5,109</td> </tr>
<tr> <td>Steuart at Market</td> <td>5,080</td> </tr>
<tr> <td>Townsend at 7th</td> <td>5,073</td> </tr>
<tr> <td>2nd at South Park</td> <td>4,431</td> </tr>
</tbody>
</table>
<span class="clear"></span>
<table class="ILR tabCenter">
<caption>San Francisco Caltrain <br/>(Townsend at 4th)</caption>
<thead>
<th>Most popular starting point</th> <th>Most popular destination</th>
</thead>
<tbody>
<tr> <td>9,838 rides</td> <td>11,637 rides</td> </tr>
</tbody>
</table>
<p class="analysis">Here we see the top 10 stations in the entire system to start or end a ride. Interestingly, the stations near Caltrain and the Ferry Building top the list. Both Harry Bridges Plaza and Steuart at Market are close to the Ferry Building. San Francisco Caltrain 1 and 2 obviously serve the Caltrain line. The station at Embarcadero at Sansome is next to Pier 27, where the America's Cup Pavilion was located.</p>
<p class="analysis">The Bike Share stations on the top ten lists within walking distance of a BART stop are Market at Sansome, Market at 4th, and Steuart at Market. Keep in mind that all these stops are also within close proximity to many points of interest in downtown San Francisco. </p>
<p class="analysis">Of course, the rest of the stations serving BART commuters could all be just outside the top ten. Let's take a look at the list of rides for all stations. BART stations are identified with blue text, Caltrain stations with red text.</p>
<p>Sort by:</p>
<div id="sort" class="center">
<input type="radio" name="stationSort" value="start" checked="true">Starting station</input>
<input type="radio" name="stationSort" value="end">Ending station</input>
</div>
<div id="hbar"></div>
<p class="analysis">While the BART stations are indeed all among the busiest stations, they are all in San Francisco, and their numbers don't set them apart from any other station within San Francisco. These stations are centrally located to a number of attractions in downtown San Francisco, so we would expect them to be highly trafficked. Bike Share riders docking at these stations are not necessarily transferring to or from BART. If there were a large contingent of BART commuters who used the Bike Share, we would expect to see rides to and from BART stations in numbers setting them above other destinations. With what we've seen here, it seems safe to conclude that BART commuters do not make up a large proportion of Bike Share riders.</p>
<p class="analysis">Is there any more evidence to back up this assumption? Let's take a look at the rides at stations serving different forms of commuting to and from San Francisco:</p>
<table id="commuter" class="tabCenter"><caption>Total rides at stations serving:</caption>
<thead> <th></th> <th>BART</th> <th>Caltrain</th> <th>Ferry Building</th> <th>Transbay Terminal</th></thead>
<tbody>
<tr> <td>Starting</td><td>20,919</td> <td>14,814</td> <td>12,256</td> <td>5,113</td></tr>
<tr> <td>Ending</td> <td>21,325</td> <td>16,749</td> <td>12,555</td> <td>4,356</td></tr>
</tbody>
</table>
<p class="analysis">These numbers are somewhat misleading since there are four stations within walking distance of a BART station, but only two each near Caltrain and the Ferry Building. Instead, let's look at average rides to/from these stations: </p>
<table id="commuteravg" class="tabCenter"><caption>Average rides at stations serving:</caption>
<thead> <th></th> <th>BART</th> <th>Caltrain</th> <th>Ferry Building</th> <th>Transbay Terminal</th> <th>All SF</th></thead>
<tbody>
<tr> <td>Starting</td><td>4,305</td> <td>7,407</td> <td>6,128</td> <td>5,113</td> <td>3,710</td></tr>
<tr> <td>Ending</td> <td>4,401</td> <td>8,374</td> <td>6,278</td> <td>4,356</td> <td>3,710</td></tr>
</tbody>
</table>
<p class="analysis">Bike Share stations serving BART do not seem to be more trafficked than other stations in San Francisco. Of course, we would be making a massive mistake if we were to assume that all rides to or from these stations started or continued with trips on the related commuter system. The Ferry Building is a major tourism destination, and every BART stop is within walking distance of numerous restaurants, museums, businesses, and MUNI rail and bus stops. The Caltrain station, on the other hand, is only within walking distance of a handful of other points of interest. When we compare the numbers for the Caltrain station to the average for all stations in San Francisco, it becomes clear that Caltrain commuters are making great use of the bike share system.</p>
<hr/>
<p class="analysis">Each trip within Bike Share is recorded by starting station and ending station, so in addition to examining the popularity of stations, we can examine the popularity of individual routes. In the heatmap below, starting stations are listed in rows and the ending stations in columns. The total number of rides from least to most along each route is indicated by the cell color, lightest to darkest. </p>
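<p class="analysis">The per-route totals behind a heatmap like this can be sketched as a nested count keyed by starting and ending station. The sample trips and the field names <code>start</code> and <code>end</code> are made up for illustration and are not the dataset's actual columns.</p>
<pre>
```javascript
// Sketch: count rides per (start, end) pair, the numbers a heatmap cell shows.
// The trips and the field names `start` and `end` are made up for illustration.
var trips = [
  { start: "A", end: "B" },
  { start: "A", end: "B" },
  { start: "B", end: "A" },
  { start: "A", end: "A" }
];

// countRoutes is a hypothetical helper, not taken from this page's scripts.
function countRoutes(trips) {
  var counts = {}; // counts[start][end] = number of rides on that route
  trips.forEach(function (t) {
    counts[t.start] = counts[t.start] || {};
    counts[t.start][t.end] = (counts[t.start][t.end] || 0) + 1;
  });
  return counts;
}

var routes = countRoutes(trips);
console.log(routes.A.B, routes.B.A, routes.A.A);
```
</pre>
<p class="analysis">Rows of the heatmap correspond to the outer key and columns to the inner key, so an asymmetric pair such as A to B versus B to A shows up as two cells of different shade.</p>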
<div id="tooltip" class="hidden">
<p><span id="value"></span></p>
</div>
<div id="heatRadio">
<input type="radio" name="heat" value="all" checked="true">All</input>
<input type="radio" name="heat" value="sj">San Jose</input>
<input type="radio" name="heat" value="rc">Redwood City</input>
<input type="radio" name="heat" value="mv">Mountain View</input>
<input type="radio" name="heat" value="pa">Palo Alto</input>
<input type="radio" name="heat" value="mvpa">MV/PA</input>
<input type="radio" name="heat" value="sf">San Francisco</input>
</div>
<div id="heat"></div>
<p class="analysis">On the heatmap diagram of systemwide rides we see that activity tends to be grouped into squares. These are trips that took place within city boundaries, and we notice that not many riders go beyond their starting city. The exception is trips between Mountain View and Palo Alto: 191 rides went from one city to the other. </p>
<p class="analysis">Looking at the station heatmap for San Francisco, we note that riders leaving from the most popular station, SF Caltrain, disperse throughout the system. Riders heading to SF Caltrain similarly tend to come from throughout the system. </p>
<p class="aside ILL">Two popular routes which seem to make up a round trip are Townsend at 7th to SF Caltrain and SF Caltrain 2 back to Townsend at 7th. Near Townsend at 7th are an expo center and a handful of influential tech companies including Adobe, Heroku, Citrix, Advent, and Zynga. This could reflect a corporate membership program popular among commuting employees, a conference well-attended by peninsula-dwellers, or more simply, a dearth of other forms of transportation to the area. </p>
<p class="aside ILL"> The most ridden route, Harry Bridges Plaza (Ferry Building) to Embarcadero at Sansome, does not have a return route with comparable numbers, indicating that riders tended to ride to Embarcadero at Sansome and continue their journey elsewhere rather than return to the Ferry Building. Embarcadero at Sansome is the northernmost station along the Embarcadero, closest to tourism-heavy Pier 39 and Fisherman's Wharf. The bike path heading north along the Embarcadero is also much more bicycle-friendly than the southern route. </p>
<table class="tabRight center ">
<caption>Most Traveled Routes</caption>
<thead>
<th>From</th> <th>To</th> <th>Rides</th>
</thead>
<tbody>
<tr> <td>Harry Bridges Plaza (Ferry Building)</td> <td>Embarcadero at Sansome</td> <td>1,330</td></tr>
<tr> <td>Townsend at 7th</td> <td>San Francisco Caltrain (Townsend at 4th)</td> <td>1,322</td></tr>
<tr> <td>San Francisco Caltrain 2 (330 Townsend)</td> <td>Townsend at 7th</td> <td>1,116</td></tr>
<tr> <td>Market at Sansome</td> <td>2nd at South Park</td> <td>866</td> </tr>
<tr> <td>Embarcadero at Sansome</td> <td>Steuart at Market</td> <td>811</td> </tr>
<tr> <td>2nd at South Park</td> <td>Market at Sansome</td> <td>798</td> </tr>
<tr> <td>San Francisco Caltrain (Townsend at 4th)</td> <td>Harry Bridges Plaza (Ferry Building)</td> <td>782</td> </tr>
<tr> <td>2nd at Townsend</td> <td>Harry Bridges Plaza (Ferry Building)</td> <td>757</td> </tr>
<tr> <td>Steuart at Market</td> <td>Embarcadero at Sansome</td> <td>717</td> </tr>
<tr> <td>Harry Bridges Plaza (Ferry Building)</td> <td>2nd at Townsend</td> <td>710</td> </tr>
</tbody>
</table>
<hr />
<h2>In Conclusion</h2>
<p class="analysis">We saw here that Bike Share is used mostly in San Francisco, by commuters, when it isn't raining, for rides under 15 minutes. It looks like BART commuters don't use the system to the extent that Caltrain commuters do. On weekends and holidays, visitors to San Francisco use the system to ride around town.</p>
<p class="analysis">One dimension not explored in this analysis was popularity of each station among subscribers vs customers. With what we saw of the behavior of customers vs subscribers we could identify stations more popular with tourists or with commuters and potentially identify areas of the city with demand for future stations.</p>
<p class="analysis">One recommendation that can be made is for station growth into SOMA. Users are already shown to be commuters heading to work, and SOMA is one area of San Francisco with a high density of businesses and close proximity to the existing stations, but it currently lacks any Bike Share stations.</p>
<h2>Post-contest Addendum</h2>
<p class="analysis">After submitting this entry to the bike share contest, I kept thinking to myself, "It doesn't make sense that BART riders don't make up a big group of Bike Share users. How else could I show where the Subscribers come from?" It turns out that information was in the original dataset all along, so I made <a href="map.html">a map of subscribers by zip code.</a></p>
<p class="analysis">The map shows that many riders are subscribers who live in the East Bay, near a BART station. Clearly, some more analysis is in order to figure out why the BART Bike Share stations don't stand out among other San Francisco stations.</p>
</div>
<footer>
I did not perform any actual statistical tests; with figuring out D3, I didn't have time to re-learn t-tests and whatnot. I would maybe be interested in doing that to see if my conjectures have any basis in math. <br />
Made with <a href="http://www.d3js.org">D3</a> and <a href="http://www.openoffice.org/">Open Office</a>. Data from the <a href="http://bayareabikeshare.com/datachallenge">Bay Area Bike Share Data Challenge</a>.
</footer>
<script src="js/main.js"></script>
<script src="js/indepth.js"></script>
</body>
</html>