Hamilton should embrace an open, shared data platform that increases participation and transparency while delivering better, more usable information to the public.
By Ryan McGreal
Published June 05, 2009
It's time for the City of Hamilton to start publishing its publicly available data in an open, accessible format instead of today's hodgepodge of closed, clunky and idiosyncratic legacy formats that are hard to find and even harder to use.
Have you ever tried to find, well, anything on the City of Hamilton website? The design is so broken it almost seems as though it were intended specifically to discourage any citizen who dares to try and look something up.
Both the site search engine and the labyrinthine, deeply nested hierarchy of departments and offices are painful to use, but the search function warrants special mention for its sheer inability to deliver results that match what you're trying to find. No matter the search terms, it invariably dumps out an eyeball-searing block of links to committee meeting agendas instead of the official page that explains the policy, displays the results or links to the appropriate forms.
A Google site-specific search generally does a better job of finding relevant links than the city's own search engine, which brings me to my thesis: Hamilton should embrace an open, shared data platform that increases participation and transparency while delivering better, more usable information to the public.
Here's a case study to illustrate what I have in mind. Every month, a city employee sends out a summary report of monthly building permit activity to an email distribution list. The report includes a summary table of permits issued and total building value by category (residential, commercial, industrial, institutional and miscellaneous) and sub-category (e.g. "New one- and two-family dwellings", "Alteration to one- and two-family dwellings").
Significantly, it is published as a PDF, a format that is designed to preserve document design elements in print.
Unfortunately, PDF is awkward to access electronically. You generally need to fire up a separate PDF reader, like Foxit Reader for Windows or Evince for Linux, just to open the file. It's cumbersome to highlight and copy text (and many PDFs are next to impossible to copy) and very difficult to preserve tabular formatting.
If you want to create or modify a PDF, it's even more difficult. There are software applications (both proprietary and open source) to do this, but they require specialized, application-specific knowledge and lots of manual intervention.
So when I decided to create a web-based summary graph of building activity for the past five years, I had to download five years' worth of Building Activity reports (all PDF documents) one by one, open each one manually, scroll down to the data page and type out the values into a separate document.
Then I uploaded my data table into a database (where this kind of data belongs) and wrote a quick web script to display the data dynamically in an HTML-based report:
I know this report isn't fancy, but it is accessible to anyone with a browser (and it updates automatically when I insert the next month's data). Even if you are visually impaired, your browser's screen reader can read the numeric data table.
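As a rough illustration of the database-to-HTML step described above, here is a minimal Python sketch. It is not the script behind the report: the table layout, the column names and every figure in `SAMPLE_ROWS` are placeholders invented for the example, and SQLite stands in for whatever database was actually used.

```python
import sqlite3
from html import escape

# Hypothetical schema and placeholder figures, NOT the city's real numbers.
# Columns: month, category, permits issued, total building value in dollars.
SAMPLE_ROWS = [
    ("2009-04", "Residential", 120, 15000000),
    ("2009-04", "Commercial", 30, 8000000),
    ("2009-05", "Residential", 140, 18000000),
]


def build_database(rows):
    """Load the monthly figures into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE permits (month TEXT, category TEXT, "
        "permits INTEGER, value INTEGER)"
    )
    conn.executemany("INSERT INTO permits VALUES (?, ?, ?, ?)", rows)
    return conn


def render_report(conn):
    """Return an HTML table of total permits and building value per month."""
    cursor = conn.execute(
        "SELECT month, SUM(permits), SUM(value) FROM permits "
        "GROUP BY month ORDER BY month"
    )
    lines = [
        "<table>",
        "<tr><th>Month</th><th>Permits</th><th>Value ($)</th></tr>",
    ]
    for month, permits, value in cursor:
        lines.append(
            f"<tr><td>{escape(month)}</td><td>{permits}</td>"
            f"<td>{value:,}</td></tr>"
        )
    lines.append("</table>")
    return "\n".join(lines)


conn = build_database(SAMPLE_ROWS)
report_html = render_report(conn)
print(report_html)
```

Because the table is generated from a query each time, inserting the next month's rows is enough to update the output, which is the behaviour described above.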
More importantly, it is also available in a machine accessible format. That means search engines can find and index it, and computer programs can parse the data automatically so third parties can easily create their own reports.
Now, this report is just a proof of concept, and it would certainly benefit from the ability to drill down to a single month to see the activities by category and subcategory. My point is that if the city made its public data available in a web-based application programming interface (API), I could spend my time building reports on the data rather than spending it having to extract the data manually from inaccessible print formats.
Incidentally, it took at least twice as long to get the building activity data for this report as it actually took to write the code that produces the report. Worse, someone at the city actually performed extra work to present the data in a format that required me to perform extra work to get the data back out again: a compounded diseconomy.
Manually copying data from one format to another also introduces two opportunities for errors to creep into the report: typos and copying errors, and forgetting to do the manual update.
An open API would eliminate these problems, in addition to dramatically reducing the amount of time and effort involved in finding, analyzing and presenting city data.
The sample report I wrote above is presented in HTML, which is a simple, open standard for data presentation that any browser can read and display. Most programming languages include HTML parsing functions, so it's machine-readable as well as human-readable.
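To make the machine-readable point concrete, here is a small sketch that uses Python's standard `html.parser` module to pull the rows out of an HTML table. The `TableParser` class and the markup in `SAMPLE_HTML` are assumptions made up for the example; the real report's markup may differ, and the figures are placeholders.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []        # completed rows, each a list of cell strings
        self._row = None      # the row currently being read, if any
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        # Only text inside a cell is kept; whitespace between tags is ignored.
        if self._in_cell:
            self._row[-1] += data.strip()


# Placeholder markup and figures, invented for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Month</th><th>Permits</th></tr>
  <tr><td>2009-04</td><td>150</td></tr>
  <tr><td>2009-05</td><td>140</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
header, *records = parser.rows
table = [dict(zip(header, record)) for record in records]
print(table)
```

Note that every value comes back as a string and the parser has to be told what a row and a cell look like, which hints at why a format built for documents is a clumsy fit for data.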
The problem with HTML is that its syntax was designed for documents (with headings, subheadings, paragraphs, lists, and so on) and not other structured collections of data objects.
It would be even more useful to third parties if it were in a more structured data format, like JSON or XML. I much prefer JSON to XML because it's a) far less verbose and, more important, b) human-readable as well as machine-readable.
Like HTML, these formats are based on open standards and can be parsed by a variety of different programming languages. They don't lock analysts or report developers into a particular language or platform.
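For comparison, here is what a single building-permit record might look like in each format, parsed with nothing but Python's standard library. The field names and the numbers are hypothetical, chosen only to mirror the categories in the monthly report; the city publishes no such feed today.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; field names and figures are invented for illustration.
RECORD_JSON = """
{
  "month": "2009-05",
  "category": "Residential",
  "subcategory": "New one- and two-family dwellings",
  "permits": 42,
  "value": 9500000
}
"""

# The same hypothetical record expressed as XML.
RECORD_XML = """
<record>
  <month>2009-05</month>
  <category>Residential</category>
  <subcategory>New one- and two-family dwellings</subcategory>
  <permits>42</permits>
  <value>9500000</value>
</record>
"""

# JSON distinguishes strings from numbers, so one call yields typed values.
from_json = json.loads(RECORD_JSON)

# XML element text is always a string; numeric fields are converted by hand.
root = ET.fromstring(RECORD_XML.strip())
from_xml = {child.tag: child.text for child in root}
for field in ("permits", "value"):
    from_xml[field] = int(from_xml[field])

print(from_json == from_xml)
```

Both routes end in the same dictionary, so a third party could consume either format; the JSON version simply needs less markup and no manual type conversion.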
There are a lot of developers and hackers - i.e. programmers and computer hobbyists who like to tinker and who enjoy exploring computer systems and solving problems - in Hamilton and beyond. That community is going to be smarter and more creative at digging into the city's public data and finding innovative ways to combine and interact with it than any one person or formal organization.
Any individual contribution to such a collaborative community endeavour must necessarily be modest, if only because the collaborative platform itself allows for a distributed, iterative approach to application development that beggars the creative potential of its participants in isolation.
Consider the Linux operating system and its vast ecology of distributions and free and open source applications, none of which would be possible if not for the fact that both the philosophy of open sharing and the collaborative platform of the internet are so effective at combining and aggregating the individual efforts - most of them quite modest on their own - of large groups of people.
Linux machines serve the majority of websites on the internet (and open source web servers, web programming languages and database servers manage most of its content). They are incorporated into computers ranging from large, industrial strength servers right down to compact hand-held devices and everything in between.
We already have an open source success story close to home: Hamilton entrepreneur Bob Young, owner of the Ti-Cats, made the fortune he has been willing to sink into that franchise by giving away a free, community-developed distribution of Linux called Red Hat and selling customer support and quality assurance guarantees.
By making the city website into an open platform, the city could trigger a real renaissance in citizen engagement and increased transparency. Tim O'Reilly, the founder of O'Reilly Media and a long-standing proponent of free and open source software, recently made a presentation on Government-as-Platform, in which he argued that successful technology platforms:
This may be O'Reilly's most important observation:
If you're really building a platform, your customers and partners build new features before you do.
My own contribution to such an ecology of open government applications must necessarily be modest, not only because of the nature of open source development but also because there are a lot of programmers out there who are a lot smarter than me. :) Nevertheless, here's one obvious idea that shouldn't be hard to implement: a live HSR bus map.
The city is purchasing new GPS systems for all its buses and will have them installed by the end of the year. If the city provides real-time GPS data for its bus fleet in a web-based API, someone can create a Google Maps mashup that places the buses on the map in their actual current position and lets users click on a bus to see its identity, route and schedule.
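To give a sense of how little glue code such a mashup would need, here is a sketch that converts a bus-position feed into GeoJSON, an open format that most web mapping tools can display. Everything about the feed is an assumption: the city has published no such API, and the field names, bus numbers, route labels and coordinates in `SAMPLE_FEED` are invented placeholders.

```python
import json

# Hypothetical feed payload. Field names, bus IDs, route labels and positions
# are invented for illustration; no such city API exists yet.
SAMPLE_FEED = """
[
  {"bus_id": "0501", "route": "Route A", "lat": 43.2557, "lon": -79.8711},
  {"bus_id": "0714", "route": "Route B", "lat": 43.2609, "lon": -79.9192}
]
"""


def to_geojson(feed_text):
    """Convert a list of bus positions into a GeoJSON FeatureCollection."""
    buses = json.loads(feed_text)
    features = []
    for bus in buses:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON orders coordinates as [longitude, latitude].
                "coordinates": [bus["lon"], bus["lat"]],
            },
            # Properties are what a map would show when a bus is clicked.
            "properties": {"bus_id": bus["bus_id"], "route": bus["route"]},
        })
    return {"type": "FeatureCollection", "features": features}


bus_layer = to_geojson(SAMPLE_FEED)
print(json.dumps(bus_layer, indent=2))
```

A map page would fetch the live feed on a timer, run it through a function like `to_geojson` and redraw the markers; the hard part is getting the data published, not writing the code.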
Again, the city could hire a consultant to produce such a tool. It would probably end up costing a lot of money, not working very well and having extremely limited responsiveness to the feedback of the user base.
Why pay for second-rate software when the city can encourage its own citizens and residents to create and improve a similar application simply out of a joy for creation, sharing and participation in the public weal?
Open Government is an idea whose time has come.
The basket of related technologies is mature enough to support a robust government-as-platform API for public data. The open source community development model is already proven to work. Switching to open source software and turning data analysis into a community project would actually reduce the city's expenses while providing better, easier to find information and greater transparency and accountability.
One of US President Barack Obama's first acts in office was to appoint a Chief Technology Officer to move the government to a working model in which the business of government is transparent, participatory and collaborative.
A major inspiration for the US government's commitment to openness is the Eight Principles of Open Government Data developed by the Open Government Working Group. I'll close this essay with the list of eight principles:
Government data shall be considered open if it is made public in a way that complies with the principles below:
- Complete: All public data is made available. Public data is data that is not subject to valid privacy, security or privilege limitations.
- Primary: Data is as collected at the source, with the highest possible level of granularity, not in aggregate or modified forms.
- Timely: Data is made available as quickly as necessary to preserve the value of the data.
- Accessible: Data is available to the widest range of users for the widest range of purposes.
- Machine processable: Data is reasonably structured to allow automated processing.
- Non-discriminatory: Data is available to anyone, with no requirement of registration.
- Non-proprietary: Data is available in a format over which no entity has exclusive control.
- License-free: Data is not subject to any copyright, patent, trademark or trade secret regulation. Reasonable privacy, security and privilege restrictions may be allowed.
By jason (registered) | Posted June 05, 2009 at 13:16:23
preach it brother! I was just doing a search today on the city's website and scrolled through about 5 pages of stuff that had nothing to do with what I was looking for. I finally gave up, went to Google and got better results on the first page.
This stuff you're talking about drives me nuts.
What's with all the PDFs??
Or my favourite - when they have PDFs from a public meeting but they have all the pages sideways. You not only have to open another reader, you need to start flipping around all the pages, or turn your computer sideways to read them.
Another beauty (this might just be due to the fact that I'm in Safari, on a Mac - but still, get with the times. Lots of people have Macs) is when you type something in the search field and hit enter. It brings you back to the city's homepage with no results. I need to actually click on the little 'ok' button beside the search field to bring up results. This might be the only website left on planet earth that doesn't allow me to hit 'enter'.
Hopefully some IT guy at city hall reads your piece. If they're going to bother having a web presence and make info available to the public, at least pretend it's not 1993 anymore and do it right.
By leadorlag (anonymous) | Posted June 05, 2009 at 15:19:09
Open and participatory is the way of the future. We can embrace it, or we can get dragged kicking and screaming after everyone else is already on board.
Lead or lag.
By Creativist (anonymous) | Posted June 05, 2009 at 16:18:14
mcgreal for ecdev manager! really enjoying your "creative city" series, can't wait to see where you take it next. hopefully someone at city hall is paying attention, our future is definitely not more big box stores on farmland, it's creative industry downtown. too bad we've ignored that area so long that we've fallen behind.
By synxer (registered) | Posted June 05, 2009 at 16:40:56
As a programmer myself, I definitely agree.
By adrian (registered) | Posted June 05, 2009 at 22:44:34
Fancy? Shmancy! That graph is awesome! Very nice work.
You're bang on with this post. It would take a complete change in mindset, I think, for our municipal government to start looking at their data as our data (which, of course, it is) and giving us good ways of getting it. But I think it's possible. I've been impressed by some of the visionary thinking I've seen from certain individuals well-placed at City Hall and I think there's a real opportunity here.
I would also note that it's not just the City that should provide this information; it's every government-funded agency and non-profit as well. There are many organizations who look at their data as their own strategic information and guard it jealously, even though they are partly or entirely funded by taxpayers.
By giving us a good way to retrieve information, we can go ahead and formulate it into knowledge, which benefits everyone.
I'd be curious to see what kind of official response you'd get if you broached this with the city.
By Brandon (registered) | Posted June 06, 2009 at 07:59:59
I'd be shocked if it went anywhere. Information is power and the interpretation of that information is what lets people get done what they want to get done. Allowing just anyone (who isn't willing to put in a ridiculous amount of work) to access that information means other interpretations of the data become common.
Fortunately we have people like Ryan who enjoy putting in ridiculous amounts of work for this sort of thing. :)
By faragol (registered) | Posted June 06, 2009 at 08:43:36
I program for a living right here in Hamilton and I would be very interested in getting my hands on the public data of our city.
This was a very good post and I hope something comes out of it.
By Oldcoder (anonymous) | Posted June 06, 2009 at 10:17:09
Another thing that's implied in the article but that you didn't make explicit is that this could bring Hamilton's programmers together to a critical mass of contacts that could lead to more local business startups and programming jobs.
There are lots of programmers in Hamilton but like me they need to leave town to work - to Toronto, Mississauga, Waterloo, etc. But I'm old enough to remember the pre-Web days when Hamilton was a hotspot of BBS's and local PC hobbyists - we had one of the highest concentrations of local computer networks and I'm wondering where all those nerds went after Hamilton started to fall behind.
I see this as a chance to come from behind and get back out in front of the curve again! Open source city is still a new concept, maybe we won't be the last to the party this time.
By logonfire (registered) | Posted June 06, 2009 at 22:30:19
A great post.
Why don't you send it to every politician on Council and our Federal and Provincial members. It might start them thinking!
I am no computer expert but I think a bunch of computer savvy people could join forces and create Wikihamilton!!
Who's in for that and is there a volunteer leader ready to step forward?
By grassroots are the way forward (registered) | Posted June 07, 2009 at 00:39:11
Question: could you not create a key to identify groups? What I think and what could be the reality could be 2 different things. So does red represent residential and so on down the line?
By adrian (registered) | Posted June 07, 2009 at 11:08:11
Stumbled across an interesting article that relates to this subject, addressing the difficulties in creating commonalities when sharing and publishing data:
By Democracy! (anonymous) | Posted June 07, 2009 at 21:11:49
This will terrify the people who benefit from Hamiltonians not really knowing what's going on and not being able to find out. All the more reason to do it ASAP.
By UrbanRenaissance (registered) | Posted June 08, 2009 at 08:33:17
Fantastic idea Ryan.
Just imagine a city where how Councillors have voted is posted online so we could actually know what they believe in, as opposed to what they tell the media they do. Not to mention their expense reports, political donations received and the like, all just waiting to be tabulated and dissected by anyone with an internet connection. Now that's what I call accountability!
And as Oldcoder said, it's also a great way for us programmers to network (sorry for the computer pun) and maybe bring the tech industry back to Hamilton.
By Mahesh P. Butani (anonymous) | Posted June 08, 2009 at 10:08:58
Congratulations Ryan on bringing forth this thought!
I believe it is from this kind of thinking that evolutionary change will be driven in our region.
It is refreshing to see an 'idea', and not politics or predispositions that is calling for change! This in itself is a remarkable and pioneering stance - besides the obvious immense benefits, naturally springing from the adoption of your proposal at every level in our community.
Discovering emergent patterns in our region is the first step towards nudging fossilized or just tired mindsets towards change, and possibly even breaking down the misguided silos and gated communities in our midst.
Our collective future as a community depends on strategic moves such as this - and I am confident that our politicians and staff at City Hall will embrace this 'idea with legs' - and enable you and all those here - in developing a rapid plan of action to make it a reality.
Wasting away such positive thinking and collective enthusiasm in politically funded project reports, or public-private technology round-tables, is something our political and community leaders simply cannot afford with this 'idea' in light of the oncoming election season.
Looking forward to seeing more such ideas in the coming months!
Mahesh P. Butani
By zookeeper (registered) | Posted June 08, 2009 at 16:16:39
Speaking as a non programmer, I'm all for letting the nerds get their hands into the city's data. I bet we'd discover all kinds of fascinating stuff once we start putting the numbers together instead of having them all spread out in different files that always crash my friggin' browser (maybe one of the nerds can help me with this...?) when I innocently click on them.
Here's the thing: those people working for the city that really want to do a good job will cheer this because it will make it easier for them to find out what they need to do their jobs well. Those people that don't care or who are in the city to play politics instead of serve the public will fight this tooth and nail....