Have you ever had a shit-tonne of documents dumped into your inbox, with an impossible deadline for sucking out the hidden juicy bits? Or maybe you have had the joy of discovering a dump of an MILF's emails, diplomatic cables, or the code of an evil corporation's website? At moments like those, you might have uttered, "fcuk! ... Omne Ignotum Pro Magnifico!". Wouldn't it be nice if the needles just magically popped out of the haystack? Meet Reveal (clickable prototype) - a software framework that aspires to achieve that and maybe a bit more.
Background:
While sed/awk/grep-ing the cablegate files, I stumbled upon a cable that mentioned Kofi Annan asking Robert Mugabe to step down in exchange for a handsome retirement plan during the Millennium summit. Being an ignorant bloke, I could hardly recall what the Millennium summit was about, had no clue whether Mugabe was still in office, and did not know whether Kofi Annan had ever commented on this! Without the right background and context, I could not appreciate the data to its full extent. Below is my #MozNewsLab final project idea pitch, in light of this week's three speakers: Chris Heilmann, John Resig and Jesse James Garrett.
"What is this thing for? What does it do? How is it supposed to fit into people's lives?", @Jesse James Garrett:
Journalists get an amazing amount of digital data every day in the form of numbers in tables. With some spreadsheet skills or help from newsroom programmers, they produce incredible revelations about the reality hiding behind those numbers. However, when the data comes as unstructured text files written in natural language, there isn't much algorithmic help available other than full-text searches with a list of guess words. Using cutting-edge information retrieval techniques, Reveal aims to build a framework that automatically annotates names, locations, dates etc. in unstructured text files.
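To make the annotation step concrete, here is a minimal sketch in Python. It is not Reveal's actual extraction code: the two regular expressions and the `annotate` function are hypothetical stand-ins, and a real system would use a trained named-entity recognizer rather than patterns this naive.

```python
import re

# Hypothetical, deliberately naive patterns (an assumption for illustration,
# not the information extraction libraries Reveal would actually use).
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
# Matches "September 2000" or "8 September 2000".
DATE_RE = re.compile(rf"\b(?:\d{{1,2}} )?(?:{MONTHS}) \d{{4}}\b")
# Matches two or more consecutive capitalized words, e.g. "Kofi Annan".
NAME_RE = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def annotate(text):
    """Return (label, matched_text, start_offset) tuples ordered by position."""
    found = [("NAME", m.group(), m.start()) for m in NAME_RE.finditer(text)]
    found += [("DATE", m.group(), m.start()) for m in DATE_RE.finditer(text)]
    return sorted(found, key=lambda item: item[2])


if __name__ == "__main__":
    cable = "Kofi Annan met Robert Mugabe in September 2000."
    for label, value, offset in annotate(cable):
        print(f"{offset:>3}  {label:<5} {value}")
```

Even a toy like this shows the shape of the output the rest of the pipeline needs: a list of typed entities with their positions in the document.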
"Adopting Open Source, Open standards", @Chris Heilmann:
Having been baptized by St. IGNUcius, I want the idea of Free as in Freedom to run through the core of Reveal's technology stack: a standard LAMP stack on the server side, a UI powered by HTML5, CSS3 and jQuery plugins, and a number of open source libraries for the information extraction. A long post describing the information retrieval technology is coming soon. (Mind map above).
Using the detected names, locations, dates etc., Reveal will try to aggregate additional information in the form of images, maps, news articles, videos, Wikipedia pages, visualizations etc. via open APIs and use them as navigational elements to browse the data. Juxtaposed with the document under scrutiny, these will provide the right context to gauge the sensitivity of the information.
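As one possible illustration of the aggregation step, the sketch below looks up a detected entity through the public MediaWiki search API on Wikipedia. The function names, the choice of Wikipedia as the only source, and the User-Agent string are my assumptions for the example; the pitch does not fix which APIs Reveal would call.

```python
import json
import urllib.parse
import urllib.request

# Public MediaWiki API endpoint for English Wikipedia.
WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"


def build_search_url(entity, limit=3):
    """Build a MediaWiki full-text search URL for one detected entity."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": entity,
        "srlimit": limit,
        "format": "json",
    }
    return WIKIPEDIA_API + "?" + urllib.parse.urlencode(params)


def fetch_context(entity, limit=3):
    """Return titles of Wikipedia pages matching the entity (needs network)."""
    request = urllib.request.Request(
        build_search_url(entity, limit),
        # Hypothetical User-Agent; replace with your own contact details.
        headers={"User-Agent": "reveal-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        data = json.load(response)
    return [hit["title"] for hit in data["query"]["search"]]


if __name__ == "__main__":
    print(build_search_url("Kofi Annan"))
```

Calling `fetch_context("Millennium Summit")` would return a handful of page titles that could be shown next to the cable as navigational elements; images, maps and news articles would need their own sources in the same style.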
"User to Contributor", @John Resig:
Additionally, by showing a relative score of "How much does the world know?", calculated from the aggregated information published before the documents surfaced, we can encourage news readers to share the information across their own social networks. Add some game mechanics by quantifying that "sharing", and we bust the filter bubble of ignorant blokes and turn them into responsible citizens who'll raise their voices against the wrongdoings of totalitarian regimes, evil corporations or other bad asses. This will lead to the creation of more content, which feeds back into the background and context aggregation step described earlier.
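The pitch does not yet define how the "How much does the world know?" score is computed, so the following is only one hypothetical way to do it: take the publication dates of the aggregated items and report the share that appeared before the documents surfaced. Both the formula and the `world_knowledge_score` name are assumptions made for this sketch.

```python
from datetime import date


def world_knowledge_score(published_dates, surfaced_on):
    """Share (0.0 to 1.0) of aggregated items published before the leak.

    Hypothetical formula: a low score suggests the world knew little about
    the topic before the documents surfaced.
    """
    if not published_dates:
        return 0.0
    earlier = sum(1 for published in published_dates if published < surfaced_on)
    return earlier / len(published_dates)


if __name__ == "__main__":
    # Made-up publication dates, purely for illustration.
    items = [
        date(2000, 9, 8),
        date(2008, 6, 1),
        date(2010, 12, 5),
        date(2011, 1, 10),
    ]
    print(world_knowledge_score(items, surfaced_on=date(2010, 11, 28)))
```

A score like this is easy to show as a gauge next to each document, and the same counts could later feed the game mechanics around sharing.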
That is my final software idea pitch, inspired by Chris Heilmann, John Resig and Jesse James Garrett, for #MozNewsLab Week 2.




2 comments:
Some good thinking here, Tathagata. Would enjoy hearing / seeing what you think the "Minimum Viable Product" feature set would be.
Phillip.
I finally got the frontend code somewhat working: http://tathagata.github.com/moznewslab/reveal ... but it's so far away from the MVP - phew, I'll never have the patience to make nice frontends.