The Sunlight Foundation, a non-profit aimed at showing how corporate interests influence government, released a pretty sweet tool for citizens and big data nerds on Monday. The tool, called Capitol Words, monitors which legislators say certain phrases, and how often, in an effort to track how those phrases enter and influence the political debate. Capitol Words is one of those random tools that give us a glimpse of how cheap computing and better data analytics can change business as usual in politics.
Already, the combo of cheap computing and big data is changing how retail firms set prices, offering insights into healthcare, and helping investors maximize rental income, so the notion that it could lead to more government transparency isn’t all that crazy. In the case of Capitol Words and the Sunlight Foundation, the goal is to analyze what legislators say on the floor of the House and Senate, parsing the text data generated daily by the Congressional Record to track how an idea filters through a political party, a region or a debate.
Tom Lee, the Director of Sunlight Labs at the Sunlight Foundation, said in an interview that the amount of data isn’t huge (about 50 or 60 gigabytes a day), but the text does need to be parsed before it can be made into something useful. So the Sunlight Foundation has developed algorithms and techniques for working with the data, many of which it releases on GitHub. It does the calculating and analysis on Amazon’s Elastic MapReduce service and then uses Solr, an open-source search platform, to process people’s queries against the records. The database supporting the tool has upwards of 20 million records.
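To make the parse-then-count idea concrete, here is a minimal sketch of a map/reduce-style phrase count in plain Python. It is not Sunlight's code: the record layout (`speaker`, `text`), the function names and the two sample speeches are all made up for illustration, and a real job on Elastic MapReduce would distribute the map and reduce steps across machines rather than run them in one process.

```python
from collections import Counter
from itertools import chain


def ngrams(text, n):
    """Split text into lowercase words and return every run of n words."""
    words = [w.strip('.,;:!?"()').lower() for w in text.split()]
    words = [w for w in words if w]
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def map_speech(record, n=2):
    """Map step: emit ((speaker, phrase), 1) for each phrase in one speech."""
    return [((record["speaker"], phrase), 1)
            for phrase in ngrams(record["text"], n)]


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each (speaker, phrase) key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


# Hypothetical sample records, not real Congressional Record text.
records = [
    {"speaker": "Rep. A",
     "text": "We must end the death tax. The death tax hurts farmers."},
    {"speaker": "Sen. B", "text": "The estate tax is fair."},
]

counts = reduce_counts(chain.from_iterable(map_speech(r) for r in records))
print(counts[("Rep. A", "death tax")])  # prints 2
```

Keying the counts on both speaker and phrase is what lets a tool like this answer "who says it, and how often" from a single pass over the text.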
“The speech used by legislators is used to advance causes and manipulate the public,” Lee said. “And how their speech is similar or different can show how particular terms originate from some political messaging memo.”
Lee said the original version of the project in 2008 ran the search and the data parsing in parallel, but that approach was too compute-intensive and didn’t allow for the richness of results the project can offer today now that the two steps are split up. However, he didn’t rule out going back to running the jobs in parallel eventually as the data stores become larger and queries become more complex.
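The split Lee describes can be pictured as an offline step that parses everything once and an online step that only does lookups. The sketch below is an illustration under stated assumptions, not the project's implementation: the in-memory dictionary stands in for a search index such as Solr, and the record fields and sample text are hypothetical.

```python
from collections import defaultdict


def build_index(records):
    """Offline step: parse each record once, storing two-word phrase counts by date."""
    index = defaultdict(lambda: defaultdict(int))
    for record in records:
        words = record["text"].lower().split()
        for first, second in zip(words, words[1:]):
            index[f"{first} {second}"][record["date"]] += 1
    return index


def query(index, phrase):
    """Online step: answer from the prebuilt index without re-parsing any text."""
    return dict(index.get(phrase.lower(), {}))


# Hypothetical sample records, not real Congressional Record text.
records = [
    {"date": "2011-12-01", "text": "job creators need relief"},
    {"date": "2011-12-02", "text": "protect job creators and job creators only"},
]

index = build_index(records)
print(query(index, "job creators"))
```

Because the expensive parsing happens ahead of time, each query costs a single lookup no matter how much text was indexed.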
The Sunlight Foundation makes its findings and data available via a JSON API so others can build on them. It’s also hoping to expand beyond floor speeches to politicians’ appearances on talk shows and other venues. It hopes to create other services by tying these political sound bites to its repository of funding data, which tracks which lobbying groups and individuals politicians accept money from. It has over a terabyte and a half of data on hand to work from.
And for those eagerly watching how our government attempts to become more transparent and share data, the Foundation is also working with the Government Printing Office, which publishes the Congressional Record, to get the document into a more web-friendly, structured format. That would help the Capitol Words project become more useful and help others build their own data analytics based on what’s said in Congress. Right now, the esoteric (and somewhat stilted) debate most often reaches the general public when The Daily Show mocks it. While Jon Stewart may be funny, he doesn’t offer the ability to track ideas over time or in any broad fashion.
“For us, this is trying to expand the way Sunlight tracks influence,” Lee said. “We track the way the money flows around Washington and it’s not enough. The ways the system is affected are too subtle and deliberate, so we’re making an investment in tracking not just the flow of money but also the flow of ideas.” And when you’re trying to track something as nebulous as ideas, analyzing a lot of data using cheap compute is perhaps the only way for a non-profit to do it.

Hell Yes! In addition to this, I’d love to see bills interpreted into layman’s terms for the average American.