How-to
- Learn to create R packages in an hour, and its corresponding Reddit discussion.
- Modern Pandas, a tutorial on writing idiomatic pandas code.
Tech
- A guy just transcribed 30 years of for-rent ads. Here's what it taught us about housing prices.
- Online tracking: A 1-million-site measurement and analysis (Hacker News Discussion).
During our January 2016 measurement of the top 1 million sites, our tool made over 90 million requests, assembling the largest dataset (to our knowledge) used for studying web tracking. With this scale we can answer many web tracking questions: Who are the largest trackers? Which sites embed the largest number of trackers? Which tracking technologies are used, and who is using them?
Privacy
- Academic paper: Evaluating the privacy properties of telephone metadata. Remember how Snowden kept repeating that metadata is all the government needs to properly track you? Here's the same conclusion, from the academic world:
Since 2013, a stream of disclosures has prompted reconsideration of surveillance law and policy. One of the most controversial principles, both in the United States and abroad, is that communications metadata receives substantially less protection than communications content. Several nations currently collect telephone metadata in bulk, including on their own citizens. In this paper, we attempt to shed light on the privacy properties of telephone metadata. Using a crowdsourcing methodology, we demonstrate that telephone metadata is densely interconnected, can trivially be reidentified, and can be used to draw sensitive inferences.