For example, on November 7, John went to Yogurtland to get frozen yogurt with his girlfriend; he made a post on Twitter, added a ‘froyo’ emoji, and tagged his location. On November 9, he went to Foot Locker and bought a pair of Chuck Taylor All-Stars; he was so excited about his new shoes that he wore them out of the store, and posted a geo-tagged photo to Instagram. With just these two data points — fixing John to two moments in space and time, the yogurt shop (11/7) and the shoe store (11/9) — the analysts are able to ‘narrow down’ the dataset: they find that there is one and only one person in the entire set who went to these two places on these two days. Once John is ‘de-anonymized,’ the analysts have in their hands a detailed profile of his entire spending history.
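The narrowing-down step can be illustrated with a minimal sketch. It assumes the anonymized dataset is a list of transactions keyed by a pseudonymous ID; the record layout, the IDs, the merchants other than the two named above, and the amounts are all invented for illustration and are not taken from any real dataset.

```python
from collections import defaultdict

# Hypothetical anonymized log: (pseudonymous id, date, merchant, amount).
# All values here are made up for the example.
TRANSACTIONS = [
    ("user_a91", "11/7", "Yogurtland", 6.50),
    ("user_a91", "11/9", "Foot Locker", 54.99),
    ("user_a91", "11/12", "Pharmacy", 23.10),
    ("user_c07", "11/7", "Yogurtland", 4.25),
    ("user_c07", "11/9", "Bookstore", 18.00),
    ("user_f33", "11/9", "Foot Locker", 89.00),
    ("user_f33", "11/10", "Grocery", 41.77),
]


def reidentify(transactions, known_points):
    """Return the ids whose records contain every known (date, merchant) pair."""
    visits = defaultdict(set)
    for uid, date, merchant, _amount in transactions:
        visits[uid].add((date, merchant))
    # Set inclusion: keep only ids that were seen at all known points.
    return [uid for uid, seen in visits.items() if known_points <= seen]


def spending_history(transactions, uid):
    """Return every (date, merchant, amount) record belonging to one id."""
    return [record[1:] for record in transactions if record[0] == uid]


# The two public data points gathered from the social media posts.
known = {("11/7", "Yogurtland"), ("11/9", "Foot Locker")}
matches = reidentify(TRANSACTIONS, known)

if len(matches) == 1:
    # Exactly one candidate remains, so the pseudonym is linked to John.
    for record in spending_history(TRANSACTIONS, matches[0]):
        print(record)
```

In this toy data, two other IDs each share one of the two points with John, but only one ID matches both, which is why a second data point is enough to single him out.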


The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The online version of the book is now complete and will remain available online for free. The print version will be available for sale soon.

scikit-feature is an open-source feature selection repository in Python, developed by the Data Mining and Machine Learning Lab at Arizona State University. It is built on the widely used machine learning package scikit-learn and on two scientific computing packages, NumPy and SciPy. scikit-feature contains around 40 popular feature selection algorithms, including traditional feature selection algorithms as well as some structural and streaming feature selection algorithms.