Data Links #18

  • Here is a new data analysis competition, this time from DrivenData: Naive Bees Classifier. From the competition description: Metis wants to know: using images from BeeSpotter can you identify a bee as a honey bee or a bumble bee? These bees have different behaviors and appearances …

more ...


Data Links #16

  • Are you a user of the amazing R caret package? Do you sometimes wonder why, when you use svmRadial or svmLinear (or an SVM with any other kernel), the result changes a bit (or a lot) depending on the logical control variable classProbs? Here is your answer, written by no …

more ...


From wide to long format in one line of R

A Spanish journalist friend told me about the following problem: he had a longitudinal dataset (say, 10 years, one data point per year) for several countries. While gathering the data, he chose a wide structure: a column for the country name and a column for each year of data …

more ...


Data Links #13

  • A new Kaggle competition: Truly native? Dato is sponsoring this competition with the noble goal of making native advertising live up to its name. With a dataset of over 300,000 raw HTML files containing text, links, and downloadable images, they also want to give Kagglers a challenge that encourages …

more ...