Podcasts I follow

These days I listen to podcasts quite regularly. Here are two that I’ve found particularly interesting:

The Talking Machines. (Link) This is a podcast about machine learning, which I heard of from Clemens. From the few episodes that I’ve listened to so far, their content falls into the following broad categories. (a) Description of a machine learning technique. This part is usually fairly technical. Not that they’ll have anyone recite complex formulas, but following the discussion requires familiarity with machine learning. (b) Personal stories. During this part, a machine learning expert shares details of their own career, e.g., how they chose to do machine learning, what problems they like to work on and why, etc. This is the part that I like the most. (c) General discussion about the field – e.g., the impact of machine learning on society.

More Perfect. (Link) This is a new podcast about the United States Supreme Court. It’s a spinoff from Radiolab. Each episode discusses a case tried before the Supreme Court. Their content is quite accessible and does not require expertise in law. I particularly like how their discussion brings up the individual circumstances and personal aspects of each case, and how it blends these with a discussion about why the case was important for society as a whole.

Besides these two, I listen to the occasional episode by Freakonomics (example episode link), Radiolab (episode link) and sometimes the New Yorker. I’m also eagerly looking forward to Dan Carlin’s next episode. Regarding the latter, here’s another suggestion, if you like history: listen to his series on the Mongols.

A course on modern database systems

This spring, Aris Gionis and I are teaching a new course at Aalto University, on ‘Modern Database Systems’. The course covers advanced topics in data management related to indexing, parallel processing, and memory-efficient data processing. We’ve started by discussing such issues in the context of relational database systems, and we’ll be moving on to semi-structured data and text, as well as big-data settings. For the latter, we’ll essentially be covering data processing with Hadoop and Spark. The course is designed so as to pair discussion on algorithms with hands-on experience with database systems (PostgreSQL for relational databases, MongoDB for semi-structured data, Hadoop and Spark for big-data processing). Below you can find the slides from the first three lectures.

Modern Database Systems – Lecture 00

A nice read on the future of intelligence — and some thoughts

I recently listened to the audiobook of ‘Superintelligence: Paths, Dangers, Strategies’ by N. Bostrom. The book discusses the possibility that machines surpass human intelligence, as well as various questions surrounding the issue. How might machines become smarter than humans? What would that look like? How, if at all, would we still be able to control them? The author does a good job of exploring the different possibilities for what might lie ahead in an objective and thorough manner, and provides a lot of food for thought.

As for more personal impressions, perhaps it’s the fact that I have not read much on the subject before, but by the end of the book I was left with a sense of unease – will humans ever be ready to tackle the challenges of dealing with super-intelligence? It seems that, if super-intelligence becomes reality and stays under human control, different ways of using it and distributing its benefits could have very different effects on the distribution of wealth and happiness among humans. Such a development would therefore be a huge test for human values (e.g., if production of goods can be planned and carried out perfectly by super-intelligent systems, should we distribute wealth equally? or according to each one’s needs? or according to the wishes of the system’s owner? who should be the owner of such systems?), but also for our decision-making institutions (e.g., how would democratic societies make use of super-intelligent systems? would democracies be enhanced with their help or substituted by them?). Perhaps the best way to prepare for it is to work harder on the challenges we face in our current, ‘human’ affairs.

Back from Pisa

Last week I had the opportunity to travel to Pisa and attend the kick-off meeting of SoBigData, a project funded by the Horizon 2020 program of the EU. Aalto participates with two partners in the project – Santo Fortunato from the Complex Networks group and Aris Gionis from the Data Mining Group (to whom I owe my participation).

The consortium consists of many academic partners (from University of Pisa, ETHZ, CNR, TUDelft, Fraunhofer, Sheffield, IMT Lucca, King’s College London, Scuola Normale Superiore — and Aalto). Quite predictably, part of the project will be devoted to research in social data mining and related areas. What’s interesting, however, is that the largest part of the project will be devoted to integrating existing local research infrastructure (e.g. at national level) into a unified European ecosystem. The goal of the project is to build infrastructure to facilitate the sharing of datasets and research findings among European scientists.

Ponte di Mezzo.

Interesting fact about Pisa: with about 90,000 residents, it also hosts about 40,000 students – it’s a big college town.

Smart keyboard

I take it as a given that whatever data I share online will circulate among companies, partners, services, and other entities who take my privacy very seriously, but I’ve also come to expect that this happens in a discreet way that doesn’t make me aware of the fact. Well, that’s not always the case, apparently…

While texting someone a few days ago (June 11), I got this suggestion on my keyboard.


I typed “…leave the conference”.

I’m pretty sure I had not typed the phrase “leave the euro” before on my phone, so I don’t think the prediction was based on my typing patterns. However, I had allowed the keyboard app to parse text from a couple of personal online accounts: one was Gmail and the other one was Evernote, where I had saved a few articles about the euro crisis. So it’s possible that the prediction was based on text contained there.

In any case, I got curious about how the app made that prediction, so I started playing around: for a few days I would type the same text now and then, just to see what the suggestions would be. At first, I would get suggestions like the one below — the keyboard app predicted I wanted to type what I actually typed that first time.


Then on June 15, things got more interesting.

“Leave the… 60”? That didn’t make sense immediately, so I followed the keyboard’s suggestions to complete the sentence. Here is what I got.

As you might imagine, I was surprised and a little shocked that my smart keyboard would think I meant to say such a thing. I searched for the phrase on Google.

The predicted phrase appeared verbatim in the book you see above. Why would the keyboard predict that phrase? Here are three clues that I think are relevant: First, I had bought that book on May 30th and added it to my Goodreads account that same day. Second, I use Facebook to sign in to Goodreads. Third, I also use Facebook to sign in to the keyboard app. So, my guess is that using Facebook for login allowed the keyboard to match me with that book… Or something of that sort. Well, at least it didn’t give away any spoilers.


ICWSM, ICCSS, visit to Inria

In the past few weeks, I had the opportunity to attend a couple of conferences and make a research visit to Inria, Lille.

First, there was ICWSM, in Oxford. Géraud Le Falher presented his paper and Karmen Dykstra presented her poster from her summer internship with the Data Mining group. The conference was different from the ones I’d been to so far, as it was quite interdisciplinary, with many social scientists attending.

The view from the dinner venue at ICCSS.

Then, there was ICCSS, in Helsinki. It was an event that I think impressed attendees with its excellent organisation and great line-up of keynote speakers (for those interested, the videos are online). I found the kind of work presented there very similar to that at ICWSM. (Fun fact: Géraud presented his ICWSM paper again at ICCSS, as did the other presenters in the same session.)

Center of Lille (Grand Place).

And finally, thanks to a kind invitation by Géraud and the MAGNET group at Inria, Lille, I travelled to Lille for a research visit. It was an exciting trip, as the group has an extraordinary team of researchers and it was my first time visiting Inria and presenting some of our recent work there. I’m also hopeful that the visit will lead to published research and more collaborations in the near future.