I use Telegram. Many use WhatsApp. Others may use other messaging apps.
I’ve heard debates about whether Telegram’s security is really superior to WhatsApp’s, and similar arguments, for a long time. Recently, though, a comment about “metadata” in one of these discussions caught my interest. It pointed out that Telegram, which claims to provide very good security to its users, leaks a certain amount of metadata.
Metadata is data that describes other data. Meta is a prefix that in most information technology usages means “an underlying definition or description.” Metadata summarizes basic information about data, which can make finding and working with particular instances of data easier.
Actually, the same argument stands for WhatsApp, but this got my attention and I wanted to look for some interesting implications. Since Telegram is mostly open source and pretty open, there are a number of client implementations, and we often have their source code to examine and play with.
Checking out this Telegram command line client, it was immediately clear to me that one important piece of metadata that is actually broadcast to all users is the user status.
If you are a Telegram (or WhatsApp) user, whenever you switch to the app, your user status changes to ‘online’. This information is broadcast to all the users who have your number in their contact list, unless you changed the related setting in the app’s preferences. The same happens when you close or background the app: your user status changes to ‘offline’.
Even if the official apps don’t let you do that, one of your contacts could certainly write code to monitor the status of every user at regular intervals, building a history of the statuses you’ve been through during the day and collecting a worrying amount of information about you and your habits. Given the number of messages we’re flooded with in these apps (think about group chats), if someone collects this kind of data about a person on their primary messaging app, it’s pretty trivial to derive sensitive information such as sleep cycles or working hours (not for everyone, but still).
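To give an idea of how little it takes, here is a minimal Python sketch of the analysis step only. It assumes you already have a log of (timestamp, status) pairs for one contact; the function names and the sample events are made up for illustration and are not taken from any Telegram client. It turns the log into online sessions and picks the longest offline gap, which for many people is a decent guess at when they sleep.

```python
from datetime import datetime

def build_sessions(events):
    """Turn (timestamp, status) events into a list of (start, end) online intervals."""
    sessions = []
    online_since = None
    for ts, status in sorted(events):
        if status == "online" and online_since is None:
            online_since = ts
        elif status == "offline" and online_since is not None:
            sessions.append((online_since, ts))
            online_since = None
    return sessions

def longest_offline_gap(sessions):
    """Return (start, end) of the longest gap between consecutive sessions, or None."""
    gaps = [(prev_end, next_start)
            for (_, prev_end), (next_start, _) in zip(sessions, sessions[1:])]
    return max(gaps, key=lambda gap: gap[1] - gap[0], default=None)

# Hypothetical status log for a single contact (not real data).
events = [
    (datetime(2015, 1, 1, 22, 50), "online"),
    (datetime(2015, 1, 1, 23, 10), "offline"),
    (datetime(2015, 1, 2, 7, 5), "online"),
    (datetime(2015, 1, 2, 7, 20), "offline"),
    (datetime(2015, 1, 2, 8, 30), "online"),
    (datetime(2015, 1, 2, 8, 32), "offline"),
]

sessions = build_sessions(events)
gap = longest_offline_gap(sessions)
print("Longest offline gap:", gap[0], "->", gap[1])
```

With this made-up log, the longest gap runs from 23:10 to 07:05 the next morning. A single night proves little, but the same gap showing up day after day is exactly the kind of habit this data gives away.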
It gets worse: if two people who are texting each other are both on your contact list, you could probably use the data to work out who is texting whom. This is a bit trickier to automate, but a finely tuned machine learning algorithm should do it (of course there may be problems with simultaneous conversations and group chats, but it would definitely work at least in some cases).
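Before reaching for machine learning, a much cruder heuristic already hints at what is possible: measure how much the online intervals of two contacts overlap. The Python sketch below is only an illustration under simplifying assumptions. Sessions are (start, end) pairs in seconds, the sessions of one user never overlap each other, and the names and numbers are invented. It is not the algorithm mentioned above, just a simple stand-in for it.

```python
def overlap_seconds(a, b):
    """Total time during which both users were online at once."""
    total = 0
    for a_start, a_end in a:
        for b_start, b_end in b:
            start = max(a_start, b_start)
            end = min(a_end, b_end)
            if end > start:
                total += end - start
    return total

def overlap_ratio(a, b):
    """Shared online time divided by the time at least one user was online (0.0 to 1.0)."""
    shared = overlap_seconds(a, b)
    total_a = sum(end - start for start, end in a)
    total_b = sum(end - start for start, end in b)
    union = total_a + total_b - shared
    return shared / union if union else 0.0

# Hypothetical online sessions, in seconds since the start of the observation.
alice = [(0, 600), (3600, 4200)]
bob = [(30, 620), (3590, 4100)]
carol = [(10000, 10300)]

print("alice/bob:  ", round(overlap_ratio(alice, bob), 2))
print("alice/carol:", round(overlap_ratio(alice, carol), 2))
```

A high ratio between two contacts over many days suggests they are often online at the same moments, which is consistent with them talking to each other but does not prove it. Group chats and simultaneous conversations would blur the signal, as noted above.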
This isn’t the end of the world, but it is a good insight into a really common security problem that people using these apps daily are probably completely unaware of.
PS. Of course I have modified the Telegram web app to do this, and even though I’ve spent very little time on it, it actually works. Here’s a screenshot.
Code is on GitHub.