"The image is information (and) the information is image"
To my grandfather's memory; he was a very elegant pirate. My treasure: the coins he brought home from his travels around the world with his dreams. Ziberna F.
Federico Ziberna and Claudio Cavalera, independent Italian researchers, have conceived and described a completely new kind of privacy breach, based on avatars.
This type of violation may affect most users of the popular instant messaging apps WhatsApp and Viber.
The researcher developed a system that allowed him to freely download an unlimited number of avatars, each linked to a user account on these popular instant messaging systems.
Using the user's avatar as a “search key” (possibly combined with other data automatically extracted from the image by facial recognition algorithms, such as ethnicity, age, gender, etc.), it was possible to compare the avatar with other freely available images on the web or on other accounts in order to find a match.
This makes it possible to connect an unknown person's phone number to a real person, thanks to the avatar.
The hacking system works on multiple levels and is based on a tool that can automatically collect and store an unlimited number of images (avatars) of WhatsApp and Viber users. For demonstration purposes, several million were collected, belonging to as many Italian mobile users registered on these systems. Once collected, the avatars were cataloged and processed with facial recognition algorithms, then compared and analyzed against other data sources on the web.
Among the different types of hack is the so-called "voodoo doll exploit": the attacker takes a photo of any person, and the attack tool checks whether the "doll" matches one of the downloaded avatars and can therefore be traced back to the phone number of the person photographed.
Data collected from research
The experimental phase has been stopped. The data collected up to the time of publication are shown below as orders of magnitude; only numbers with Italian operator prefixes were tested and reported.
200,000,000 Numbers tested
10,000,000 Contacts found
7,000,000 Avatars found
Numbers tested: numbers entered to search for WhatsApp and/or Viber contacts, about 200M. This is the count of phone numbers that were given to the apps to see whether they were registered on the apps' services.
WhatsApp and/or Viber contacts found: about 10M. This is the total number of contacts (phone numbers) of registered users on WhatsApp + Viber.
Stored avatars: stored user images, approx. 7M. This is the number of images (avatars) indexed in the NowISeeYou archives. It is smaller than the number of contacts found (the previous figure) because some users do not have or use a custom avatar, or the avatar was discarded, or it was not collected for other reasons.
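To put the three figures in proportion, here is a minimal sketch that only re-derives the ratios from the order-of-magnitude numbers quoted above. The variable names are illustrative and nothing here comes from the NowISeeYou code itself.

```python
# Order-of-magnitude figures quoted in the text (Italian prefixes only).
numbers_tested = 200_000_000
contacts_found = 10_000_000
avatars_stored = 7_000_000

# Share of tested numbers that turned out to be registered accounts.
hit_rate = contacts_found / numbers_tested
# Share of registered accounts for which an avatar was stored.
avatar_rate = avatars_stored / contacts_found

print(f"Registered contacts per number tested: {hit_rate:.0%}")
print(f"Stored avatars per registered contact: {avatar_rate:.0%}")
```

In other words, roughly one tested number in twenty was a registered account, and roughly seven accounts in ten yielded a stored avatar.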
NowISeeYou is the first and largest privacy-violation exploit known to date. It was conceived and built by Federico Ziberna, based on an idea developed in collaboration with Claudio Cavalera; both are independent Italian researchers. The exploit potentially identifies an unknown person, recovering their name, surname and other personal data and connecting them to their phone number, using the freely available avatar as a starting point. The mechanism is based on the massive collection of phone numbers and avatars (small images chosen by users on IM apps), and then proceeds as in the following summary diagram.
[Figure: NowISeeYou emulators in the avatar collection phase (the avatars of the numbers found, along with the numbers themselves, are sent to one …)]
[Figure: example of facial analysis performed on an avatar (me, poor me!). Values recovered from the avatar analysis: Caucasian, white; hair (and color); average about 75%.]
Cross social avatar exploit
(Estimated) Danger: ★★★★★ (Medium Low)
It relies on the possibility that a user uses the same avatar(s) on different social networks. The attack is automated and unattended: the attacker's program extracts the avatars from the archives and processes them in sequence. Using one or more reverse image search tools, the avatar (present in the NowISeeYou archives) is searched for on the web, and the results are scanned to see whether the same image is used on and/or linked to other social networks, in particular Facebook or LinkedIn...
It is an extension of the "cross social" attack, but it is not limited to social networks and can be applied when the previous one fails. This attack does not only concern facial images, which are more easily reused on other social platforms: rather, it searches for the image on the web and compares it with compatible results. It can be supervised by the attacker or not. The program simply reports to the attacker (or saves for further processing) a list of possible matches found.
This is the most paradoxical and most dangerous exploit, as it could potentially expose any person using an avatar that contains their face. The name ("voodoo doll") was chosen because it is enough for the attacker to get one or more photos of the person he wants to "stick pins into", perhaps retrieving them anywhere...
But we can hypothesize some other, much more trivial scenarios. The attacker is not a good person. So he decided to invest 1k euros to buy (and/or recover: many, in fact, were second-hand) about 100 SIMs (from some specific countries in the world), with some idea of how to exploit the privacy flaws he was looking into. And so, with a modest expense and a few PCs, he set 100 virtualized devices to work in parallel on a dozen PCs (also virtualized), 24/7, for, let's say, just over 3 months.
Let's do the math on the attacker's behalf: NowISeeYou installed on a single emulated device can check, at steady state, 100k numbers per day. In round figures: say that every single instance of the application ran almost continuously for 100 days (the three months and a bit mentioned above). At the end of the period, a single app will have checked about 100k * 100 days = 10,000k phone numbers (= 10,000,000 numbers). But because the process is parallelized over 100 devices, the total number of verified numbers is the previous figure * 100, that is: 1 billion (1 billion numbers!). Impressive.
100 virtual devices on 10 virtual PCs, 24/7 * 100 days = about 1,000,000,000 numbers tested = about 150,000,000 avatars
Of course, only a small percentage of these were real numbers, and in turn only a percentage of those were linked to an IM tool. Shall we say 15%? Our little monster had thus collected 150M avatars.
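The back-of-the-envelope estimate above can be written out as a few lines of arithmetic. This is only a sketch of the hypothetical scenario: the function name and parameters are illustrative, and the 15% share is the guess made in the text, not a measured value.

```python
def estimate_campaign(numbers_per_device_per_day, days, devices, registered_percent):
    """Return (numbers_tested, accounts_found) for the hypothetical scenario.

    registered_percent is the guessed share (in percent) of tested numbers
    that turn out to be linked to an IM account with an avatar.
    """
    numbers_tested = numbers_per_device_per_day * days * devices
    # Integer arithmetic keeps the result exact.
    accounts_found = numbers_tested * registered_percent // 100
    return numbers_tested, accounts_found


# 100k numbers/day per device, 100 days, 100 devices, 15% guessed share.
tested, found = estimate_campaign(100_000, 100, 100, 15)
print(f"{tested:,} numbers tested, about {found:,} avatars")
```

Changing any single input (fewer devices, a shorter run, a lower share) scales the result linearly, which is why the scenario is so cheap to enlarge.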
Yes, but what to do with them? Give the devil time and his tail starts to gleam: he has an idea and he is ready.
Phone numbers "are verified": that is, first, those phone numbers
And this is a first advantage. Then the horns come out: thanks to the avatars, 60-70% of which contain a face, these numbers can hypothetically be cataloged by sex, age and ethnic features. In short, the devil is likely to have about 100M phone numbers, divided by selected group of countries and approximately cataloged by age group, skin color, hair color, sex, etc. Now let's ask a rather rhetorical question: in your opinion, could there ever be someone who wants to buy these verified and cataloged numbers, paying the miserable sum of 1 cent each, and then use them for targeted spam / marketing?
The attacker, against an investment of 1k, might be tempted to make 1M.
UNFORTUNATELY, THERE IS WORSE
1, 2, 3 .. many avatars
The worst is soon told. Most of the work done by our app consisted of putting phone numbers into the phone book and verifying them, and those numbers might not exist at all. NowISeeYou had to work "almost" blind, where "almost" means that the server dynamically monitored the percentage of successes (positive feedback) on sequences of phone numbers generated from a given real seed number. Translated, this means that if, for example, the success percentage over the last 10,000 sequential numbers tested dropped below a certain threshold, NowISeeYou moved to another "quadrant" or seed number and went on to "test" (or "core-sample") other "shifted" sequences, restarting when it found a promising new one.
But the fact remains that, as already said, most of the time and computing resources were wasted (from 1 billion numbers tested, only "150M avatars" were obtained). To extract 7M avatars of Italian users, 200M+ numbers were tried: clearly a waste of time. But beware of the trick: the numbers now in our possession are verified numbers, and they are connected to as many accounts and avatars. What if we then set up the system to make a second round, using only those numbers? What happens is very simple: the entire process takes about a tenth of the time.
But what is the purpose of doing all this? It is soon told: most users change their avatar continually and at regular intervals. NowISeeYou has collected the first avatar; on the next round it checks whether the avatar has changed and saves it, together with the previous one, in the big database. At this point we have a history, albeit partial, of the avatars.
Suppose we resume the search for avatars on the 7M Italian user accounts we have in the database. Counting on our array of devices, the time taken is about one day...
Let's wait a few days and repeat the procedure. The purpose? To collect new avatars of the same users. In the end we have (on average) 2, 3, 4 or 5 different images for each account we hold.
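The "about one day" figure can be sanity-checked with the same rough rates used earlier in the text. This sketch assumes (it is an assumption, not something stated for the Italian data set) that the array is the same 100 emulated devices, each checking about 100k numbers per day; the function name is illustrative.

```python
def days_to_recheck(accounts, devices, numbers_per_device_per_day):
    """Days needed to re-check a list of already verified accounts."""
    daily_capacity = devices * numbers_per_device_per_day
    return accounts / daily_capacity


# 7M verified accounts, assumed array of 100 devices at 100k numbers/day each.
days = days_to_recheck(7_000_000, 100, 100_000)
print(f"Second round over verified accounts: about {days:.1f} days")
```

Under these assumptions a full pass over the verified accounts fits within a day, so repeating it every few days to build an avatar history costs very little.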
Let's take a small step back and then a big step forward.
Now we want to try a "voodoo doll" hack (finding a number in our possession from an arbitrary photo), and we would have a much better chance of success than before. Let's go further and say we have given our software a minimum of intelligence: 1) it takes one phone number and compares all its avatars; 2) it checks whether there are faces in these avatars; 3) it classifies them and verifies whether the data are compatible (that is, whether they portray the same person or a probabilistically compatible person). If so, those images "are" the person.
What to do and small remedies
for the User
I know. It's all great. I really do know, I get it. But it would be better not to use the same images on several social networks. Be original. If you use or publish a photo of yourself on one social network, do not use it on another. Do not use too accurate a photo of yourself as an avatar. The avatar... it is you.
for IM Developers
the history of users
I know it is annoying: someone whose findings force you to change the code. But really, I know, and it is good for everyone to improve. The IM clients are safe: they are built to the highest security standards. But what needs to be done is simple: some checks need to be introduced on the server side to keep track of some basic information, such as the total number of contacts a user has ever entered in their address book (the history of contacts). History teaches. It is not reasonable for a user to have uploaded, say, 30,000 contacts...
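As an illustration of the server-side check suggested above, here is a minimal sketch of a "contact history" monitor. Everything in it is hypothetical: the class name, the method name and the in-memory storage are assumptions made for the example, and the 30,000 default simply echoes the figure mentioned in the text. A real service would need persistent, privacy-preserving storage (for instance hashed numbers or approximate counters).

```python
from collections import defaultdict


class ContactHistoryMonitor:
    """Tracks how many distinct contacts each account has ever submitted."""

    def __init__(self, max_lifetime_contacts=30_000):
        self.max_lifetime_contacts = max_lifetime_contacts
        # account id -> set of every phone number ever synced by that account
        self._seen = defaultdict(set)

    def record_sync(self, account_id, phone_numbers):
        """Record one address-book sync.

        Returns True when the account's lifetime total of distinct contacts
        has reached the threshold and the account deserves a closer look.
        """
        self._seen[account_id].update(phone_numbers)
        return len(self._seen[account_id]) >= self.max_lifetime_contacts


# Tiny threshold for demonstration; the numbers are obviously fake.
monitor = ContactHistoryMonitor(max_lifetime_contacts=5)
first = monitor.record_sync("acct-1", ["+390000000001", "+390000000002", "+390000000003"])
# The client has since deleted the first batch, but the history still counts it.
second = monitor.record_sync("acct-1", ["+390000000004", "+390000000005", "+390000000006"])
print(first, second)
```

The point of keeping a history rather than a snapshot is that a tool which adds numbers, checks them and deletes them never shows a large address book at any one moment, yet its lifetime total keeps growing.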
for APP Developers
"I'm not a robot"
I know: what I am about to propose seems a ridiculous idea, but it is not. Your application should be designed to prevent it from being used by another application. The point is not that it "has to protect its data" (which has been done for a long time, sometimes well and sometimes badly), but, note carefully, that it "must not be used". Saying that "an app should not be able to be used" means that it must not be driven by another application that behaves "as if" it were a human. We are used to CAPTCHAs for web forms. Similar but more sophisticated verification mechanisms need to be implemented in apps. It should not happen that an app is used by another "vampire" app: for example, all gambling platforms are mindful of these things.
NowISeeYou inserts thousands of numbers in the phone book and then acts as a human, scrolling the phone book and clicking on specific items.
In short, we need a new generation of CAPTCHA for mobile apps.