August scam

I constantly receive scam emails, I even work with them, and I did not expect the latest one to catch my attention.

This is what I thought, until now!!! 😉

Below, you can find a scam email which I recently received: I read it from the beginning to the end, I laughed thoroughly, and I was even tempted to reply asking for more information. Well, this is what I got: a request for scientific collaboration on… unknown (until now), invisible, probably paranormal creatures, which our great Aircraft Engineer discovered by chance because their image was captured by his phone camera.

Not much else to say, I’ll leave you to read it! 🙂


First of all I must apologize for sending this email to you. My name is Mr. Abdul Rahman and I am an Aircraft Engineer. I got your email from the internet and I am truly sorry if my email is bothering you. I have been trying to find a scientist that can help me in my research to the creatures that I have discovered. Actually I don’t know who to approach to help me with this discovery. I believe may be the biologist, physicist or any scientist may be able to help. Scientists who are interested in the discovery of new creatures. The images of the creatures I took cannot be a Pareidolia because of different background colour and shape. Pareidolia only supported if either the colour or shape of the background is almost the same as the mysterious image i.e images in cloud or trees.

I am suggesting scientists to carry out a research to invent a proper camera which can capture the image of these unseen creatures. I don’t believe in fairy tales and I only believe in real creatures, otherwise I would have sought help from ghost hunters or paranormal mediums. No, I am more interested to find fact, thus the only people who I believe can help me are the scientists.

My hand phone ability to capture these paranormal images which I have discovered accidentally is a crude way of carrying out the task. Only with a properly designed camera can the images be taken clearly.

As for the origin of these creatures, I have no idea. I don’t know what they are made of. My theory is that these creatures may have the ability to control humans mind causing hysteria, possession, suicide and mass killing and so on. Most scientist believe these behaviors were caused by chemical reaction in the brain. But then  something must have triggered it. Due to stress? It could be true, but how come most very stressed people are not affected with possession and hysteria? I need scientist to help me with the research and all I know is that these creatures are what people have identified as ghosts or demons.

I feel that scientists should look into these creatures more seriously as their existence can be seen by a camera, just like germs can be seen with a microscope. What if one day these creatures manage to possess a million humans at a same time? Do not think this is a hoax, it is not. I know it sounds silly but please don’t ignore me. I have tried contacting every universities but none has responded positively. Some even insulted me and accusing me of spamming. I am not advertising or selling a product here, I am seeking help to find a scientist who is willing to assist me. Please read my website and what I am telling is all true and I need scientist to verify it. I even offer US$10,000 to anyone who can proof the pictures I took are fake.

I am not trying to promote a religion here. What I believe is, good Christians, Jews and Muslims will go to heaven. So this is nothing about religions here which I want to point out. I am just trying to share important information about a species of creatures that cannot be seen by human eyes which most scientists believe do not exist. Hopefully we all can find the truth about these creatures. Below is the website. Thank you for willing to read my email.

Best Regards

Abdul Rahman
Brunei Darussalam
Tel: 673-8-725144

ICANN and new TLDs

I finally decided not to write about ICANN’s latest decision to liberalise the “market” for generic TLDs. Instead, I will point you to Nominet’s company blog: Nominet is the .uk registry and this post offers a very interesting insight into ICANN’s process and decision.

Web 2.0 and Databases, can the two worlds meet?

A few weeks ago, I had an interesting conversation with Paolo on why web 2.0 tools are still struggling to find their way into the academic world. Back in September last year I attended the panel What Web 2.0 Has To Do With Databases?, which investigated why the database community has been left behind in web 2.0 research.

Following Paolo’s suggestion, I am posting the notes I took at the time. While I am well aware that the two topics are different, I think they are related, because the people who consider blogs, wikis, etc. a “waste of time” are also the ones missing the opportunity to do research in such an interesting field.

  • Sihem Amer-Yahia (Yahoo!)
  • Alon Halevy (Google)
  • AnHai Doan (University of Wisconsin)
  • Gerhard Weikum (Max-Planck Institute for Informatics, Germany)
  • Gustavo Alonso (ETH, Zurich)

The abstract can be found here.
Here is Alon Halevy’s post on the panel: read, in particular, these two comments (1, 2) which, in my opinion, summarise the situation quite well.
Is the database community ready to accept the new challenges coming from the Web 2.0 world? The risk of “missing the train” is very high, considering that commercial interest in these technologies is leaving academic research behind.


  • Web 2.0 is about people, unstructured data, imprecise queries, information retrieval.
  • Web 2.0 is not about structure and quality.

Unstructured data and applications are pervasive: they are everywhere and companies exploit them heavily, but:

  • A “holistic approach” is lacking (all current solutions are ad-hoc solutions)
  • The “structured methodology”, typical of the database community, should be brought into Web 2.0.

Database people were not fully convinced by Web 2.0 and the two worlds seemed quite distant. In general, they do not believe that databases as we know them (their structure, methodologies, best practices, etc.) will ever lose their centrality in any information management application. In their view, even web 2.0 is only a “cool application” that will eventually be replaced by something else, whereas databases will still be in place.

This is quite a conservative point of view, and even those who say that “traditional DBMSs are dead” (Michael Stonebraker among others, but he’s not the only one) seem, in practice, to be a bit sceptical about databases losing their centrality.

Everybody seemed to agree that tight schema integration is a buzzword that does not work in the real world, despite the fact that it has been studied for several years in both industry and academia.

Web 2.0 seems a good compromise for achieving “real” integration, though this happens at the data level (and should probably be called “data reconciliation” instead). From the schema point of view, someone argued that real integration is not possible because there are no strong stakeholders demanding it (these will be neither the people on the street nor Google or Yahoo).

Google pushes forward the concept of a dataspace (btw, Halevy’s dataspace) that includes all users’ data. The physical system is left in the background, almost a legacy from the past: data matters, databases are needed for storage, reliability, etc. (are we talking about cloud computing?).

Someone’s comment: companies are keen on groups that do research on Web 2.0 and even encourage them to do it. However, Web 2.0 is about people and data: if the big companies do not release the data they have, how can the DB community do research on it (and what should they analyse?)?

The two worlds seemed very distant, and the main reason probably lies in their different backgrounds: databases are about structure, methodology and algorithms. Web 2.0 is based on randomness (well, some form of it), no predefined schema and, above all, unpredictable social interactions that are kept away from databases. It is no surprise that communication between the two is particularly difficult.

Italian TLD and malicious web sites

Mapping the Mal Web, Revisited (McAfee, June 4).

McAfee has just released a new security report on the spread of malicious web sites among different TLDs. Very informative and detailed, the report supplements last year’s report. Some of the key findings:

  • .ro (Romania) and .ru (Russia) are the riskiest European TLDs, i.e., the probability of finding a malicious web site is higher when surfing one of those TLDs.
  • The risk related to .biz (business) and .cn (China) is also increasing (compared to last year)
  • .it (Italy) has worsened, but is still “a safe place”
  • .hk (Hong Kong) is the riskiest TLD overall

The “Hong Kong” case, in particular, deserves closer attention:

Bonnie Chun, an official [from the .hk] TLD, acknowledged that they had made some decisions that inadvertently encouraged the scammers:
1. “We enhanced our domain registration online process thus making it more user-friendly. Instances include the capability for registering several domains at one time, auto-copying of administrative contact to technical contact and billing contact, etc. Phishers usually registered eight or more domains at one time.
2. We offered great domain registration discounts, such as buy-one, get-two domains.
3. Our overseas service partners promoted .hk domains in overseas markets.”

In a previous post I talked about the recent increase in phishing activity in the .uk registry, which, in that particular case, took advantage of Nominet’s automatic registration process.

Another country, another problem: the .it registry will implement automatic registration procedures by the end of the year, and I read a couple of weeks ago on Swartzy’s blog that the IIT/CNR is also launching an advertising campaign for .it domains.

I am curious to see whether, as happened in Hong Kong, we will see an increase in malicious activity in the .it TLD.

DNS Ops Workshop

As promised, here is a report on the DNS Ops workshop I attended last week. The workshop was very interesting, though a few talks were a bit too technical for me, as I only have a partial knowledge of DNS operations. What follows, then, is a non-comprehensive list of “impressions” rather than a detailed report.

A Statistical Approach to Typosquatting
Of course 😉 I will start with my own talk, which reports the preliminary results of the research on typosquatting I have been conducting recently. The slides can be found here (and here as well, as I gave the same talk at the Centr technical meeting in May).

The talk seems to have generated a bit of interest in the audience, though I think it suffered from the fact that these are “early results” and much work still needs to be done before we can claim we really understand what typosquatting is (at least from a technical point of view). The talk also raised a few questions about Nominet’s involvement in typosquatting. Just to be clear, at the moment Nominet is interested in my work only from a research point of view and is not taking any position in favour of or against any registrar, registrant or any other party that might consider itself the subject of my work.

DNS monitoring, use and misuse
According to Sebastian Castro (CAIDA), in 2007 only 510 unique IP addresses generated 30% of the traffic at the root servers and 144 of them (called Heavy Hitters) sent more than 10 queries/sec and in 11 cases more than 40 queries/sec.

These are impressive numbers, which might tell us something about the kind of traffic that flows across the Internet every day.
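To make the “Heavy Hitters” notion concrete, here is a minimal sketch of how one might flag sources whose average rate exceeds 10 queries/sec. It is not CAIDA’s method: the function names, the list-of-source-addresses input and the sample data are all assumptions made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def query_rates(sources: Iterable[str], duration_s: float) -> Dict[str, float]:
    """Average queries per second for each source address (hypothetical helper)."""
    counts = Counter(sources)
    return {src: n / duration_s for src, n in counts.items()}


def heavy_hitters(rates: Dict[str, float], min_qps: float = 10.0) -> List[str]:
    """Sources whose average rate is above min_qps (10 queries/sec in the talk)."""
    return sorted(src for src, qps in rates.items() if qps > min_qps)


# Made-up sample: one source address per query seen in a 2-second window.
sample = ["192.0.2.1"] * 50 + ["192.0.2.2"] * 5
rates = query_rates(sample, duration_s=2.0)
print(heavy_hitters(rates))
```

A real analysis would of course work on packet captures and much longer time windows; the sketch only shows the counting step.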

Later on, Shintaro Nakagami from NTT Communications, one of the major ISPs in Japan, reported that only 15% of the queries hitting their name servers were legitimate. This doesn’t mean that the others are necessarily malicious: many of them, for example, are simply malformed queries or are generated by misconfigured web servers, however…

Finally, Young Sun La (NIDA, Korea) showed an impressive tool that they use at NIDA for monitoring queries to the .kr name servers in real time. It even sends SMS messages to sysadmins if an urgent problem arises. Have a look at the slides for an idea of how it works. I think I heard that the software will be released for download, but I might have misunderstood.

How do you conveniently represent the IPv4 space? With a Hilbert Curve, for example, or, as Roy Arends (Nominet) suggests, with a Z-order curve. The resulting graph is more intuitive to read and can easily be extended to work in a 3D space.

Check out his interactive tool (on the Nominet website) and his slides. In particular, go to slide number 9 and watch the heatmap of… women below 30 and earning more than $100,000/year in Manhattan!!
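For the curious, the Z-order idea can be sketched in a few lines: treat the IPv4 address as a 32-bit number and de-interleave its bits into two 16-bit coordinates, so that numerically close addresses tend to land close together on a 65536 × 65536 grid. This is only an illustration under my own assumptions (function names, and which bits go to x and which to y), not the code behind Roy’s tool.

```python
import ipaddress
from typing import Tuple


def deinterleave(value: int) -> Tuple[int, int]:
    """Split a 32-bit value into (x, y): even bits go to x, odd bits to y."""
    x = y = 0
    for i in range(16):
        x |= ((value >> (2 * i)) & 1) << i
        y |= ((value >> (2 * i + 1)) & 1) << i
    return x, y


def ip_to_xy(addr: str) -> Tuple[int, int]:
    """Map a dotted-quad IPv4 address to a point on the Z-order (Morton) curve."""
    return deinterleave(int(ipaddress.IPv4Address(addr)))


# The first four addresses trace the characteristic "Z" shape.
for a in ("0.0.0.0", "0.0.0.1", "0.0.0.2", "0.0.0.3"):
    print(a, ip_to_xy(a))
```

Adding a third coordinate works the same way, by taking every third bit, which is why the representation extends naturally to 3D.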

Privacy issues in DNS
Karsten Nohl (University of Virginia) talked about the privacy issues related to the use of DNS caches. When users query the DNS they leave pieces of information in many caches and they have to trust several entities (ISPs, registries, backbone operators, etc.) not to release or sell that information.

DNS operators cache the results of user queries, i.e., the IP addresses corresponding to certain domain names, in order to retrieve them more efficiently. This information is anonymous, i.e., they do not record the IP address that made the query (in theory), but in practice certain names are looked up only by one person (or a small group of people). At present, it is relatively easy for a malicious party to trace the online behaviour of some users by querying specific DNS servers and checking whether a specific name is present in their cache.

Such an attack can be used to identify the individuals who access a specific web site: knowing the IP address reveals a user’s geographic location, but knowing his/her online behaviour might disclose much more personal information. Alternatively, it might be possible to track a specific user.
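How does one check what is in a cache? The talk did not go into this level of detail as far as my notes go, so take the following as an assumption on my part: a commonly described approach is to send a query with the “recursion desired” (RD) flag cleared, so that the server can only answer from what it already holds. The sketch below just builds such a query packet (it does not send anything); the function name and the query ID are made up for illustration.

```python
import struct


def build_nonrecursive_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query for an A record with the RD flag cleared."""
    flags = 0x0000  # standard query; the RD bit (0x0100) is deliberately not set
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length, then a zero byte.
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


packet = build_nonrecursive_query("example.com")
print(len(packet), packet.hex())
```

Many resolvers nowadays refuse or restrict such queries from outside their own network, which is one of the obvious countermeasures.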

This scenario might become even more critical with the large-scale deployment of RFIDs. RFIDs have unique identifiers but are too small to store information (e.g., product information, price, etc.), so they will use the DNS to look up this data. As a result, those unique identifiers will be indexed by the DNS and it will be easy to identify single users.