experiment, three

New Feeds

16 February 2009 · Leave a Comment

As you requested, we have moved your feeds…

I just updated to the new FeedBurner platform. The new feeds can be found at:

http://feeds2.feedburner.com/EsperimentoTre (esperimento tre)

http://feeds2.feedburner.com/ExperimentThree (experiment, three)


Snow in UK

6 February 2009 · Leave a Comment

We woke up to some snow today… check the image at the top of this blog!


New accounts

15 December 2008 · Leave a Comment

A week-end full of “social”: I now have my own accounts on Last.fm, anobii.com and Facebook.

Last.fm will probably prove to be the most useful. Or maybe the least useless! ;-)


The Sound Diaries Project

3 December 2008 · 4 Comments

It has been a long time since I last wrote here, and a few things have happened. The occasion (or excuse) to start writing again comes from Sound Diaries, a curious project from the Sonic Art Research Unit at Oxford Brookes University:

The Sound Diaries initiative is focused around sound-recordings and sound-texts and the ways in which we can use sound as a document of our lives (from the project’s website).

We hear “sounds” every day, hour and minute, but only seldom do we listen to them. Our mind is full of images and thoughts, but we often lose the memory of the sounds that crossed our lives.

I like this comment, because it explains the motivations behind the idea of a “Sound Diary”:

…If my sounds are taken by you and then remixed to form tracks, I think there is a danger of the sounds becoming completely decontextualised.

The purpose of keeping a sound diary or creating one is to document life in sound…

I think recording your own sounds is almost the most important aspect of developing a sound diary project…

The act of listening to the ever-unfolding soundscape around us … (just imagine all the sounds that are happening in the world right now as I type this!!!) is an essential element within the process of creating a sound diary… (by Felicity)

I did a little experiment with “my” sounds, taken at home, while working. I made a three-minute recording, then edited the file and shrank it to less than a minute. It’s like “listening to yourself from the outside”.

(I would add my soundscape, if only WordPress allowed me to do so)


Goodnight

21 November 2008 · Leave a Comment

This post has been deliberately backdated.


A long day with Unicode

23 September 2008 · 1 Comment

Last week I attended a training course on Unicode by Jukka K. Korpela. It was interesting, though the subject is… “tough”! ;-)

Here are a few (by no means exhaustive) notes I took during the course:

Introduction

Computers just deal with numbers. They store letters and other characters by assigning a number for each one. Before Unicode was invented, there were hundreds of different encoding systems for assigning these numbers. No single encoding could contain enough characters: for example, the European Union alone requires several different encodings to cover all its languages. Even for a single language like English no single encoding was adequate for all the letters, punctuation, and technical symbols in common use. (from What is Unicode)
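The "hundreds of different encoding systems" problem is easy to reproduce. Here is a minimal Python sketch, using only the standard codecs: the same byte value means a different letter depending on which legacy encoding is assumed, while Unicode gives each letter its own number. The choice of byte (0xE8) and of the three codecs is just an illustrative assumption.

```python
# One byte, three legacy encodings, three different letters.
raw = bytes([0xE8])

for codec in ("latin-1", "cp1251", "iso8859_7"):
    # latin-1 is Western European, cp1251 is Cyrillic, iso8859_7 is Greek
    print(codec, "->", raw.decode(codec))

# In Unicode the letter è always has the same number (U+00E8);
# UTF-8 is just one way of storing that number as bytes.
utf8 = "\u00e8".encode("utf-8")
print(utf8)
```

Without knowing the encoding, the byte alone does not say which character was meant.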

Unicode is an international standard that, due to its complexity, is still not fully accepted. However, it is the default in several applications (e.g., XML applications).

Unicode is a “unified coding system” that contains more than 100,000 characters. It is dynamic, because new characters are continually added in order to include “all” possible human characters. From a theoretical point of view, we could say that it tries to preserve cultural diversity while giving a universal interpretation of all human languages (it is arguable whether it succeeds in this: for example, some Chinese characters are still not part of Unicode).

It’s about encoding, not fonts
There is an important difference between a font and the underlying encoding. Unicode is about “encoding characters”: given a sign representing a character in a human language, Unicode identifies it unambiguously. Fonts, on the other hand, supply the glyphs, i.e. the visual representation (the rendering) of those characters.

A font usually supports a small subset of Unicode. Fonts for Western languages, for example, usually do not support Chinese characters.

In general, only characters (written signs) can be encoded, not abstract ideas. This simple concept has been and still is a matter of discussion whenever a new character needs to be included.

About characters

  • Unicode is a character set whose code points fit comfortably in 32 bits (the code space runs from U+0000 to U+10FFFF). Each character has only one encoding, with some exceptions (kept for compatibility with older encoding systems)
  • Some characters can be obtained by composing other characters. The accented letter “è”, for example, can be written as two characters: è = e + a combining grave accent (see here for more details)
  • The name of a character is its identifier: it consists of letters, digits, spaces and hyphens.
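The composition of “è” and the character names can be inspected with Python's standard `unicodedata` module. This is only a small sketch; the helper `describe` is a name made up for this example.

```python
import unicodedata

precomposed = "\u00e8"   # è stored as a single character
combined = "e\u0300"     # e followed by a combining grave accent

def describe(text):
    """Return (code point, name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

print(describe(precomposed))
print(describe(combined))

# The two strings render the same but are not equal as sequences...
print(precomposed == combined)
# ...until one is composed (NFC) or the other decomposed (NFD).
print(unicodedata.normalize("NFC", combined) == precomposed)
print(unicodedata.normalize("NFD", precomposed) == combined)
```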

A few definitions:

  • Code point. A value in the Unicode code space (U+0000 to U+10FFFF). Each character has a code point, but not all code points are assigned to characters. This is the numeric representation (usually written in hexadecimal) of a character.
  • Blocks. Blocks are groups of characters. The assignment of characters to blocks, however, seems a bit confusing: for example, there is a block called “Greek and Coptic”, but it does not include all Greek characters.
  • Categories. Each character has a set of properties which can be used for classification. For example, the letter category covers anything used to write words in any language. There are properties which identify the script a character belongs to. There is a math symbols category.
  • Normalisation. A technique for translating a complex character into two or more simpler characters.
    • In western languages it is usually used to remove diacritics (accents, etc.) by substituting them with apostrophes
    • Used for compatibility purposes (e.g., to translate to ASCII)
    • It might create problems if the process has to be reversed
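The definitions above can be tried out in a few lines of Python. This is a sketch under one assumption: it shows a common variant of diacritic removal that simply drops the accents after decomposition (rather than replacing them with apostrophes), and `strip_diacritics` is a hypothetical helper name, not a standard function.

```python
import unicodedata

def strip_diacritics(text):
    """Decompose text (NFD), then drop the combining marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

# Code point, general category and name of a letter, a math symbol and a digit.
for ch in ("\u00e8", "\u2211", "5"):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

print(strip_diacritics("perch\u00e9"))

# Two different Italian words collapse into the same ASCII string,
# so the original text cannot be recovered afterwards.
print(strip_diacritics("per\u00f2") == strip_diacritics("pero"))
```

The last line is the reversibility problem in miniature: once “però” and “pero” both become “pero”, nothing in the result says which one it was.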

Unicode in real life and a bit of IDNs

Unicode can cause a lot of confusion, and extra care is needed when dealing with it:

  • ASCII punctuation is different from Unicode punctuation, for example in the case of quotation marks: the ASCII " and ' versus the typographic “ ” and ‘ ’. To many, however, the difference is not clear
  • Certain characters are repeated in different scripts
    • The Latin character A and the Cyrillic character А, for example, look the same but are distinct characters.
    • Different sets of digits are present in different scripts
  • Sometimes the same punctuation characters can be found in different scripts with different logical meanings (this is the case with math symbols)
  • Compatibility characters: they are used to make Unicode compatible with older encodings. It is a very vague concept that can easily cause confusion. For example, K (the Kelvin sign, U+212A) is a different character from K (the letter, U+004B), but the two are identical in appearance.
  • To make things worse, characters do not have a property that identifies compatibility characters. People “know” which they are only from reading the big books containing the standards.
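A short Python sketch makes these look-alikes visible by printing the character names. The three pairs are examples picked for this illustration; they are not taken from the course material.

```python
import unicodedata

pairs = [
    ("A", "\u0410"),   # Latin A and its Cyrillic look-alike
    ("K", "\u212a"),   # the letter K and the Kelvin sign
    ("5", "\u0665"),   # the same digit value in two scripts
]

for left, right in pairs:
    print(unicodedata.name(left), "|", unicodedata.name(right),
          "| equal:", left == right)

# Both characters in the last pair carry the digit value 5.
print(unicodedata.digit("\u0665"))

# Normalisation folds the Kelvin sign into the plain letter K,
# but it leaves the Cyrillic letter alone: that pair stays distinct.
print(unicodedata.normalize("NFKC", "\u212a") == "K")
print(unicodedata.normalize("NFKC", "\u0410") == "A")
```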

We briefly discussed the problem of Internationalised Domain Names (IDNs), which open the door to typosquatting and phishing. One policy might be to disallow mixing different scripts when registering IDNs. However, in certain languages it is common practice to use characters or words from the Latin alphabet as part of a sentence, and such a policy would be a major limitation.

A partial solution, which might work for the most common cases, is to allow mixing any script with the “common” Latin script.
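The two policies can be sketched in Python as follows. This is a rough illustration, not how any registry actually does it: the standard library does not expose the Unicode Script property, so the sketch ASSUMES that the first word of a character's name (“LATIN”, “CYRILLIC”, “CJK”, ...) is a good enough stand-in for its script. The function names `scripts_in` and `allowed` are made up for this example.

```python
import unicodedata

def scripts_in(label):
    """Approximate the set of scripts used by the letters of a label.

    ASSUMPTION: the first word of the Unicode character name is used
    as the script; this is a heuristic, not the real Script property.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }

def allowed(label, allow_latin_mix=False):
    """Strict policy: one script only. Relaxed policy: one script plus Latin."""
    scripts = scripts_in(label)
    if allow_latin_mix:
        scripts.discard("LATIN")
    return len(scripts) <= 1

spoof = "p\u0430ypal"           # the second letter is a Cyrillic а
mixed = "sony\u65e5\u672c"      # a Latin word followed by two CJK characters

for label in ("paypal", spoof, mixed):
    print(sorted(scripts_in(label)), allowed(label), allowed(label, True))
```

The relaxed policy accepts the legitimate Latin plus CJK label, but it also accepts the Latin plus Cyrillic spoof, which is why it can only be a partial solution.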


August scam

1 September 2008 · 4 Comments

I constantly receive scam email, I even work with it, so the latest scam email is hardly going to catch my attention.

This is what I thought, until now!!! ;-)

Below, you can find a scam email which I recently received: I read it from beginning to end, I laughed heartily, and I was even tempted to reply asking for more information. Well, this is what I got: a request for scientific collaboration on… unknown (until now), invisible, probably paranormal creatures, which our great Aircraft Engineer discovered by chance because their image was captured by his phone camera.

Not much else to say, I’ll leave you to read it! :-)

Hi,

First of all I must apologize for sending this email to you. My name is Mr. Abdul Rahman and I am an Aircraft Engineer. I got your email from the internet and I am truly sorry if my email is bothering you. I have been trying to find a scientist that can help me in my research to the creatures that I have discovered. Actually I don’t know who to approach to help me with this discovery. I believe may be the biologist, physicist or any scientist may be able to help. Scientists who are interested in the discovery of new creatures. The images of the creatures I took cannot be a Pareidolia because of different background colour and shape. Pareidolia only supported if either the colour or shape of the background is almost the same as the mysterious image i.e images in cloud or trees.

I am suggesting scientists to carry out a research to invent a proper camera which can capture the image of these unseen creatures. I don’t believe in fairy tales and I only believe in real creatures, otherwise I would have sought help from ghost hunters or paranormal mediums. No, I am more interested to find fact, thus the only people who I believe can help me are the scientists.

My hand phone ability to capture these paranormal images which I have discovered accidentally is a crude way of carrying out the task. Only with a properly designed camera can the images be taken clearly.

As for the origin of these creatures, I have no idea. I don’t know what they are made of. My theory is that these creatures may have the ability to control humans mind causing hysteria, possession, suicide and mass killing and so on. Most scientist believe these behaviors were caused by chemical reaction in the brain. But then  something must have triggered it. Due to stress? It could be true, but how come most very stressed people are not affected with possession and hysteria? I need scientist to help me with the research and all I know is that these creatures are what people have identified as ghosts or demons.

I feel that scientists should look into these creatures more seriously as their existence can be seen by a camera, just like germs can be seen with a microscope. What if one day these creatures manage to possess a million humans at a same time? Do not think this is a hoax, it is not. I know it sounds silly but please don’t ignore me. I have tried contacting every universities but none has responded positively. Some even insulted me and accusing me of spamming. I am not advertising or selling a product here, I am seeking help to find a scientist who is willing to assist me. Please read my website and what I am telling is all true and I need scientist to verify it. I even offer US$10,000 to anyone who can proof the pictures I took are fake.

I am not trying to promote a religion here. What I believe is, good Christians, Jews and Muslims will go to heaven. So this is nothing about religions here which I want to point out. I am just trying to share important information about a species of creatures that cannot be seen by human eyes which most scientists believe do not exist. Hopefully we all can find the truth about these creatures. Below is the website. Thank you for willing to read my email.

Best Regards

Abdul Rahman
Brunei Darussalam
Tel: 673-8-725144

http://abdulrahman180.googlepages.com/abdulrahman180


ICANN and new TLDs

7 July 2008 · Leave a Comment

I have finally decided not to write about ICANN’s latest decision to liberalise the “market” of generic TLDs. Instead, I will point you to Nominet’s company blog: Nominet is the .uk registry, and this post offers a very interesting insight into ICANN’s process and decision.


Web 2.0 and Databases, can the two worlds meet?

2 July 2008 · Leave a Comment

A few weeks ago, I had an interesting conversation with Paolo on why Web 2.0 tools are still struggling to find their way into the academic world. Back in September last year I attended the panel What Web 2.0 Has To Do With Databases?, which investigated why the database community has been left behind in Web 2.0 research.

Following Paolo’s suggestion, I am posting the notes I took at the time. While bearing in mind that the two topics are different, I think they are somehow related, because the people who consider blogs, wikis, etc. a “waste of time” are also the ones missing the opportunity to do research in such an interesting field.
***
Panellists:

  • Sihem Amer-Yahia (Yahoo!)
  • Alon Halevy (Google)
  • AnHai Doan (University of Wisconsin)
  • Gerhard Weikum (Max-Planck Institute for Informatics, Germany)
  • Gustavo Alonso (ETH, Zurich)

Abstract can be found here.
Here is Alon Halevy’s post on the panel: read, in particular, these two comments (1, 2) which, in my opinion, summarise the situation quite well.
***
PROBLEM
Is the database community ready to accept the new challenges that are coming from the Web 2.0 world? The risk of “missing the train” is very high, considering that commercial interest in these technologies is leaving academic research behind.

INTRODUCTION

  • Web 2.0 is about people, unstructured data, imprecise queries, information retrieval.
  • Web 2.0 is not about structure and quality.

Unstructured data and applications are pervasive: they are everywhere and companies exploit them heavily, but:

  • A “holistic approach” is lacking (all current solutions are ad-hoc solutions)
  • The “structured methodology”, typical of the database community, should be brought into the Web 2.0.

WEB 2.0 IS FASHION, DBMS’ RULE
Database people were not fully convinced by Web 2.0, and the two worlds seemed quite distant. In general, they do not believe that databases as we know them (their structure, methodologies, best practices, etc.) will ever lose their centrality in any information management application. To them, Web 2.0 is only a “cool application” that will eventually be replaced by something else, whereas databases will still be in place.

This is quite a conservative point of view, and even those who say that “traditional DBMS’ are dead” (Michael Stonebraker among others, but he’s not the only one) seem, in practice, to be a bit sceptical about databases losing their centrality.

SCHEMA INTEGRATION FAILED, WEB 2.0 MIGHT BE THE ALTERNATIVE
Everybody seemed to agree that tight schema integration is a buzzword that does not work in the real world, despite having been studied for several years in both industry and academia.

Web 2.0 seems a good compromise for achieving “real” integration, though this happens at the data level (and should probably be called “data reconciliation” instead). From the schema point of view, someone argued that real integration is not possible because there are no strong stakeholders demanding it (these will be neither the people on the street nor Google or Yahoo).

Google pushes forward the concept of a dataspace (btw, Halevy’s dataspace) that includes all users’ data. The physical system is left in the background, almost a legacy from the past: data matters, databases are needed for storage, reliability, etc. (are we talking about cloud computing?).

OTHER COMMENTS
Someone’s comment: companies are keen on groups that do research on Web 2.0 and even encourage them to do it. However, Web 2.0 is about people and data: if the big companies do not release the data they have, how can the DB community do research on it (and what should they analyse)?

***
SUMMARY
The two worlds seemed very distant, and the main reason probably lies in their different backgrounds: databases are about structure, methodology and algorithms. Web 2.0 is based on randomness (well, some form of it), no predefined schema and, above all, unpredictable social interactions that are kept away from databases. It is no surprise that communication between the two is particularly difficult.


Italian TLD and malicious web sites

12 June 2008 · Leave a Comment

Mapping the Mal Web, Revisited (McAfee, June 4).

A new security report from McAfee has just been released on the spread of malicious web sites across different TLDs. Very informative and detailed, the report complements last year’s report. Some of the key findings:

  • .ro (Romania) and .ru (Russia) are the riskiest European TLDs, i.e., the probability of finding a malicious web site is higher when surfing one of those TLDs.
  • The risk related to .biz (business) and .cn (China) is also increasing (compared to last year)
  • .it (Italy) has worsened, but is still “a safe place”
  • .hk (Hong Kong) is the riskiest TLD overall

The “Hong Kong” case, in particular, deserves closer attention:

Bonnie Chun, an official [from the .hk] TLD, acknowledged that they had made some decisions that inadvertently encouraged the scammers:
1. “We enhanced our domain registration online process thus making it more user-friendly. Instances include the capability for registering several domains at one time, auto-copying of administrative contact to technical contact and billing contact, etc. Phishers usually registered eight or more domains at one time.
2. We offered great domain registration discounts, such as buy-one, get-two domains.
3. Our overseas service partners promoted .hk domains in overseas markets.”

In a previous post I talked about the recent increase in phishing activity in the .uk registry, which, in that particular case, took advantage of Nominet’s automatic registration process.

Another country, another problem: the .it registry will implement automatic registration procedures by the end of the year; and I read, a couple of weeks ago on Swartzy’s blog, that the IIT/CNR is also launching an advertising campaign for .it domains.

I am curious to see whether, as happened in Hong Kong, we will see an increase in malicious activity in the .it TLD.
