NOBAL'S ARTICLES: 2010

Magnetic Nanoparticles

When my friend mentioned that he had recently done an experiment with Magnetic Nanoparticles, I was curious about it. He briefly introduced them to me. Then we discussed the Physics behind them. Next, I searched YouTube for uses of Magnetic Nanoparticles and found the following video, which explains how Magnetic Nanoparticles are used to treat cancer. It's very interesting:

Why is democracy black ?

The following video won the first prize in The world democracy video challenge. The video is a creation of Anup Paudel, a Nepali guy. It is awesome! Many many congratulations to Anup!!

A Chinese concept for Non-Stop Trains, wow !

I liked this idea. Watch it carefully.

Basically, the train is made up of the regular train compartment and a ‘boarding/unboarding’ compartment. Passengers board the ‘boarding’ compartment while waiting for the train to arrive at the station. As the train arrives, it slows down and passes beneath the ‘boarding’ platform, latching on and beginning to carry the now boarded compartment. Passengers are then able to go down into the actual train and continue their journey. Moreover, as the train arrives at the station and picks up the new compartment, the compartment from the previous station is unlatched and left at the current station, allowing passengers to board at their own pace and leisure.

Power of punctuation

Punctuation plays a great role in giving meaning to a sentence. Here is an example from Wikipedia:

Sentence1: "woman, without her man, is nothing"
Sentence2: "woman: without her, man is nothing"

Both of these sentences have the same words. However, their meanings are completely different! What makes the meaning different? The answer is the punctuation! Thus, punctuation is very powerful!

Sri Lankan temple music

My Sri Lankan friend shared this music with me. He said that it is played in temples. I'm a Hindu but I've not noticed this music in our temples... perhaps because I don't go to temples often... Anyway, I liked this music.

Celine Dion - Because You Loved Me

Baghban - A good Hindi movie

Recently I watched Baghban, a Hindi movie released in 2003 (available here). Although the movie is 3 hours long, I didn't get bored for even a single second! Truly a great movie! It clearly sends a good message... it is all about a couple who spend everything on their 4 sons but ultimately get nothing (no respect) from them after the husband retires. The man who helped you take your first steps, will you help him to take his last ones, is the question Ravi Chopra asks through Baghban (read more review here).

Will I treat my parents as the four sons did in the movie? Never! I'll never ever try to become like them! Here is a song from the same movie, in which the husband and wife, forced to live apart by their sons to reduce the burden of caring for their parents, sing over the phone to express their love for each other...

Voice Translation

When I first came to France, I was thinking of a device to translate my voice into French, simply because French (and only French) is everywhere, from restaurants to department stores... umm. I had neither a good knowledge of French nor such a device. I would have bought a device that recognizes one's voice and plays it back in the target language if one had been available.

I'm aware of the difficulties of translation. However, like other fields, this area is also making some progress. As an example, Google's Android phone application, shown in the following video, understands a person's voice in English and translates it into French. It's interesting, isn't it?

How mothers deliver a baby

This is a video heavily shared on Facebook. A little girl explains how mothers deliver a baby:

Devices make people feel lonely

I was reading an article about text only friend, a good article indeed. I liked some of the points the writer made. One of them was:

"But the tragic, isolating thing is that we reach for our devices because we don’t want to seem lonely — which is causing us to avoid our peers and actually be lonely."

We say that we feel lonely if we don't have phones, laptops or PDAs, and we use them to avoid that. Actually, if we didn't have such devices, we would look around and communicate with people. Moreover, I've often seen that when we go to meet a friend at his home, we start using his computer to browse the Internet rather than sharing things with each other. Thus, these devices are hindering us from becoming more social.
Another point the writer has mentioned is:

“Everyone wants everyone else to say hi but doesn’t want to be the person saying hi”

I fully agree with this sentence too. I think no reason is needed to justify it. I take it as a premise :).
And finally I learned a new word, “fauxting”, which means nothing but fake texting.

Writing Emails in French ?

I receive many emails each day. No doubt, 99% of them are in French. Unlike at the beginning, I now understand French emails (at least I can guess the meaning :) ). In the worst case, I use online translation services.

These days, I write emails in French too. I remember the hardest day, when I tried to write an email in French to get my transcript from Telecom SudParis. It was a little bit urgent. As the lady there doesn't understand English, I had to write it in French to get it processed quickly. Such a terrible time... I still remember it. I laughed a lot when I learned that "transcript" in English is "relevés de notes" in French :))... because I had been looking for a single French word corresponding to the English one. I had so much fun :)) and pain too :((. Now such practice is making my life easier...

Below I'll present the most commonly used phrases in French emails so that I can reuse them, and so that anyone who wants to write an email in French can use them too. Note that the bold words are in English.
-----------------------------------------------------
Normally we start an email with the greeting 'Bonjour' (Hello).

Hello
Bonjour,

Hello Nobal,
Bonjour Nobal,

Technical Blog kicks off

Now I've launched my Technical blog. I have two more blogs: Phulbari (Nepali) and Angrejee (English & French). My original intention was to write all English stuff in Angrejee. However, I found it difficult... Thus, I launched the Technical blog purely for technical stuff. I'll use Angrejee for non-tech stuff.

The Sixth Sense Technology

Social Media Traffic Changes

Here are some graphs that show the change in social media's traffic. All pictures are taken from mashable.com.

Well-Educated - My definition

Well-Educated : "Some see just water in a river, others see electricity; some see nothing in air, others see power; some see just pollution in waste, others see energy; some see frustration in failures, others see vehicles to success; some see Facebook, Twitter, YouTube, Email and Chat on the Internet, others see the possibilities and the future. If you belong to the 'others', you are well-educated." :)

Introducing Google Translate for Animals

Have FUN Guys :)

Google search tips

Searching is almost compulsory to get the job done. One can use Google, Yahoo!, Bing and other search engines to search for things on the web. Personally, I use Google more often than any other.

The faster one can find things, the more productive one becomes. To find things quickly, we need to know some search tips. Here I'm providing some URLs that give tips for searching the web with Google.

Actually, I haven't been using many of these tips so far... However, I'll now try to use them. Hope I'll be more productive :) !

Tips for using Google Search:

Better Search using Solr and Lucene

Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface. It runs in a Java servlet container such as Apache Tomcat.

A good tutorial for beginners: Better Search with Apache Lucene and Solr
Tutorial at Solr HomePage
Slides:

Apache Solr

Lucene revisited

Lucene is an open-source full-text search library which makes it easy to add search functionality to an application or website. Want to understand Lucene in 5 minutes ? Go here. The following slide provides a quick review of Lucene.

Figure: Steps in building applications using Lucene [Source: IBM ]

Lucene Introduction

Why Lucene ? From this DOC.

Incremental versus batch indexing
Data sources
Indexing Control
File Format
Content Tagging
Stop Word Processing
Stemming
Query Features
Concurrency
Non-English Support

HTML5 - The Future of the Web

This post provides a quick introduction to HTML5- the future of the web.

1. VIDEO

2. SLIDES:

CAPTCHA Ads

What is a CAPTCHA ?

As the name implies, CAPTCHAs were created as a way for websites to differentiate a real human visitor from a bot. CAPTCHA forms allow webmasters to display an image which contains a random string of letters and numbers. Visitors to websites utilizing CAPTCHAs are then prompted to correctly enter the text displayed in the image in order to proceed with certain actions such as registering a new account, or leaving a comment on a blog or forum.

This is done in order to prevent bots from mass registering accounts, automatically posting spammy comments, and sending spam messages to a large number of registered users, among other things. CAPTCHAs have advanced over time to become less vulnerable to bots and scripts attempting to solve the codes while striving to remain user-friendly.

How CAPTCHA Advertising Works

The evolution of CAPTCHAs has inevitably led to a form of advertising. In essence, the core concepts and purpose of CAPTCHAs will remain unchanged. Don’t worry, you’ll still be shown an image that displays a line of text which must be entered correctly in order to proceed. The difference, however, lies in the presentation of the CAPTCHA. Instead of seeing a distorted image that contains randomly generated characters, you will see an image containing text that has been carefully selected by an advertiser.

These advertisers, which will likely span a number of big name national and international corporations, will submit their ads to a company capable of displaying them. Webmasters looking to monetize their CAPTCHA forms will also sign up with this company, and will be given a script that places the customized CAPTCHA on select portions of their website. The advertiser will then pay the said company a set amount of cash every time the CAPTCHA is successfully filled out, and the webmasters will be paid a cut of what the advertiser is paying. The company, acting as an intermediary, then collects the change remaining after paying out the webmaster.

Source
Techi.com

Future computer may understand you

The day is coming... one day computers will understand our emotions (anger, sadness, happiness, surprise and frustration) and treat us accordingly. Read more here.

Google Goggles: A Visual Search Engine

Browser market Share

Pic: Browser Market Shares. Source: www.tomshardware.com

Living VS Dead person in Facebook

I wrote an article (in Nepali) about the policies of email and social networking providers regarding a dead person's account and its contents (available here). Facebook, a popular social networking site, has a policy of turning a person's page into a memorial page after their death. When a person dies, their friends can make the page a memorial page, after which sensitive information such as phone numbers and status updates is hidden.

As Facebook is new, the number of living people on Facebook is higher than the number of dead people. But as it grows older, the trend will change. One article mentions: "Perhaps someday there will be more memorial pages than pages for living people". So there will be more dead people on Facebook than living people, huh!

Interesting Findings: Twitter text analysis

1. Verbs are much more common in their gerund form in Twitter than in general text. “Going”, “getting” and “watching” all appear in the top 100 words or so.

2. “Watching”, “trying”, “listening”, “reading” and “eating” are all in the top 100 first words, revealing just how often people use Twitter to report on whatever they are experiencing (or consuming) at the time.

3. Evidence of greater informality than general English: “ok” is much more common, and so is “f***”.

Source
Oxford-Twitter Analysis

Regarding Twitter

All the content in this blog post is taken from this paper.

Twitter.com is an online social network used by millions of people around the world to stay connected to their friends, family members and coworkers through their computers and mobile phones. The interface allows users to post short messages (up to 140 characters) that can be read by any other Twitter user.

Users declare the people they are interested in following, in which case they get notified when that person has posted a new message. A user who is being followed by another user does not necessarily have to reciprocate by following them back, which makes the links of the Twitter social network directed.

Twitter users are able to post direct and indirect updates. Direct posts are used when a user aims her update to a specific person, whereas indirect updates are used when the update is meant for anyone that cares to read it.

Even though direct updates are used to communicate directly with a specific person, they are public and anyone can see them.

FRIEND : Here, a user’s friend is a person whom the user has directed at least two posts to.
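
To make this definition concrete, here is a minimal Java sketch (my own illustration, not code from the paper). It assumes we already have, for one user, the list of people his or her directed posts were aimed at; the class name, the method name and the sample user names are all hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TwitterFriends {
    // A "friend" is anyone the user has directed at least two posts to.
    static final int MIN_DIRECTED_POSTS = 2;

    // recipients: one entry per directed post, naming the person it was aimed at.
    static Set<String> friendsOf(List<String> recipients) {
        // Count how many directed posts each recipient received.
        Map<String, Integer> postsPerRecipient = new HashMap<>();
        for (String recipient : recipients) {
            postsPerRecipient.merge(recipient, 1, Integer::sum);
        }
        // Keep only the recipients that reach the threshold.
        Set<String> friends = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : postsPerRecipient.entrySet()) {
            if (entry.getValue() >= MIN_DIRECTED_POSTS) {
                friends.add(entry.getKey());
            }
        }
        return friends;
    }

    public static void main(String[] args) {
        // Hypothetical data: three posts directed at "alice", one at "bob".
        List<String> recipients = List.of("alice", "bob", "alice", "alice");
        System.out.println(friendsOf(recipients)); // prints [alice]
    }
}
```

Note that followers and followees play no role here: only directed posts count, which is why the number of friends can stay small while the number of followees keeps growing.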

Research Findings :

the number of posts initially increases as the number of followers increases but it eventually saturates.
the number of posts increases as the number of friends increases
the users who receive attention from many people will post more often than users who receive little attention.
in order to predict how active a Twitter user is, the number of friends is a more accurate signal than the number of his followers.
most users have a very small number of friends compared to the number of followees they declared.
the cost of declaring a new followee is very low compared to the cost of maintaining a friend (i.e. exchanging directed messages with other users). Hence, the number of people a user actually communicates with eventually stops increasing while the number of followees can continue to grow indefinitely.
users with more followers and friends will be more active at posting than those with a small number of followers and friends.
a link between any two people does not necessarily imply an interaction between them. In the case of Twitter, most of the declared links were meaningless from an interaction point of view. Thus the need to find the hidden social network: the one that matters when trying to rely on word of mouth to spread an idea, a belief, or a trend.

Conclusion:

In conclusion, even when using a very weak definition of “friend” (i.e. anyone who a user has directed a post to at least twice) we find that Twitter users have a very small number of friends compared to the number of followers and followees they declare. This implies the existence of two different networks: a very dense one made up of followers and followees, and a sparser and simpler network of actual friends. The latter proves to be a more influential network in driving Twitter usage since users with many actual friends tend to post more updates than users with few actual friends. On the other hand, users with many followers or followees post updates more infrequently than those with few followers or followees.

Real-time web search

I loved this article because it gave me information about real-time web search, which is in its infancy. Real-time web search means searching content as it is being produced. For example, if a great politician dies, people generate content about it very rapidly. Providing relevant information in real time is not so easy. Here I'm listing some of the points that I liked in the article.

Now a delay of minutes on a breaking news story is unacceptable
Real-time search starts by determining that something important is happening in, well, real time.
Real-time search today is in its infancy, but it's the next stage in the evolution of Internet search.
Real-time search should address how the explosion of instant content produced by news organizations, blogs, and social-media users can be organized so that results can be provided instantly.
What is "real-time" content? It centers on the concept of microblogging, or instant publishing of content to the open Web from social-media services. But in practice, "real-time search" is still primarily Twitter search.
Two components to real-time information: the actual content of the status update or post, and the link that is being shared within that update.
Why do web search providers want to buy Twitter's 'Firehose'? Why spend the money? It's simply too difficult to crawl Twitter the way traditional search engines crawl the Web. All three major search engines (Yahoo, Google, Bing) at this point have inked deals to have Twitter push its content directly to them, saving those companies (and Twitter) time, energy, and money.
deadlines are dead in the real-time world.
So if search engines are to remain relevant themselves, they'll need to make sense of this content. And unless social-media networks are able to make their content discoverable, they won't turn into the types of content-discovery engines that their public-relations people like to imagine are already here.
Expect the importance of real-time search to only grow over the next several years. For example, Yahoo's search deal with Microsoft does not include real-time indexing and ranking efforts, as the company believes that it's too important to give away.

Interesting Links:

Oneriot.com - Based on the premise that the link being shared within the status update is more relevant than the message itself.
Wowd.com - An example of a real-time web search engine

Text Normalizer - Dealing with accents

I had to sort French texts in alphabetical order. It was not as simple as comparing English strings because we must deal with French accents such as é and à.

If we don't process anything and use the simple string comparison function, we get équipement after zebra. However, we need équipement between the words starting with 'd' and 'f', i.e. we want équipement to sort as if it were equipement. To solve the problem, we must normalize the strings and remove the diacritics before comparing them:

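import java.text.Normalizer;

// NFD splits each accented letter into a base letter plus combining marks;
// the regular expression then deletes the combining marks (U+0300 to U+036F).
String normalizedStr1 = Normalizer.normalize(Text1, Normalizer.Form.NFD).replaceAll("[\u0300-\u036F]", "");
String normalizedStr2 = Normalizer.normalize(Text2, Normalizer.Form.NFD).replaceAll("[\u0300-\u036F]", "");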

Now we compare normalizedStr1 and normalizedStr2 instead of Text1 and Text2.
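
Putting the pieces together, here is a small self-contained sketch of how the sorting could look. The class name AccentSort, the two helper methods and the sample word list are my own choices for illustration; only Normalizer and the regular expression come from the snippet above. The letter é is written as \u00e9 so that the file compiles regardless of its encoding.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class AccentSort {
    // Decompose to NFD, then drop the combining marks (U+0300 to U+036F).
    static String stripDiacritics(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD)
                .replaceAll("[\\u0300-\\u036F]", "");
    }

    // Returns a sorted copy; the words themselves keep their accents,
    // only the sort key is stripped.
    static List<String> sortIgnoringAccents(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(Comparator.comparing(AccentSort::stripDiacritics));
        return sorted;
    }

    public static void main(String[] args) {
        // "\u00e9quipement" is équipement.
        List<String> words = List.of("zebra", "fleur", "\u00e9quipement", "dame");
        // Order after sorting: dame, équipement, fleur, zebra
        System.out.println(sortIgnoringAccents(words));
    }
}
```

This comparison is still case-sensitive, and two words that differ only by an accent compare as equal. If that matters, java.text.Collator with a French locale is probably worth a look, although I have not tried it for this task.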

The Great Walls

You might have noticed the word 'Walls' in the title. Actually, I was reading a technical article, China's Great Firewall spreads overseas, and I noticed the phrase "Great Firewall" for the first time. So, in the world there are two great walls: The Great Wall and The Great Firewall. Both lie in China :).

Facebook wants to be the main river ?

A tributary is a stream or river which flows into a main stem (or parent) river. Facebook wants every site on the web to be a tributary. And it wants to be the main river, using the Open Graph API.

Basically, the Open Graph API is a way for Facebook to allow other companies, sites, services, etc to interact with Facebook without having to create a dedicated Facebook Page.
With the Open Graph API, Facebook wants to allow anyone to take their own site and essentially wrap it in a Facebook blanket. This doesn’t necessarily mean in a visual way, but rather that the sites which use the APIs will be able to replicate much of the core Facebook functionality on their own sites.
So you can imagine that you might be able to create a Facebook-style Wall to include on your site, but able to update your statuses from your site, leave comments, like items, etc. Again, it’s like a Facebook Page, but it would be on your site. And you can only include elements you want, and leave out others.

Text 2.0 - Interesting

One of my friends has been keeping his messenger status as "Everything is 2.0" for the last few months. It is because he is working on Web 2.0 and communication systems, and he thinks that everything is changing. Text was at 1.0 but now it is approaching 2.0. This fact supports him :). Watch the video below, it is a really interesting one!

Facebook beats Google

Here is some interesting news copied from consumerist.com:

It's official -- playing Farmville and tagging friends in photos (and consequently untagging embarrassing photos of yourself from your friends' photos) has become more popular than actually trying to find things on the internet, as a new report shows Facebook edged out Google as the most-visited site on the internet last week.

According to Hitwise, Facebook accounted for 7.07% of all web traffic for the week ending March 13. That barely edges out Google's 7.03%.

This is huge news for Facebook, who only a year ago accounted for around 2% of U.S. web traffic.

XML Namespace: Attributes are a little different

An attribute can appear in a different namespace than the element that contains it. For example, <movie:title xml:lang="fr"> has an attribute that is not from the movie namespace. If an attribute name has a prefix, its name is in the namespace indicated by the prefix. However, if an attribute name has no prefix, it has no namespace. This is true even when the default namespace has been assigned. The W3C Namespaces in XML Recommendation makes that point with this example:

<x xmlns="http://www.w3.org" xmlns:n1="http://www.w3.org">
  <good a="1" n1:a="2" />
</x>

The elements are affected by the declaration of a URI for the default namespace. That is, both x and good are associated with the URI "http://www.w3.org" because it's the default namespace. The attribute n1:a is also associated with that namespace, due to its use of the n1 prefix, which is associated with the same URI. There is no conflict that the a attribute is being declared twice, because while n1:a is in the http://www.w3.org namespace, the unprefixed a is not; the latter is not in any namespace.
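
To check this behaviour myself, here is a minimal sketch using the DOM parser that ships with Java. It parses the example above with namespace support switched on and prints the namespace of the element and of both attributes. The class name AttributeNamespaces and the helper parseGood are my own; the XML is the one from the Recommendation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class AttributeNamespaces {
    static final String NS = "http://www.w3.org";
    static final String XML =
        "<x xmlns=\"" + NS + "\" xmlns:n1=\"" + NS + "\">"
        + "<good a=\"1\" n1:a=\"2\" />"
        + "</x>";

    // Parses the sample with namespace support switched on and returns <good>.
    static Element parseGood() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // namespace processing is off by default
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            return (Element) doc.getDocumentElement()
                .getElementsByTagNameNS(NS, "good").item(0);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        Element good = parseGood();
        // The element picks up the default namespace.
        System.out.println(good.getNamespaceURI());
        // The unprefixed attribute a is in no namespace, so this prints null.
        System.out.println(good.getAttributeNode("a").getNamespaceURI());
        // The prefixed attribute n1:a is found through the namespace URI; prints 2.
        System.out.println(good.getAttributeNS(NS, "a"));
    }
}
```

So the two attributes really are different names: one is looked up without a namespace, the other through the URI bound to the n1 prefix.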

Reference:

Copied from XML Namespace
Another Interesting tutorial for XML NS: Here

Dependency Trees

A dependency tree is a graphical representation of a sentence parsed using a dependency grammar. The nodes in the tree correspond to words in the sentence being parsed (and sometimes to special synthesized nodes). The arcs correspond to dependency relations between a "head" word, at the upper end of an arc, and the dependent words at the lower ends of the arcs connected to the head word. The grammatical relations between head and dependent words are such things as subject, object, modifier, etc.
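
As a rough illustration of this structure, here is a small Java sketch in which each node holds a word, its relation to its head, and its dependents. The class, the tiny sentence "Jim bought shares" and the relation labels attached to it are my own assumptions for illustration; a real parser would produce the tree, and different dependency grammars use different label sets.

```java
import java.util.ArrayList;
import java.util.List;

public class DependencyTree {
    static class Node {
        final String word;
        final String relation; // relation to the head word; "root" for the top node
        final List<Node> dependents = new ArrayList<>();

        Node(String word, String relation) {
            this.word = word;
            this.relation = relation;
        }

        // Adds an arc from this head word down to a dependent word.
        Node add(Node dependent) {
            dependents.add(dependent);
            return this;
        }
    }

    // Hand-built tree for "Jim bought shares": the verb is the head.
    static Node sample() {
        return new Node("bought", "root")
            .add(new Node("Jim", "subject"))
            .add(new Node("shares", "object"));
    }

    // Renders the tree with the head above its indented dependents.
    static String render(Node node, int depth) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.word).append(" (").append(node.relation).append(")\n");
        for (Node dependent : node.dependents) {
            out.append(render(dependent, depth + 1));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(sample(), 0));
    }
}
```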

TPTP - A Java Profiling Tool

In software engineering, program profiling, software profiling or simply profiling, a form of dynamic program analysis (as opposed to static code analysis), is the investigation of a program's behavior using information gathered as the program executes. The usual purpose of this analysis is to determine which sections of a program to optimize - to increase its overall speed, decrease its memory requirement or sometimes both.

The set of profiling tools provides software developers or testers with the ability to analyze the performance of a Java program or to gain a comprehensive understanding of the overall performance of an application. Eclipse Test and Performance Tools Platform (TPTP) is such a tool used for profiling. A good tutorial is here: Tutorial.

Ontology learning

Ontology learning, also known as ontology extraction, ontology generation or ontology acquisition, is a semi-automatic form of information extraction which is used to build an ontology from scratch (finding concepts and their relations), or to enrich or adapt an existing ontology.

It's very interesting

After a long time, I've written an article in French. In fact, I have some interesting news. Here it is:

Even today, there is no electricity and there are no roads in my town. Life is difficult. For example, I can call my parents only once a month! So one can imagine the situation and the life there. As a result, the people of my town want to leave the village to find work and a better life.

This migration trend is normal. But the trend gives interesting results. For example, today I found a person whom I had seen many years ago. I last saw him in my town 15 years ago! Thanks to the web, thanks to my blog articles and thanks to Google. Because of them, he found me, and so I found my brother from my town. It's very interesting... isn't it? ;)

Federated Search

Federated search is the simultaneous search of multiple online databases or web resources and is an emerging feature of automated, web-based library and information retrieval systems. It is also often referred to as a portal or a federated search engine.

Quotes that I like

"It is not enough to have a good mind; the main thing is to use it well." - Rene Descartes
"Education is the great engine of personal development. It is through education that the daughter of a peasant can become a doctor, that the son of a mineworker can become the head of the mine, that a child of farmworkers can become the president of a great nation. It is what we make out of what we have, not what we are given, that separates one person from another" ~ Nelson Mandela.
"It always seems impossible until it's done" ~ Nelson Mandela.

Finding 'a word' using Regular Expression

We frequently need to test whether a word (rather than a pattern) exists in another string or not. To illustrate, consider the following two strings:

String1: This fact is very important to understand.
String2: port

Now two interesting cases arise:

Find whether the pattern "port" appears in String1: In this case the regular expression would be: String regExp=".*"+ String2 +".*"; Clearly, we don't care what comes before and after the pattern. The match succeeds because the pattern port is there in String1 (the word important contains it).
Find whether the word "port" appears in String1: The regular expression in this case is: String regExp=".*\\b"+ String2 +"\\b.*"; The following Java code is used to test this:

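import java.util.regex.Matcher;
import java.util.regex.Pattern;

String regExp = ".*\\b" + String2 + "\\b.*";
Pattern p = Pattern.compile(regExp);
Matcher m = p.matcher(String1);
if (m.matches()) {
    // String2 appears in String1 as a whole word
}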

The match fails here because "port" doesn't appear in String1 as a word. It just appears as a pattern. If String1="The port was far" then the pattern matches because port appears as a word.

The key role is played by \b, the word boundary of regular expressions, which matches only at the start or end of a word.
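
Here is a slightly more defensive, self-contained variant of the same idea. It is only a sketch: the class and method names are my own, it uses find() so that the leading and trailing .* are not needed, and it wraps the word in Pattern.quote so that a search word containing characters like . or + is treated literally.

```java
import java.util.regex.Pattern;

public class WordMatch {
    // True if fragment occurs anywhere in text, even inside a longer word.
    static boolean containsPattern(String text, String fragment) {
        return text.contains(fragment);
    }

    // True only if word occurs in text delimited by word boundaries.
    static boolean containsWord(String text, String word) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        String string1 = "This fact is very important to understand.";
        String string2 = "port";
        System.out.println(containsPattern(string1, string2));        // prints true
        System.out.println(containsWord(string1, string2));           // prints false
        System.out.println(containsWord("The port was far", string2)); // prints true
    }
}
```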

Named Entity Recognition

Named Entity Recognition (NER) is also known as entity extraction and entity recognition. NER, a subtask of information extraction, is a process of finding mentions of specified things in the given text. In other words, it seeks to locate and classify atomic elements in text into predefined categories such as the names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc.

Most research on NER systems has been structured as taking an unannotated block of text, such as this one:
Jim bought 300 shares of Acme Corp. in 2006.

And producing an annotated block of text, such as this one:

<ENAMEX TYPE="PERSON">Jim</ENAMEX> bought <NUMEX TYPE="QUANTITY">300</NUMEX> shares of <ENAMEX TYPE="ORGANIZATION">Acme Corp.</ENAMEX> in <TIMEX TYPE="DATE">2006</TIMEX>

In this example, the annotations have been done using so-called ENAMEX tags that were developed for the Message Understanding Conference in the 1990s.
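
To show what such annotations give us in practice, here is a small Java sketch that reads the entities back out of the annotated text with a regular expression. This is not an NER system (it does no recognition at all), just a reader for already-annotated text. The class name and the output format are my own, and the pattern only covers the simple, non-nested tags shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EnamexReader {
    // Matches <TAG TYPE="...">text</TAG> where TAG is ENAMEX, NUMEX or TIMEX.
    private static final Pattern TAG = Pattern.compile(
        "<(ENAMEX|NUMEX|TIMEX) TYPE=\"([^\"]+)\">(.*?)</\\1>");

    // Returns one "TYPE: text" entry per annotated entity, in document order.
    static List<String> entities(String annotated) {
        List<String> found = new ArrayList<>();
        Matcher m = TAG.matcher(annotated);
        while (m.find()) {
            found.add(m.group(2) + ": " + m.group(3));
        }
        return found;
    }

    public static void main(String[] args) {
        String annotated =
            "<ENAMEX TYPE=\"PERSON\">Jim</ENAMEX> bought "
            + "<NUMEX TYPE=\"QUANTITY\">300</NUMEX> shares of "
            + "<ENAMEX TYPE=\"ORGANIZATION\">Acme Corp.</ENAMEX> in "
            + "<TIMEX TYPE=\"DATE\">2006</TIMEX>";
        // Prints PERSON: Jim, QUANTITY: 300, ORGANIZATION: Acme Corp., DATE: 2006
        for (String entity : entities(annotated)) {
            System.out.println(entity);
        }
    }
}
```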

Performance: state-of-the-art NER systems for English produce near-human performance.
Tools: Wikipedia lists a number of open source tools such as MALLET

Digg: Social News Website

Digg is a social news website where people share links and stories. Each registered user can vote and comment on the shared items. The content is ordered by user votes: the more votes an item gets, the higher it ranks. Wikipedia mentions that other social networking websites were inspired by Digg's sharing and voting features.
