Author Archive

B2B approach based on data mining vs. social networking approach?

August 10, 2010

Thenewlead is building its own B2B marketing data/service system based on data mining, while everyone else is doing the opposite: playing around with social networks.

Technically it's an interesting thing to work on. Whether it can be monetized in B2B marketing, however, is a different story.

I will review the details soon.

Categories: Uncategorized

Asking the right questions: interviews with tech companies

August 10, 2010

I will summarize all my experiences with Facebook, Google, Amazon, BrightEdge and other giants.

Categories: Uncategorized

A new paper on P ≠ NP: it might be an unsurprising yet huge milestone for CS

August 10, 2010

It might take weeks to finish reading this one:

Millions of people are reading it too ;).

Categories: Uncategorized

Klout and basic categorization ML approach

July 12, 2010

The classical Machine Learning problem called categorization (or clustering) is not only academically beautiful but also efficient in practice. The motivation behind it is simple: “tell me who your friends are and I will tell you who you are”. There are a number of algorithms for it, such as Naive Bayes and SVM.
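To make the categorization idea concrete, here is a minimal sketch of a Naive Bayes text classifier with add-one smoothing, using only the standard library. The class name TinyNaiveBayes, the labels "tech" and "sports", and the sample sentences are all made up for illustration; this is not how Klout or any real service does it.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Toy multinomial Naive Bayes over whitespace-separated words (illustrative only)."""

    def __init__(self):
        self.class_counts = Counter()            # number of documents per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()                       # every word seen in training

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            self.class_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior: how common this label is in the training data.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                if word not in self.vocab:
                    continue  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids log(0) for words unseen in this label.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: short texts with hand-assigned labels.
docs = [
    "new python library released",
    "machine learning model training",
    "football match tonight",
    "team wins the league match",
]
labels = ["tech", "tech", "sports", "sports"]
model = TinyNaiveBayes().fit(docs, labels)
```

Calling model.predict("python machine learning") picks the label whose training texts share the most words with the input, which is the "tell me who your friends are" idea applied to words.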

The Klout scoring problem is similar, yet the answer is slightly different. It scores your influence based on what you read and what you tweet/retweet, rather than on whom you are following.

I am now focusing on a number of ways to compute a Klout score based on a clustering/classification ML approach.

Categories: technology

Why Twitter calls it Snowflake

July 12, 2010

The name tells its meaning.

There is a well-known saying that no two snowflakes are identical, and neither are two sequence IDs in the database. This is the first reason.

The second reason is the way they generate the sequence ID. A snowflake is formed by a combination of various conditions: wind, temperature, moisture, atmosphere and so on. All these conditions are factors of place and time, and so are the inputs to the sequence ID. Twitter generates each sequence ID from the server ID and a value generated on that server at that moment.

This is an old trick for a new problem.
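A rough sketch of this kind of ID generation, not Twitter's actual code: each ID packs a millisecond timestamp, a server ID and a per-server counter into one integer. The bit widths (10 bits for the server, 12 for the counter), the EPOCH_MS constant and the names SnowflakeLikeGenerator, next_id and decode are assumptions chosen for illustration.

```python
import threading
import time

# Assumed layout: [timestamp ms since EPOCH_MS | 10-bit server ID | 12-bit counter]
EPOCH_MS = 1262304000000  # arbitrary fixed epoch (2010-01-01 UTC), an assumption
SERVER_BITS = 10
SEQ_BITS = 12
MAX_SEQ = (1 << SEQ_BITS) - 1


def _now_ms():
    return int(time.time() * 1000)


class SnowflakeLikeGenerator:
    """Illustrative generator: unique per server, roughly ordered by time."""

    def __init__(self, server_id, clock=_now_ms):
        if not 0 <= server_id < (1 << SERVER_BITS):
            raise ValueError("server_id out of range")
        self.server_id = server_id
        self._clock = clock
        self._last_ms = -1
        self._seq = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the counter.
                self._seq = (self._seq + 1) & MAX_SEQ
                if self._seq == 0:
                    # Counter exhausted: spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._seq = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (SERVER_BITS + SEQ_BITS))
                | (self.server_id << SEQ_BITS)
                | self._seq
            )


def decode(id_value):
    """Split an ID back into (timestamp_ms, server_id, counter)."""
    seq = id_value & MAX_SEQ
    server_id = (id_value >> SEQ_BITS) & ((1 << SERVER_BITS) - 1)
    timestamp_ms = (id_value >> (SERVER_BITS + SEQ_BITS)) + EPOCH_MS
    return timestamp_ms, server_id, seq
```

Because the timestamp sits in the high bits, IDs from one server always increase, and IDs from different servers differ in the server bits even when they are generated in the same millisecond. No coordination between servers is needed, which is the point of the trick.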

More on the k-sorted algorithm later.

Categories: technology