Blast from the Past: ExtractBib.pl

Tuesday, January 21st, 2014 -- By ET

A number of years ago, I wrote a program that extracts, from a huge BibTeX file, just the subset of entries cited in a given .tex file.
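For readers curious how such an extraction can work, here is a minimal Python sketch. It is not the original Perl script. The function names (cited_keys, bib_entries, extract_bib) and the sample keys are hypothetical, and it assumes that citations use \cite-style commands and that BibTeX entries are delimited by braces rather than parentheses.

```python
import re

# Matches \cite, \citep, \citet*, ... with optional [..] arguments.
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Matches the start of a brace-delimited entry such as "@article{key,".
ENTRY_RE = re.compile(r"@\w+\s*\{\s*([^,\s{}]+)\s*,")


def cited_keys(tex_source):
    """Return the set of citation keys used in a LaTeX source string."""
    keys = set()
    for group in CITE_RE.findall(tex_source):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def bib_entries(bib_source):
    """Yield (key, entry_text) pairs; each entry's end is found by brace counting."""
    pos = 0
    while True:
        match = ENTRY_RE.search(bib_source, pos)
        if match is None:
            return
        depth = 0
        end = None
        for i in range(match.start(), len(bib_source)):
            if bib_source[i] == "{":
                depth += 1
            elif bib_source[i] == "}":
                depth -= 1
                if depth == 0:
                    end = i + 1
                    break
        if end is None:  # unbalanced braces: stop rather than guess
            return
        yield match.group(1), bib_source[match.start():end]
        pos = end


def extract_bib(tex_source, bib_source):
    """Keep only the entries whose key is cited in the LaTeX source."""
    wanted = cited_keys(tex_source)
    return "\n\n".join(
        text for key, text in bib_entries(bib_source) if key in wanted
    )


# Made-up sample input, for illustration only.
tex = r"See \cite{alpha2001, beta2002} and \citep[p.~3]{gamma2003}."
bib = """
@article{alpha2001,
  title = {Placeholder {A}},
}
@book{unused1999,
  title = {Never cited},
}
@misc{beta2002,
  title = {Placeholder B},
}
"""
subset = extract_bib(tex, bib)
print(subset)
```

A cited key that is missing from the .bib file (gamma2003 here) is simply skipped; a real tool would probably want to warn about it.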

Today I received the following email from Dr. Florian Kluge of Universität Augsburg, Institut für Informatik:

Dear Professor Zhang,

Some years ago I found your extractbib.pl script – it was a great alleviation
for me. Thank you for the great work!
In the meantime, my requirements changed a bit, and thus I extended the script
to be able to handle multiple .tex and .bib files.
I attached the extended version to this mail, please use/distribute it if you

Best regards,
Florian Kluge

This is a delightful surprise. I’m very happy that my work could be reused.

I have attached Dr. Kluge’s new script here; hopefully it will benefit more people.

Updated Script


Why SOPA is Bad for Innovation

Wednesday, January 18th, 2012 -- By ET

SOPA stands for the Stop Online Piracy Act, a bill introduced in the US House of Representatives. Websites like Google and Wikipedia argue that it amounts to Internet censorship and that it will cripple the Internet.

Wikipedia is showing a blackout page today in protest, and the blackout will last for 24 hours.

In my paper with Feng Zhu, published in June 2011 in the American Economic Review (AER), we studied the effect of blocks of the Chinese Wikipedia on incentives to contribute. (You can download the paper from here.)

The obvious direct effect is that many people could no longer contribute to Wikipedia.

The following figure suggests that contributions from China dropped significantly after the block.

The bar chart further compares the contributions from different regions before and after the block.

An indirect effect is that people who were not directly blocked also reduced their contributions significantly. In the paper we estimate that contributions from those outside China fell by more than 40% within the 4 weeks after the block.

The next figure shows the number of new contributors. It can be clearly seen that each block (shown in shaded areas) significantly reduced the number of new contributors.

The lesson learned: information needs to be free. Legislation not well thought out may bring unintended consequences.

Tuesday, May 17th, 2011 -- By ET

Thoughts on Survival of the Fittest

Wednesday, March 30th, 2011 -- By ET

The theory of the survival of the fittest suggests that species adapt to their environment to survive. For example, insects may develop colorful patterns on their wings to scare off potential predators. The biological foundation for such changes and adaptations is genetic mutation.

Then a question arises: suppose a mutation suddenly makes an animal poisonous. Why can't the poisonous trait persist the way the colorful patterns on insects do? Fortunately it does not; if it did, many of the animals we see today would be poisonous!

One explanation is that the colorful patterns are salient features, so they protect the insects immediately. In the long run, this feature can be passed on to future generations. The poisonous feature is not salient. So a wolf will continue to eat that poisonous rabbit, and the poor rabbit has no chance to pass on this great feature.

In a sense, these poisonous rabbits are like experience goods. Before you consume them, there is no way for you to know their quality. You only learn about their quality after you eat them, but by then it is often too late for regrets. It is a blessing for us that there are no review systems (e.g., ones similar to Yahoo Movies or Yelp) in nature. If there were, the poisonous rabbits could pass on the feature and today we would have to face many more poisonous animals.

Sunday, February 13th, 2011 -- By ET

Friday, October 15th, 2010 -- By ET


Friday, August 27th, 2010 -- By ET

I ran into a senior professor of Statistics in the hallway. He noticed that I had moved to a new office.

I told him that the new office has a window, and that my research productivity had increased fivefold.

He then said: “Wonderful, they should have installed two windows for you.”


German Publicity

Thursday, August 26th, 2010 -- By ET

This blog recently saw a big surge in the number of visitors from Germany. With a little tracing, I found the following website: http://www.handelsblatt.com/politik/wissenswert/studie-wie-wikipedia-markttheorien-widerlegt;2640653;0

The website is called “Handelsblatt Economy Newsletter”, and it reports on “New trends in economics and business administration”.

It is entirely in German, and the article specifically discusses the forthcoming AER paper that Feng and I wrote about Wikipedia.

I used Google to translate the article, and it looks quite nice to me.

========Translated Article Below=========

LONDON. What a mistake. In the summer of 2002, the “Berliner Zeitung” was one of the first German media outlets to report on the Internet encyclopedia Wikipedia. It was fascinated, but also skeptical: “But however popular this reservoir of knowledge may be, and however steadily it may grow, in the near future it will probably not succeed in outdoing reference works such as the Brockhaus.” Not even six years later, the Brockhaus publisher announced the end of its printed encyclopedia, and Wikipedia is now one of the most frequently visited Internet sites worldwide. Tens of thousands of people work on it voluntarily and without any fee.

A success story that leaves economists struggling for explanations. Their traditional theories suggest that the rise of the Internet encyclopedia should never have happened. Why should rational individuals make the effort to write encyclopedia articles free of charge for an anonymous audience? Any Internet user can use the online encyclopedia without contributing articles himself.

Thus, Wikipedia is what economists call a “public good”: an offering that benefits everyone and from whose use no one can be excluded. Classic examples are dikes and street lights. With public goods, as budding economists learn in their introductory courses, there is a big dilemma: there are strong incentives to free-ride, that is, to take advantage of the offering without providing anything in return. Traditional economics postulates that the greater the number of potential beneficiaries, the more problems arise with free riders.

At least with Wikipedia, exactly the opposite is the case, according to a new study that will appear in the “American Economic Review”: the greater the number of potential readers, the more people are willing to devote their working time to the online encyclopedia, probably because they draw satisfaction from the fact that their texts are read by many others.

The researchers Xiaoquan Zhang (Hong Kong University of Science and Technology) and Feng Zhu (University of Southern California) demonstrate this effect using the example of the Chinese Wikipedia. They exploit the fact that the government in Beijing has repeatedly censored the site because of politically unwelcome information. From October 2005, for example, Internet users in China could not access Wikipedia for nearly a year. The block drastically reduced Wikipedia's audience overnight. Millions of Internet users were suddenly excluded. For Chinese speakers in Taiwan, Hong Kong and the rest of the world, however, the site remained available.

What impact did that have on activity on the website? The researchers focused on the behavior of users outside the People’s Republic of China: people who, despite the block, still had access to the site and could edit it. Zhang and Zhu exploit the fact that all changes to texts on Wikipedia are recorded in detail and can be traced to the country in which the authors live. They compared activity on Wikipedia immediately before and after the block. They found that with the start of the block, even Chinese-speaking Internet users outside the People’s Republic showed considerably less interest in Wikipedia: suddenly they wrote fewer new contributions and expanded existing texts much less often.

“The participation of authors who were not blocked decreased on average by 42.8 percent,” the economists note. The reason: collaborating on Wikipedia gives individual authors satisfaction, which the researchers call a “social benefit”. “The shrinking group size reduces this benefit,” they write.

One indication of this is that precisely those authors for whom the social aspect of Wikipedia was very important, and who had been very active on the encyclopedia's discussion pages, wrote significantly less once the block began. “Our study provides empirical evidence that social effects may be stronger than the tendency to free-ride,” is the bottom line.

The study shows once again that economists make a mistake when they declare people to be pure egoists: they then cannot properly explain many phenomena of real life.

Fixing CrossOver in Snow Leopard 10.6

Wednesday, August 18th, 2010 -- By ET

A system update broke my almost-perfect installation of BaKoMa TeX, which I run through CrossOver on Mac OS X 10.6.

I suffered from this tragedy for a few months. Each time I needed to work with LaTeX, I had to boot Windows 7 through Boot Camp. I tried resetting the Java Virtual Machine and so on, but nothing fixed the problem.

Then today I thought about the error message it gave when I tried to open CrossOver. It said “can’t load ‘/system/library/perl/extras/5.10.0…”, so it struck me that maybe it was Perl that needed fixing.

I went to /usr/bin to list perl versions:
/usr/bin$ ls -l perl*
lrwxr-xr-x 1 root wheel 9 Aug 18 10:50 perl -> perl5.10.0
-rwxr-xr-x 1 root wheel 86000 Jun 24 2009 perl.old
-rwxr-xr-x 1 root wheel 51200 Jun 24 2009 perl5.10.0
-rwxr-xr-x 1 root wheel 34816 Jun 24 2009 perl5.8.9
-rw-rw-rw- 34 root wheel 807 Jun 24 2009 perlbug
-rwxr-xr-x 1 root wheel 38307 Jun 24 2009 perlbug5.10.0
-rwxr-xr-x 1 root wheel 45068 Jun 24 2009 perlbug5.8.9
-rw-rw-rw- 34 root wheel 807 Jun 24 2009 perlcc
-rwxr-xr-x 1 root wheel 17983 Jun 24 2009 perlcc5.8.9
-rw-rw-rw- 34 root wheel 807 Jun 24 2009 perldoc
-rwxr-xr-x 1 root wheel 255 Jun 24 2009 perldoc5.10.0
-rwxr-xr-x 1 root wheel 254 Jun 24 2009 perldoc5.8.9
-rw-rw-rw- 34 root wheel 807 Jun 24 2009 perlivp
-rwxr-xr-x 1 root wheel 12309 Jun 24 2009 perlivp5.10.0
-rwxr-xr-x 1 root wheel 12304 Jun 24 2009 perlivp5.8.9
-rw-rw-rw- 34 root wheel 807 Jun 24 2009 perlthanks
-rwxr-xr-x 1 root wheel 45068 Jun 24 2009 perlthanks5.8.9

Perl 5.10.0 is a 64-bit version, so I downgraded to Perl 5.8.9 by running the following commands in /usr/bin:

sudo rm perl
sudo ln -s perl5.8.9 perl

Then it worked like a charm.
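Removing the link and then recreating it leaves a brief moment in which no perl exists. The relinking can instead be done as a single rename. The following is a minimal Python sketch under assumptions, not the commands used above: it works in a scratch directory with empty placeholder files rather than in /usr/bin, and repoint_symlink is a hypothetical helper name.

```python
import os
import tempfile


def repoint_symlink(link_path, new_target):
    """Replace link_path with a symlink to new_target using one rename."""
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(new_target, tmp_path)
    # On POSIX systems, renaming over an existing name is atomic.
    os.replace(tmp_path, link_path)


with tempfile.TemporaryDirectory() as scratch:
    # Empty stand-ins for the two interpreter binaries.
    for name in ("perl5.10.0", "perl5.8.9"):
        open(os.path.join(scratch, name), "w").close()

    link = os.path.join(scratch, "perl")
    os.symlink("perl5.10.0", link)  # state after the system update
    before = os.readlink(link)

    repoint_symlink(link, "perl5.8.9")  # the downgrade
    after = os.readlink(link)

print(before, "->", after)  # prints "perl5.10.0 -> perl5.8.9"
```

Keeping the targets relative, as in the original ln -s command, means the link still resolves if the directory is moved.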

Traits of Successful Business Executives

Friday, August 13th, 2010 -- By ET

I’m doing a literature review for a paper of mine. I came across the following paper:

The Business Executive: The Psychodynamics of a Social Role

By: William E. Henry

The American Journal of Sociology, Vol. 54, No. 4, Industrial Sociology (Jan., 1949), pp. 286-291.


It was written in 1949, and it discusses the common characteristics of successful business executives. Typically I do not find these descriptive papers useful, but it is interesting to see how people in 1949 perceived what executives should do to be successful.

The paper listed the following personality patterns that are common for success:

    Achievement Desires
    Mobility Drive
    Idea of Authority
    Ability to Organize Unstructured Situations
    Strong Self-Structure
    Apprehension and the Fear of Failure
    Activity and Aggression
    Strong Reality Orientation
    Different Interpersonal Relations with respect to Superiors and Subordinates
    Broken Tie with his own Parents
    Dependency Feelings and Concentration Upon Self

That is a long list. If you check these items against people we know, say Steve Jobs, you would probably be amazed at how accurately they can “predict” his success. I’m always suspicious of this type of work because it obviously omits the failed cases. It could be that people who share these traits fail more often, but because of the sample selection problem we cannot observe them. What if some other factors are driving the success of these people, and they just learned to behave in this way (i.e., behaving in this way does not produce success)?
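A small calculation makes the selection problem concrete. The numbers below are invented purely for illustration; they come neither from Henry's paper nor from any data. Suppose 90% of aspiring executives have a given trait, and those with it succeed 5% of the time, versus 8% for those without it. The function name is hypothetical.

```python
def trait_share_among_successes(prevalence, p_success_with, p_success_without):
    """Fraction of successful people who have the trait (Bayes' rule)."""
    successes_with = prevalence * p_success_with
    successes_without = (1 - prevalence) * p_success_without
    return successes_with / (successes_with + successes_without)


# Illustrative assumptions only: the trait is common and slightly harmful.
share = trait_share_among_successes(
    prevalence=0.9, p_success_with=0.05, p_success_without=0.08
)
print(f"{share:.1%}")  # prints "84.9%"
```

Under these assumptions about 85% of successful executives show the trait even though it lowers the odds of success, so a sample made up only of successes would still list it as a "common pattern".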

This brings me back to the argument of my paper: when people assume social roles, they behave according to the perceived traits of these roles. In many situations, the list of characteristics is a result of being successful, not a source of it.

