NullVoid » Technology


Blast from Past: ExtractBib.pl

Tuesday, January 21st, 2014 -- By ET

A number of years ago, I wrote a program to extract the right subset of bib entries for a tex file from a huge bibtex file.

Today I received the following email from Dr. Florian Kluge of Universität Augsburg, Institut für Informatik:

Dear Professor Zhang,

Some years ago I found your extractbib.pl script – it was a great alleviation
for me. Thank you for the great work!
In the meantime, my requirements changed a bit, and thus I extended the script
to be able to handle multiple .tex and .bib files.
I attached the extended version to this mail, please use/distribute it if you

Best regards,
Florian Kluge

This is a delightful surprise. I’m very happy that my work could be reused.

I have attached Dr. Kluge’s new script here. Hopefully it will benefit more people.

Updated Script
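
For readers who just want the idea, here is a minimal Python sketch of the extraction described above. It is not the original extractbib.pl and not Dr. Kluge's version; the function names are hypothetical, and it assumes citations use cite-style commands and that BibTeX entries are delimited with braces.

```python
import re

# Matches cite-style commands (cite, citep, citet, ...) with optional [..] arguments.
CITE_RE = re.compile(r"\\[a-zA-Z]*cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Matches the start of a BibTeX entry of the form @type{key,
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s{}]+)\s*,")


def cited_keys(tex_source):
    """Return the set of citation keys used in a LaTeX source string."""
    keys = set()
    for match in CITE_RE.finditer(tex_source):
        keys.update(k.strip() for k in match.group(1).split(",") if k.strip())
    return keys


def split_entries(bib_source):
    """Map each entry key to its full text by counting braces from the '@'."""
    entries = {}
    for match in ENTRY_RE.finditer(bib_source):
        depth = 0
        for pos in range(match.start(), len(bib_source)):
            if bib_source[pos] == "{":
                depth += 1
            elif bib_source[pos] == "}":
                depth -= 1
                if depth == 0:
                    entries[match.group(2)] = bib_source[match.start():pos + 1]
                    break
    return entries


def extract_subset(tex_source, bib_source):
    """Return only the bib entries cited in the tex source, in bib-file order."""
    wanted = cited_keys(tex_source)
    entries = split_entries(bib_source)
    return "\n\n".join(text for key, text in entries.items() if key in wanted)
```

To handle several .tex or .bib files with this sketch, one could simply concatenate their contents before calling extract_subset.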


Why SOPA is Bad to Innovation

Wednesday, January 18th, 2012 -- By ET

SOPA stands for the Stop Online Piracy Act, a bill introduced in the US House of Representatives. Websites like Google and Wikipedia believe it amounts to Internet censorship and will cripple the Internet.

Wikipedia showed a blackout page today to protest, and it will last for 24 hours.

In my paper with Feng Zhu, published in June 2011 in the American Economic Review (AER), we studied the effect of blocks of the Chinese Wikipedia on incentives to contribute. (You can download the paper from here.)

A direct effect is obviously that many people can no longer contribute to Wikipedia.

The following figure suggests that contributions from China dropped significantly after the block.

The bar chart further compares the contributions from different regions before and after the block.

An indirect effect is that people who were not directly blocked also reduced their contribution significantly. In the paper we estimate that the reduction in contribution from those who are outside China is more than 40% within the 4 weeks after the block.

The next figure shows the number of new contributors. It can be clearly seen that each block (shown in shaded areas) significantly reduced new contributors.

The lesson learned: information needs to be free. Legislation not well thought out may bring unintended consequences.


Friday, August 27th, 2010 -- By ET

I came across a senior professor in Statistics in the hallway. He noticed that I had moved my office.

I told him that the new office has a window, and my research productivity increased by 5 times.

He then said: “Wonderful, they should have installed 2 windows for you.”


Fixing CrossOver in Snow Leopard 10.6

Wednesday, August 18th, 2010 -- By ET

An update of the system broke my almost-perfect installation of Bakoma Tex in Mac OS X 10.6 through Crossover.

I suffered from this for a few months. Each time I needed to work with LaTeX, I had to boot into Windows 7 through Boot Camp. I tried resetting the Java Virtual Machine and so on, but nothing fixed the problem.

Then today I thought about the error message I got when I tried to open CrossOver. It says “can’t load ‘/system/library/perl/extras/5.10.0…”, so it struck me that maybe it was Perl that needed fixing.

I went to /usr/bin to list perl versions:
/usr/bin$ ls -l perl*
lrwxr-xr-x 1 root wheel 9 Aug 18 10:50 perl -> perl5.10.0
-rwxr-xr-x 1 root wheel 86000 Jun 24 2009 perl.old
-rwxr-xr-x 1 root wheel 51200 Jun 24 2009 perl5.10.0
-rwxr-xr-x 1 root wheel 34816 Jun 24 2009 perl5.8.9
-rw-rw-rw- 34 root wheel 807 Jun 24 2009 perlbug
-rwxr-xr-x 1 root wheel 38307 Jun 24 2009 perlbug5.10.0
-rwxr-xr-x 1 root wheel 45068 Jun 24 2009 perlbug5.8.9
-rw-rw-rw- 34 root wheel 807 Jun 24 2009 perlcc
-rwxr-xr-x 1 root wheel 17983 Jun 24 2009 perlcc5.8.9
-rw-rw-rw- 34 root wheel 807 Jun 24 2009 perldoc
-rwxr-xr-x 1 root wheel 255 Jun 24 2009 perldoc5.10.0
-rwxr-xr-x 1 root wheel 254 Jun 24 2009 perldoc5.8.9
-rw-rw-rw- 34 root wheel 807 Jun 24 2009 perlivp
-rwxr-xr-x 1 root wheel 12309 Jun 24 2009 perlivp5.10.0
-rwxr-xr-x 1 root wheel 12304 Jun 24 2009 perlivp5.8.9
-rw-rw-rw- 34 root wheel 807 Jun 24 2009 perlthanks
-rwxr-xr-x 1 root wheel 45068 Jun 24 2009 perlthanks5.8.9

Perl 5.10.0 is a 64-bit build, so I switched the default perl back to 5.8.9 with the following commands:

sudo rm perl
sudo ln -s perl5.8.9 perl

Then it worked like a charm.
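
The two commands above leave a brief moment in which no "perl" link exists. As a hedged alternative, the sketch below repoints a symlink in one step by creating a temporary link and renaming it over the old one. The helper names are hypothetical, and the demo works in a throwaway temporary directory with empty stand-in files rather than in /usr/bin.

```python
import os
import tempfile


def repoint_symlink(link_path, new_target):
    """Replace the symlink at link_path with one pointing to new_target."""
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_target, tmp_link)
    # Renaming over the old link swaps it in a single step on POSIX systems.
    os.replace(tmp_link, link_path)


def demo():
    """Mimic the perl -> perl5.8.9 switch inside a temporary directory."""
    with tempfile.TemporaryDirectory() as workdir:
        for name in ("perl5.10.0", "perl5.8.9"):
            open(os.path.join(workdir, name), "w").close()
        link = os.path.join(workdir, "perl")
        os.symlink("perl5.10.0", link)
        repoint_symlink(link, "perl5.8.9")
        return os.readlink(link)
```

Running it against a real system directory would need the same root privileges as the sudo commands above.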

My Experience with Kindle DX

Tuesday, July 6th, 2010 -- By ET

The Kindle DX dropped in price by about $100, I guess under pressure from the iPad and other ebook devices. This is good news for a lot of people who do not really want to read books on LCD/LED screens. I recently lost my Kindle on a trip, so it is time to think about the next device to replace it.

Wei Sun recently sent me an email asking for my opinion of the DX. I will put my thoughts below, as my review of the DX after using it for about 6 months.

(1) How many academic papers do you read on it, and on printed form?

I put quite a few papers on it, and it did reduce my printing. But journals like MgtSci, MktSci, and ISR have fonts that are too small (the Kindle’s rendering is slightly smaller than the printed journals), so in the end it is still not very comfortable to read them on the DX. Once you rotate it to landscape mode, the fonts are larger (larger than in the printed journals), but you need to do a lot of paging up and down. If a new version improves the speed of page flipping, it will be better; otherwise, it is a headache to read journal papers. AER and JPE are fine because they have large margins and the Kindle can crop them automatically.

(2) Do you find it helpful for academic research?

Yes. I mostly read ebooks on it. It is best for ebooks and I love it. For PDF versions of ebooks, I can use a program called calibre to convert them to an ebook format so that I can change the font on the device. The DX has 3GB of storage; I have put hundreds of books on it, and there’s still plenty of space.

(3) In your opinion, are there any better alternatives? How about Apple ipad?

The iPad is not based on E-ink technology, so reading on it is not much different from reading on a laptop screen. The biggest reason for me to buy a Kindle is that reading on it feels like reading from paper, and it is easy on the eyes. For reading papers, I would not use an iPad. An alternative to the Kindle is a device called QUE; you can google it. I learned about it after I bought my Kindle last December, but its release date was then pushed back, and the most recent news is that it would be released on June 25. I don’t think many people have evaluated it yet, and this half-year delay has surely done the product a lot of damage. For one, the iPad is out and many people simply get an iPad; second, the Kindle DX has dropped in price, so QUE’s price of $500 is too high now. A nice thing is that it has the largest E-ink screen on the market. The screen is about the size of letter-sized paper, so it will be exactly like reading a printed paper.

(4) Overall, do you recommend it at $400 as a research tool?

Bottom line: I would say that if you read a lot of books, as I do now (I did not have time to read books when I was a student), it will be really useful. Otherwise, hold off for a little while to see how QUE performs.

Computing Power Skirmish

Wednesday, January 6th, 2010 -- By ET

I have to run some heavy computations these days.  This gives me a chance to compare the computing power of my two computers.

On both machines, I have the same version of MySQL.  The program, written in Perl, conducts some complex calculations based on MySQL data.

Dell-Mac Skirmish

It is interesting to observe that my Macbook Pro actually performs better than my Dell Server.  The above figure shows 5 random sampling points, at which I count how many data records have been processed since the last sampling.  The Macbook consistently beats the Dell.

Here are the specs:

Dell Server:
Dual Quad-Core Xeon CPU 2.33GHz, 32GB Memory, 1T HDD, Windows 7 64 Bit
Macbook Pro:
Core 2 Duo Intel CPU 2.8GHz, 8GB Memory, 500G HDD, Snow Leopard 64 Bit

Due to the nature of the program, only 1 CPU core can be used, which may explain the Dell’s disadvantage: its 8 cores (4 times as many as the Macbook Pro) go unused, and each core runs at a lower clock speed.

A few more hours of data keep supporting the advantage of the Macbook:


Fully Utilizing My Computing Power

Tuesday, January 5th, 2010 -- By ET

Before tweaking MySQL: it used 5% of the CPU and 250MB of the memory.

After tweaking, it uses 94.3% of the CPU and 2.33GB of the memory.

I could increase the memory use even more if I needed it. :-D


Join Multiple PDF files to a Single One on Mac

Monday, December 28th, 2009 -- By ET

I downloaded an ebook that consists of hundreds of one-page PDF files.  I certainly don’t want to upload all of these one-pagers to my Kindle. There is no simple and easy way to concatenate these files.  Commercial software packages are available for $14.00 to €20.

A simple Perl hack does the job:

  1. Install the perl module:
    perl -MCPAN -e 'install("PDF::Reuse")'
  2. Create a perl program, call it “catpdf.pl”:

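    use strict;
    use PDF::Reuse;

    prFile("output.pdf");   # output name is an assumption
    for (@ARGV) {
        prDoc($_);          # append all pages of each input file
    }
    prEnd();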

  3. To concatenate, call it in two possible ways:
  • perl catpdf.pl a.pdf b.pdf, or
  • perl catpdf.pl *.pdf


As a side note, it is really easy to concatenate mp3 files on a Mac. Just do:
cat *.mp3>output.mp3

A Complicated SQL Query

Monday, August 17th, 2009 -- By ET

Just wrote the most complicated SQL query I’ve ever encountered.  Since the table “clicks” has more than 5 million records, I’m going to wait till tomorrow to get the result.

The reason I had to write this is that I was trying to use a Perl program to do the same thing, and so far, after two days, it is still running and making very slow progress.

create table sqlhistory as
SELECT a.inc, a.clicktime,
  (SELECT MAX(clicktime) FROM clicks b
   where a.keyid=b.keyid and a.adid=b.adid and b.clicktime<a.clicktime
  ) as bctime, a.keyid, a.adid, a.bid, a.price, a.rank, a.reserve_price, a.reserve_price_new
FROM clicks a
where a.bid<>(select c.bid from clicks c
  where a.keyid=c.keyid and a.adid=c.adid and c.clicktime=(SELECT MAX(clicktime) FROM clicks b
    where a.keyid=b.keyid and a.adid=b.adid and b.clicktime<a.clicktime
  )
  limit 1
)

Sun Wei suggested a new way to do it, and the productivity increased almost 1000-fold:


create table clickorder as
select * from clicks
order by keyid, adid, clicktime


Then create an auto_increment field called ‘inc’ and a new field called ‘lastbid’:


update clickorder a, clickorder b
set a.lastbid=b.bid
where b.inc=a.inc-1 and a.keyid=b.keyid and a.adid=b.adid
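
The ordered-copy approach above can be tried end to end on a toy table. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, so the syntax differs from the statements above: SQLite has no multi-table UPDATE of that form, so a correlated subquery is used instead. The function name, column types, and sample rows are assumptions for illustration only.

```python
import sqlite3


def previous_bids(rows):
    """rows: (keyid, adid, clicktime, bid) tuples. Returns rows with lastbid added."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE clicks (keyid INTEGER, adid INTEGER, clicktime INTEGER, bid REAL)"
    )
    con.executemany("INSERT INTO clicks VALUES (?, ?, ?, ?)", rows)

    # Ordered copy; 'inc' numbers the rows in (keyid, adid, clicktime) order.
    con.execute(
        """CREATE TABLE clickorder (
               inc INTEGER PRIMARY KEY AUTOINCREMENT,
               keyid INTEGER, adid INTEGER, clicktime INTEGER,
               bid REAL, lastbid REAL)"""
    )
    con.execute(
        """INSERT INTO clickorder (keyid, adid, clicktime, bid)
           SELECT keyid, adid, clicktime, bid FROM clicks
           ORDER BY keyid, adid, clicktime"""
    )

    # The previous click for the same (keyid, adid) is simply the row at inc - 1.
    con.execute(
        """UPDATE clickorder SET lastbid = (
               SELECT b.bid FROM clickorder AS b
               WHERE b.inc = clickorder.inc - 1
                 AND b.keyid = clickorder.keyid
                 AND b.adid = clickorder.adid)"""
    )
    result = con.execute(
        "SELECT keyid, adid, clicktime, bid, lastbid FROM clickorder ORDER BY inc"
    ).fetchall()
    con.close()
    return result
```

Rows whose bid differs from a non-null lastbid are the bid changes that the original nested query was looking for. The speedup presumably comes from replacing a MAX(clicktime) subquery per row with a lookup on the neighbouring inc value.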

The Easiest Way to Download Youtube Video

Monday, December 15th, 2008 -- By ET

Browse YouTube normally.  When you find a good video, simply add “kick” in front of “youtube.com” in the URL.

e.g. http://youtube.com/vid=SuS76sksa

can be turned to:



Copyright Xiaoquan (Michael) Zhang, 2004-2020. All rights reserved.
All trademarks property of their owners.