Monthly Archives: June 2008

My setup 20080614 update

A lot of the equipment has either been retired or has failed. The PSC1350 printer has been replaced with an HP Photosmart C6180, which is a networked AIO. The C6180 has fax capabilities, which have certainly been very useful. Being networked means that scanned documents can be sent to any Windows or Mac machine. Pretty neat.

The HP Color Laserjet 2550 has died on us. Pretty upset about it since it was really expensive when we bought it and it has only been 3 years. Seeing that some old Laserjet 4s are still alive and kicking, we would have thought that the 2550 would last a fair bit longer. What am I to do with those toners 🙁 We got a Brother HL-4050 to replace it and boy is it good to have a built-in duplexer. And the prints look pretty good. Driver installation was also a breeze compared to the HP ones. It has certainly changed my views on Brother for laser printers.

The home server has been completely replaced by a spanking new one running Ubuntu 8.04. The mainboard is an Abit IP35Pro, which is based on the Intel P35 chipset with ICH9R, so I get 6 SATA and 2 PATA connectors. The processor is now a Q6600 and it is paired with 2GB of RAM. Graphics is handled by a cheap Nvidia GeForce 7300 SE. The Coolermaster Elite 330 was a bit too small for the large number of drives. Hard disk temperatures were reaching over 50°C. Way too high. So I forked out the cash for what I believe to be the first of the many Coolermaster Stackers, the STC-T01. The case has 11 5.25″ drive bays at the front, which can be converted for use with hard disks using a 3-to-4 converter (3 bays hold 4 drives). The plus point of this is that the converter comes with a 120mm fan to help cool the hard disks. And with 11 bays, I can put 12 hard drives in 9 bays, leaving 2 bays for optical drives. Right now, it is stuffed with 7 hard drives and 1 optical drive. This has required the use of the ITE8212, which is working well under Ubuntu 8.04, to connect one of the PATA drives. I have one last SATA port on the mainboard and 3 more PATA connections left.

– 3 x 500GB Western Digital WD5000AAKS-65YGA0
– 1 x 200GB Maxtor 6L200P0
– 1 x 300GB Seagate ST3300622A
– 1 x 250GB Maxtor 6L250R0
– 1 x 120GB Maxtor 6Y120M0
– 1 x DVD Writer LG DVDRAM GH20NS10

There are also 2 external hard disks in 3.5″ casings attached to the server via Firewire in a chain configuration.

– 1 x 160GB Western Digital WD1600JB-00GVA0
– 1 x 160GB Maxtor 6Y160P0

The power supply is now an AcBel M8 Power 670, which is my first modular power supply, and it is certainly neat 🙂 It should provide enough power for the drives.

The Windows machine has also been completely changed. It currently has an E8400, 2GB of RAM, an MSI K7N Platinum (Nvidia 750i chipset) mainboard, an Nvidia 8800GT and an Antec 550W power supply. I use it mainly for gaming and it is good. It is currently housed in the old Coolermaster Elite 330 that the server was previously using.

My wife has gotten herself another tablet laptop, an HP TX1220. Together with her workplace laptop, she now has 3 laptops in all. They are cluttering up her table big time. Speaking of tables, we finally got a table from Ikea to be her work space.


The BIND XML format is a very deeply nested one. Just open any of the XML files and you will understand what I mean.

All the XML files can be obtained from here. Just get all the XML files divided by division as well as all the daily updates and you should have all of the data in BIND.

Due to the sheer size of each record and the number of records, parsing the data efficiently is going to be a bit of work. I tried the C implementation of ElementTree, which is supposed to be very fast. I used its iterparse function since I do not need the complete DOM. I also tried the native expat parser. Guess what, expat was quite a fair bit faster. I did not do any detailed benchmarking, just wrote 2 short scripts, each parsing the XML file without doing anything with the data. I timed each script using the unix time command.
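The comparison above can be sketched roughly as follows. This is not the original pair of scripts, just a minimal reconstruction under some assumptions: it uses the `xml.etree.ElementTree` module from current Python (which picks up the C accelerator that used to be the separate cElementTree module), it times both parsers in-process instead of with the unix time command, and the function names and the synthetic sample data are made up for illustration.

```python
import io
import time
import xml.etree.ElementTree as ET
from xml.parsers import expat


def parse_with_iterparse(stream):
    """Walk the document with iterparse, counting elements as they close."""
    count = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        count += 1
        elem.clear()  # drop the element's text and children to limit memory use
    return count


def parse_with_expat(stream):
    """Walk the document with the raw expat parser, counting end tags."""
    count = 0

    def on_end(_name):
        nonlocal count
        count += 1

    parser = expat.ParserCreate()
    parser.EndElementHandler = on_end
    parser.ParseFile(stream)
    return count


def timed(func, data):
    """Run one parser over an in-memory copy of the data and time it."""
    start = time.perf_counter()
    result = func(io.BytesIO(data))
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Synthetic stand-in for a BIND file (hypothetical structure);
    # for a real run, read the bytes of an actual XML file instead.
    sample = b"<root>" + b"<rec><id>1</id><name>x</name></rec>" * 50000 + b"</root>"
    for func in (parse_with_iterparse, parse_with_expat):
        n, seconds = timed(func, sample)
        print(f"{func.__name__}: {n} elements in {seconds:.3f}s")
```

Both functions do nothing except count closing tags, which mirrors the "parse without doing anything" test. The timings will depend on the Python version and on the file, so treat any difference as indicative only.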

So expat it was. I wrote the script and tested it against the BIND data, so far so good 🙂 The script is available for download. As you can see, this is the 3rd version and supposedly the best 🙂 There is a bit of code in there that relies on my own MySQL wrapper, but one should be able to get the classes to dump their data in any format that they want.

Bioinformatics resources

NCBI Gene resource

This is a resource that I use heavily. It is very useful as it allows one to operate on the gene as a unit instead of on multiple individual DNA sequences (some of which are incomplete). The following are some useful materials on NCBI Gene.

  • NCBI Handbook – this is a link to the relevant section of the NCBI Handbook.
  • Gene Help – link to the help page for NCBI Gene. Very helpful resource especially for programmers.
  • README – local copy of the Gene readme file on the NCBI ftp site. Describes the various files available for download at the FTP site.

Here are some searching tips (see Gene Help for more information):

  • To search for a gene using an accession, use the search term accession[accn].
  • To limit by organism, use organism[orgn].
  • To limit to the symbol, use symbol[sym].
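For programmers, the tips above can be combined into one search term. Here is a minimal sketch, assuming that clauses are joined with AND (the usual Entrez syntax) and that the search goes through the NCBI E-utilities ESearch endpoint; neither is mentioned above, and the function names and the TP53 example are only illustrative. The code only builds the URL and does not send any request.

```python
from urllib.parse import urlencode

# Field tags taken from the tips above; the keyword names are this sketch's own.
FIELD_TAGS = {"accession": "accn", "organism": "orgn", "symbol": "sym"}


def build_gene_term(**filters):
    """Join field-limited clauses such as 'TP53[sym]' with AND."""
    clauses = []
    for key, value in filters.items():
        if key not in FIELD_TAGS:
            raise ValueError(f"unknown field: {key}")
        clauses.append(f"{value}[{FIELD_TAGS[key]}]")
    return " AND ".join(clauses)


def esearch_url(term):
    """Build an ESearch URL for the Gene database (assumed endpoint, no request sent)."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    return base + "?" + urlencode({"db": "gene", "term": term})


term = build_gene_term(symbol="TP53", organism="human")
print(term)
print(esearch_url(term))
```

The same term string can also be pasted directly into the search box on the NCBI Gene page.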

Standardized human gene names

Standardized human gene names at HGNC (HUGO Gene Nomenclature Committee) can be found here.


GFF is a frequently used file format for genome annotations. See here for the specifications. It is basically a tab-delimited file with specific fields. Very simple to use.
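To show how simple it is, here is a minimal sketch of a line parser. It assumes the common nine-column layout (seqname, source, feature, start, end, score, strand, frame, attributes) and leaves the attributes column as a raw string, since its syntax differs between GFF versions. The record type, the function name and the sample line are made up for illustration.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    frame: str
    attributes: str


def parse_gff_line(line):
    """Return a GffRecord, or None for blank lines and '#' comment lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, attributes = fields
    return GffRecord(
        seqname, source, feature,
        int(start), int(end),
        None if score == "." else float(score),  # "." means no score
        strand, frame, attributes,
    )


# Made-up example line, not taken from any real annotation file.
sample = "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene0001"
record = parse_gff_line(sample)
print(record.feature, record.start, record.end)
```

Files that omit the optional last column would need the field count check relaxed.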

Strange gene names

This is not a resource, just something interesting. Have a look here for some interesting gene names.


Click here for a page on a BIND XML parser that I wrote.

PubMed search fields

Click here for the fields searchable in PubMed.

Enhancer Element Locator (EEL)

This is a good program for locating transcription factor binding sites.

JASPAR Transcription Factor database

This is a good resource for transcription factor motifs, and it can be used by EEL.

Leaking coke can

See the following picture of a coke can that was leaking. One fine day, I opened the cupboard and found that there was a pool of brownish liquid. Upon further checking, I found that one of the coke cans was leaking. The hole was very small resulting in a fine stream of liquid that was very hard to see. How the hole got there is really interesting. The hole is definitely very small and the can is made of metal. Really weird.


Interesting gene names

In my job as an HTP at Blueprint, I deal with a lot of genes. During data collection, I would be collecting information from several data sources, and the data would be scrolling on my terminal (which tells me that things are still moving). Sometimes, some really interesting gene names or gene descriptions would scroll by. Here are some of them:

  • Heartbroken – The researcher who named this probably just broke up.
  • Brother of odd with entrails limited – I do not know what to say.
  • And there is always sevenless, which has a whole family including boss (bride of sevenless, how aptly named, being the boss of the husband), sos (son of sevenless, in need of help) and dos (daughter of sevenless, she was there before Windows).

Talking about interesting stuff, here is a page on a leaking coke can and another page on some drooping bananas.

Baby Esther

On 10 March 2006 at 1305, baby Esther was born. Measuring 50cm and weighing in at 3.4kg, she is a chubby one 🙂


Her name is a play on Ernest's name and mine.


Ernest's name is almost the same as mine except for a single insertion and one mismatch, while Esther's name is a perfect subsequence match with Ernest's, with the unmatched portion being HER, meaning a girl 🙂