Posted at 11:50 AM on Saturday 14 June 2008
The BIND XML format is a very deeply nested one. Just open any of the XML files and you will understand what I mean. All the XML files can be obtained from here. Just get all the XML files split by division as well as all the daily updates, and you should have all of the data in BIND.

Because of the sheer size of each record and the number of records, parsing the data efficiently is going to be a bit of work. I tried the C implementation of ElementTree, which is supposed to be very fast, and used its iterparse function since I do not need the complete DOM. I also tried the native expat parser. Guess what: expat was quite a fair bit faster. I did not do any detailed benchmarking; I just wrote two short scripts, each parsing the XML file without doing anything with the data, and timed each script with the Unix time command.

So expat it was. I wrote the script and tested it against the BIND data; so far so good :-) The script is available for download. As you can see, this is the 3rd version and supposedly the best :-) Some of the code in there relies on my own MySQL wrapper, but you should be able to get the classes to dump their data in any format you want.
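For anyone who wants to repeat the comparison, here is a rough sketch of the kind of "parse and do nothing" test described above. It is not the original pair of scripts: it uses the modern standard-library modules `xml.etree.ElementTree` and `xml.parsers.expat`, times the parsers in-process instead of with the Unix time command, and runs on a small synthetic document built by a hypothetical `make_sample` helper instead of a real BIND file. The element names in the sample are made up and have nothing to do with the BIND schema.

```python
import io
import time
import xml.etree.ElementTree as ET
from xml.parsers import expat


def make_sample(n_records=1000):
    # Synthetic, deeply nested stand-in document (assumption: NOT the BIND schema).
    record = "<record><a><b><c><d>value</d></c></b></a></record>"
    return ("<records>" + record * n_records + "</records>").encode("utf-8")


def count_with_iterparse(data):
    # Stream the document with iterparse and count closed elements.
    # clear() drops each finished element's text and children.
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        count += 1
        elem.clear()
    return count


def count_with_expat(data):
    # Same count, using a bare expat end-element callback.
    count = 0

    def on_end(_name):
        nonlocal count
        count += 1

    parser = expat.ParserCreate()
    parser.EndElementHandler = on_end
    parser.Parse(data, True)  # True marks this as the final chunk
    return count


def timed(func, data):
    # Return the function result and the elapsed wall-clock seconds.
    start = time.perf_counter()
    result = func(data)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    sample = make_sample()
    for func in (count_with_iterparse, count_with_expat):
        n, seconds = timed(func, sample)
        print(f"{func.__name__}: {n} elements in {seconds:.4f}s")
```

Both functions should report the same element count, which is a cheap sanity check that the two parsers saw the same document. The timings will depend on the Python version and the input, so run it against a real file before drawing any conclusions.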