This post details how I developed a Call Detail Record (CDR) generator for my final-year project on telecoms fraud. The generator creates realistic CDRs using a simplified data model, focusing on caller, callee, call type, start time, and duration. Each user model (high, low, business, etc.) is configured, per call type (local, national, etc.), with an average call cost, a standard deviation, and a call frequency, along with likely call times. Random numbers are then drawn within these parameters to produce a diverse set of CDRs that reflect the modeled behavior.
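As a rough illustration of the approach, here is a minimal Python sketch. It is not the project's actual code: the class names, the `generate_cdrs` function, the per-minute tariff used to turn a call cost into a duration, the phone numbers, and every number in the user models are assumptions made up for the example. It also assumes call costs are normally distributed, which the description above does not state.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CallTypeProfile:
    mean_cost: float      # average call cost (currency units are assumed)
    std_dev: float        # standard deviation of the call cost
    calls_per_day: float  # call frequency
    likely_hours: tuple   # hours of the day in which calls usually start


@dataclass
class CDR:
    caller: str
    callee: str
    call_type: str
    start_time: datetime
    duration_s: int


# Hypothetical tariff (cost per minute), used only to convert cost to duration.
RATE_PER_MINUTE = {"local": 0.05, "national": 0.10}

# Hypothetical user models; all values are invented for illustration.
USER_MODELS = {
    "low": {
        "local": CallTypeProfile(0.20, 0.10, 2, (18, 19, 20)),
        "national": CallTypeProfile(0.50, 0.25, 0.5, (19, 20)),
    },
    "business": {
        "local": CallTypeProfile(0.40, 0.20, 10, tuple(range(9, 17))),
        "national": CallTypeProfile(1.00, 0.50, 5, tuple(range(9, 17))),
    },
}


def generate_cdrs(caller, model_name, callees, day, rng):
    """Generate one day of CDRs for a single caller, sorted by start time."""
    cdrs = []
    for call_type, profile in USER_MODELS[model_name].items():
        # Randomise the number of calls around the configured frequency.
        spread = profile.calls_per_day ** 0.5
        n_calls = max(0, round(rng.gauss(profile.calls_per_day, spread)))
        for _ in range(n_calls):
            # Draw a cost from a normal distribution, floored at a small
            # positive value so the duration is never zero or negative.
            cost = max(0.01, rng.gauss(profile.mean_cost, profile.std_dev))
            minutes = cost / RATE_PER_MINUTE[call_type]
            duration_s = max(1, round(minutes * 60))
            # Pick a start time inside one of the likely hours.
            start = day + timedelta(
                hours=rng.choice(profile.likely_hours),
                minutes=rng.randrange(60),
                seconds=rng.randrange(60),
            )
            callee = rng.choice(callees)
            cdrs.append(CDR(caller, callee, call_type, start, duration_s))
    return sorted(cdrs, key=lambda c: c.start_time)


# Example usage with a seeded generator so runs are repeatable.
rng = random.Random(42)
sample = generate_cdrs(
    "0851234567", "business", ["0867654321", "0879999999"],
    datetime(2005, 10, 1), rng,
)
for cdr in sample[:3]:
    print(cdr)
```

Passing in a seeded `random.Random` instance keeps the output reproducible, which is handy when the generated CDRs feed a fraud-detection experiment that needs to be rerun on the same data.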
David Sifry of Technorati reports impressive blogosphere growth: the number of blogs is doubling every 5 months and reached 19.6 million by October 2005. Around 70,000 new blogs are created daily, though spam blogs account for 2-8% of these. Sifry's data also shows a staggering posting rate, with 700,000-1.3 million new posts daily. Sifry's contribution is valuable, but I'd like to see more discussion of Technorati's API strategy and how they plan to leverage it for future development. Specifically, I'm interested in how Technorati is engaging with the community and incorporating user feedback, especially the feature requests on their Wiki.