As the saying goes, the speaker meant nothing by it, but the listener took note.
The premise: a netizen discovered that the member list of a medium-sized e-commerce website's BBS was publicly accessible.
What happened: the netizen held this forum up as a cautionary example of neglecting user privacy and wrote about it in a soft article (advertorial).
The coincidence: most of the accounts on this forum were registered with an email address.
The sad part: the BBS administrators took about a month to fix this vulnerability.
Yunfei's technical skill is not very high, so Xunlei (Thunder) was used to download 10 thousand pages containing email addresses. Because the pages all used a uniform template, Yunfei mindlessly ran Dreamweaver's search-and-replace function over and over to delete the extra HTML tags.
It was at this point that I came into contact with regular expressions and began to admire how powerful they are. As a tool, regular expressions make data processing and data statistics very easy.
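The search-and-replace step can be skipped entirely, because a regular expression can pull addresses straight out of the raw markup. The following is only a minimal sketch, not the method actually used here: the pattern is deliberately simplified (it is not a full address validator), and `extract_emails` and `sample_page` are names made up for illustration.

```python
import re

# Simplified pattern, assumed for illustration: good enough for typical
# addresses, but not a complete validator for every legal address form.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_emails(html: str) -> list:
    """Return unique, lower-cased addresses in order of first appearance."""
    seen = set()
    result = []
    for match in EMAIL_RE.findall(html):
        addr = match.lower()
        if addr not in seen:
            seen.add(addr)
            result.append(addr)
    return result

# Hypothetical page fragment; the surrounding tags never need to be removed,
# because the pattern simply does not match them.
sample_page = (
    '<tr><td class="user">Alice</td><td>alice1985@example.com</td></tr>\n'
    '<tr><td>Bob</td><td>BOB@example.org</td></tr>\n'
    '<tr><td>alice1985@example.com</td></tr>'
)
emails = extract_emails(sample_page)
print(emails)
```

Lower-casing before deduplication treats `BOB@example.org` and `bob@example.org` as the same mailbox, which is usually what a count of distinct users needs.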
While running a statistical analysis of the nearly 160 thousand email addresses, I found some interesting things.
"*@163.com" was matched 44 thousand times; "*@126.com" 20 thousand times;
"*@sina.com" 10 thousand times; "*@sohu.com" 4 thousand times;
"*@qq.com" 39 thousand times; "*@yahoo.com" 12 thousand times.
The data above show that QQ Mail is a force that cannot be ignored and comes close to shaking the dominance of NetEase's mailboxes (163.com and 126.com); that among the four major portals, Sohu's mailbox is not particularly well loved; and that Yahoo Mail is at the "a starved camel is still bigger than a horse" stage: although Yahoo China has changed owners several times, that does not stop us from using the international YAHOO mailboxes we registered earlier.
The data also yield 9 thousand MSN accounts (Hotmail mailboxes), 4 thousand Gtalk accounts (Gmail mailboxes), 1 thousand mobile phone numbers (139 mailboxes), and, of course, 40 thousand QQ numbers (QQ mailboxes).
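Per-domain tallies like the ones above do not strictly need a regular expression at all. As a rough sketch, assuming the addresses are already in a Python list (the `count_domains` helper and the sample data are invented for illustration), splitting on the last "@" and counting is enough:

```python
from collections import Counter

def count_domains(addresses):
    """Count addresses per domain (the part after the last '@'), ignoring case."""
    domains = (
        addr.rsplit("@", 1)[1].lower()
        for addr in addresses
        if "@" in addr
    )
    return Counter(domains)

# Made-up sample addresses, not taken from the real data set.
sample = ["a@163.com", "b@qq.com", "c@163.com", "d@QQ.com", "e@sina.com", "f@163.com"]
domain_counts = count_domains(sample)

# Print the domains from most to least common, in the "*@domain" style used above.
for domain, n in domain_counts.most_common():
    print(f"*@{domain}: {n}")
```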
If the patterns '*196', '*197', '*198' and '*199' are used to match users born in the 1960s, 70s, 80s and 90s respectively, the result is about 200 users born in the 60s, 900 in the 70s, close to 5000 in the 80s, and 700 in the 90s.
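The decade matching can be sketched in a few lines. This is an approximation under stated assumptions: it looks for "196" through "199" anywhere in the part before the "@", so it will also count digits that merely happen to look like a year (part of a phone number, for instance). The `count_decades` helper and the sample addresses are hypothetical.

```python
import re
from collections import Counter

# "19" followed by 6, 7, 8 or 9; the captured digit identifies the decade.
DECADE_RE = re.compile(r"19([6-9])")

def count_decades(addresses):
    """Tally addresses whose local part contains 196x, 197x, 198x or 199x."""
    counts = Counter()
    for addr in addresses:
        local = addr.split("@", 1)[0]
        m = DECADE_RE.search(local)  # only the first hit per address is counted
        if m:
            counts[m.group(1) + "0s"] += 1
    return counts

# Made-up sample addresses for illustration only.
sample = [
    "wang1985@163.com",
    "li_1978@qq.com",
    "zhao1992x@sina.com",
    "chen1986@126.com",
    "nodigits@sohu.com",
]
decade_counts = count_decades(sample)
print(dict(decade_counts))
```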
In addition, about 2000 of the email addresses contain a full date of birth. A mailbox name that contains a full birthday suggests that the user is a computer novice, so the mailbox password may also be very simple: for example, a home phone number or a license plate number.
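One way to detect a full date of birth is sketched below. It assumes the date is written as eight digits in YYYYMMDD form with a 19xx year, which is only a guess at the common convention; other layouts would be missed. The `find_birth_date` helper is invented for this example, and `datetime.strptime` is used to reject impossible dates such as February 31.

```python
import re
from datetime import datetime

# Lookahead with a capture group so overlapping candidates are all found.
# Assumed layout: 19YY, month 01-12, day 01-31.
DATE_RE = re.compile(r"(?=(19\d{2}(?:0[1-9]|1[0-2])(?:0[1-9]|[12]\d|3[01])))")

def find_birth_date(address):
    """Return the first valid YYYYMMDD date in the local part, or None."""
    local = address.split("@", 1)[0]
    for candidate in DATE_RE.findall(local):
        try:
            return datetime.strptime(candidate, "%Y%m%d").date()
        except ValueError:
            # Digits had the right shape but are not a real calendar date.
            continue
    return None

# Made-up sample addresses for illustration only.
print(find_birth_date("zhang19850314@163.com"))
print(find_birth_date("wang1985@163.com"))
```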