Wikipedia is one of the biggest database-backed websites around, with over 700 GB of data.
Both data and schema are available:
http://download.wikimedia.org/ (data)
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/maintenance/tables.sql?view=markup (schema)
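For instance, a quick way to get a feel for the schema is to pull down tables.sql and list the tables it defines. Here is a minimal Python sketch; it assumes ViewVC's view=co variant of the URL above serves the raw SQL file, and that table names carry the usual comment placeholder for the configurable prefix:

    import re
    import urllib.request

    # Schema URL from above; ViewVC's "view=co" variant should return the
    # raw SQL file instead of the HTML-highlighted "view=markup" page
    # (an assumption about this ViewVC setup).
    SCHEMA_URL = ("http://svn.wikimedia.org/viewvc/mediawiki/trunk/"
                  "phase3/maintenance/tables.sql?view=co")

    def list_tables(url=SCHEMA_URL):
        """Fetch tables.sql and return the table names it defines."""
        sql = urllib.request.urlopen(url).read().decode("utf-8", "replace")
        # Table names may carry a comment placeholder for the configurable
        # prefix, e.g. "CREATE TABLE /*$wgDBprefix*/page (" -- skip it.
        return re.findall(r"CREATE TABLE\s+(?:/\*.*?\*/\s*)?(\w+)", sql)

    for name in list_tables():
        print(name)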
Moreover, the output of a profiler running on Wikipedia's MediaWiki installation is made available at:
http://noc.wikimedia.org/cgi-bin/report.py
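Since I'm assuming the report is served as plain text, a few lines of Python are enough to grab it and eyeball the hot spots (the exact format of the report is an assumption):

    import urllib.request

    # Profiler report URL from above.
    REPORT_URL = "http://noc.wikimedia.org/cgi-bin/report.py"

    report = urllib.request.urlopen(REPORT_URL).read().decode("utf-8", "replace")
    # Print just the first lines to get a first look at the profile.
    for line in report.splitlines()[:20]:
        print(line)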
This means we have the workload and queries of the actual Wikipedia. Over the last few months I have spent quite a lot of time working on this dataset. Soon I will post the results of my analysis, which has been accepted for publication at ICEIS 2008.