User:Niyogi

=magazine= Next steps:
 * downloaded 615 (385 .com) raw content (bz2 format)
 * build feature lists using new wikipedia lexicon

=category= Next steps:
 * have amazon and shopping for lexicon.txt
 * need ebay; figure out soap/php interface to ebay and get
 * rebuild cat maps

=dmoz= Next steps:
 * have 120K/174K front pages; 1link.csv has "key features" now
 * build corpus of key features for each category in 1link.csv

=ontok/ExtractAttributesfromText= Next steps:
 * prototyped code, seen it work for "thinkpad laptops"
 * test out search_by_product/brand on "600x ipod nano" etc.
 * write search_by_model code

=ontok/ExtractLocations=
 * use new city/state features to detect city/state combos quickly on "contact us" pages
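
A minimal sketch of the city/state lookup above, assuming the city/state features are available as a hash keyed by state code. The function name `find_city_state` and the `$citiesByState` table layout are hypothetical, not the existing code. A regex proposes "City, ST" candidates and hash lookups confirm them, which keeps the check fast on long pages.

```php
<?php
// Hypothetical table layout: state code => set of lowercase city names.
// Returns a list of [city, state] pairs found in $text.
function find_city_state(string $text, array $citiesByState): array
{
    $found = [];
    // Candidate: one or more capitalized words, optional comma, two-letter code.
    $pattern = '/\b([A-Z][a-z]+(?: [A-Z][a-z]+)*),? ([A-Z]{2})\b/';
    if (!preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        return $found;
    }
    foreach ($matches as $match) {
        $state = $match[2];
        $words = explode(' ', $match[1]);
        // The regex may swallow leading words ("Visit Palo Alto"), so try
        // the longest trailing word sequence first, then shorter ones.
        for ($i = 0; $i < count($words); $i++) {
            $city = implode(' ', array_slice($words, $i));
            if (isset($citiesByState[$state][strtolower($city)])) {
                $found[] = [$city, $state];
                break;
            }
        }
    }
    return $found;
}
```

Example call: `find_city_state('1 Main St, Palo Alto, CA 94301', ['CA' => ['palo alto' => true]])`.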

=ontok/wikipedia/products= Next steps:
 foreach ($titlearr as $title) {
     // expand the associations on:
     //   productbrand:   any product-brand combo appearing
     //   brandmodel:     anything that looks like a model (alphanumeric or 00 or short)
     //   productfeature: any product-feature combo appearing
     //   productunit:    any product-unit mapping
 }
 foreach ($brandarr as $brand) {
     // determine product associations
 }
 foreach ($brandmodel as $brand => $modelarr) {
     foreach ($modelarr as $model => $n) {
         // determine product associations
     }
 }
 * how to determine product associations:
   * read in the productbrand table
   * read yhoo search response, google suggest response
   * detect "ma" features from output
   * for brand links, check the productbrand table
   * for brand-model links, check the productbrand table
 * have wikipedia and product lexicon merged
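
A rough sketch of the brandmodel step, under one possible reading of "alphanumeric or 00 or short": a token counts as a model if it mixes letters and digits, contains a double zero, or is at most three characters long. The helper names `looks_like_model` and `collect_brand_models`, the three-character cutoff, and the "token right after a brand" rule are assumptions for illustration only.

```php
<?php
// Hypothetical helper: guesses whether a title token looks like a model name.
function looks_like_model(string $token, int $shortLen = 3): bool
{
    $t = strtolower(trim($token));
    if ($t === '') {
        return false;
    }
    // "alphanumeric": at least one letter and one digit, e.g. "600x"
    $hasLetter = preg_match('/[a-z]/', $t) === 1;
    $hasDigit = preg_match('/[0-9]/', $t) === 1;
    if ($hasLetter && $hasDigit) {
        return true;
    }
    // "00": contains a double zero, e.g. "3000"
    if (strpos($t, '00') !== false) {
        return true;
    }
    // "short": at most $shortLen characters (assumed cutoff)
    return strlen($t) <= $shortLen;
}

// Builds $brandmodel[$brand][$model] => count from the token following a brand.
function collect_brand_models(array $titlearr, array $brandarr): array
{
    $brands = array_flip(array_map('strtolower', $brandarr));
    $brandmodel = [];
    foreach ($titlearr as $title) {
        $tokens = preg_split('/\s+/', strtolower(trim($title)), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($tokens as $i => $tok) {
            if (!isset($brands[$tok]) || !isset($tokens[$i + 1])) {
                continue;
            }
            $next = $tokens[$i + 1];
            if (looks_like_model($next)) {
                $brandmodel[$tok][$next] = ($brandmodel[$tok][$next] ?? 0) + 1;
            }
        }
    }
    return $brandmodel;
}
```

The "short" rule is deliberately loose and will also accept short stopwords; checking candidates against the productbrand table would be one way to filter those out.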