User talk:Sadhanareddy.p

'''CLASSIFICATION SOCIETY AUTOMATED SEARCH SERVICE: Nov 3rd Meeting Minutes, 2 P.M. to 3:30 P.M., Perkin Bldg, Harvard'''

Members Present: Sadhana Palugulla, Dr. Michael Kurtz, Shuai shun

Main Agenda: To meet the client and discuss the requirements of the project.

Proceedings: The meeting was called to gather the requirements for the CLASS project. We have had two meetings with the client, and I believe we have gathered enough details to start. The scope of some of the features the client is asking for seems very large, so we need your help to finalize the requirements.

The proceedings of the discussion are as follows:

•	There are 70,000 metadata records available, and around 10,000 records will be added each year, collected on a weekly basis. The metadata will be sent to the programmer's email each week.

•	The goal of the project is to automate the process of collecting the metadata, parse the metadata, find relevant data related to each record, and index the collected data.

•	Our project is going to have four major modules:

1. Collecting metadata automatically from the mail or FTP server. The metadata must be parsed according to the given criteria, e.g. scraping the author name or subject title. Any errors while parsing the metadata must be reported to the user for further action.

2. For each term in the parsed metadata, we must find relevant data on the Internet; there is no particular location or format for the data. One of the criteria for searching for relevant data:

•	Search the CrossRef site using the document title, author, etc. to get the DOI; using the DOI, our search service must find the data from any data source.

•	If the DOI doesn’t exist, the data must be fetched from the Internet; the quality of the data is the main goal of the project.

3. From the relevant data, the text content must be scraped and passed to Lucene for indexing. Data must be indexed for all the fields available in the metadata.

4. A UI with single search and advanced search options must be designed to show the search results.

5. This search engine will be hosted at the University of Illinois.
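As a starting point for the DOI lookup in module 2, here is a minimal sketch of how we might build the CrossRef query and decide when to fall back to a general Internet search. It assumes the public Crossref REST API endpoint `https://api.crossref.org/works` with its `query.bibliographic`, `query.author` and `rows` parameters, and the `https://doi.org/` resolver. The class and method names are our own placeholders, the sample title, author and DOI are illustrative values only, and the actual HTTP request and response parsing are left out.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

class DoiLookupSketch {
    // Crossref REST API endpoint for searching works (assumed data source for DOIs).
    static final String CROSSREF_WORKS = "https://api.crossref.org/works";

    /** Builds a Crossref search URL from the title and author of one metadata record. */
    static String crossrefQueryUrl(String title, String author) {
        String t = URLEncoder.encode(title, StandardCharsets.UTF_8);
        String a = URLEncoder.encode(author, StandardCharsets.UTF_8);
        // rows=1 asks only for the best match; a real service may want several candidates.
        return CROSSREF_WORKS + "?query.bibliographic=" + t + "&query.author=" + a + "&rows=1";
    }

    /**
     * Returns the resolver URL for a DOI, or empty when no DOI was found.
     * An empty result is the signal to fall back to a general Internet search.
     * Simplification: the DOI is appended as-is, without percent-encoding.
     */
    static Optional<String> resolverUrl(String doi) {
        if (doi == null || doi.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of("https://doi.org/" + doi.trim());
    }

    public static void main(String[] args) {
        // Illustrative inputs only, not taken from the client's metadata.
        System.out.println(crossrefQueryUrl("Cluster analysis", "J. Smith"));
        System.out.println(resolverUrl("10.1000/182").orElse("no DOI: fall back to web search"));
        System.out.println(resolverUrl(null).orElse("no DOI: fall back to web search"));
    }
}
```

Keeping the "no DOI" case as an empty `Optional` makes the fallback path explicit, which may help when we later measure how many records needed the general search.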

•	We have decided to implement the entire project in Java, using Lucene for search indexing. However, finding relevant data on the Internet without any fixed location or format seems very challenging. We would really appreciate it if you could guide us on any design criteria we should follow.

•	Michael has given us sample metadata for reference and explained the metadata format.
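The metadata format itself is not recorded in these minutes, so the following is only a sketch of the module 1 requirement that parsing errors be reported to the user rather than stopping the run. It assumes a hypothetical record layout of one `field: value` pair per line and a hypothetical list of required fields (`title`, `author`); both would need to be replaced with the real format from the sample metadata.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class MetadataParseSketch {
    // Hypothetical required fields; the real list depends on the client's format.
    static final List<String> REQUIRED = List.of("title", "author");

    /** Result of parsing one record: the fields found plus any problems to report. */
    static final class ParseResult {
        final Map<String, String> fields = new LinkedHashMap<>();
        final List<String> errors = new ArrayList<>();
    }

    /** Parses one record in the assumed "field: value" layout, collecting errors instead of throwing. */
    static ParseResult parse(String record) {
        ParseResult result = new ParseResult();
        String[] lines = record.split("\\R"); // split on any line break
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i].trim();
            if (line.isEmpty()) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon <= 0) {
                // Record the problem with its line number so the user can act on it.
                result.errors.add("line " + (i + 1) + ": expected 'field: value' but got '" + line + "'");
                continue;
            }
            String key = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            result.fields.put(key, value);
        }
        for (String name : REQUIRED) {
            String value = result.fields.get(name);
            if (value == null || value.isEmpty()) {
                result.errors.add("missing required field: " + name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative record only, not the client's sample metadata.
        ParseResult r = parse("Title: Cluster analysis\nno separator here\n");
        System.out.println(r.fields);
        r.errors.forEach(System.out::println);
    }
}
```

Collecting the errors per record means one bad line in a weekly batch does not block the remaining records, and the list can be mailed or displayed to the user for follow-up.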

•	We haven’t decided how to display the search results; this requires further discussion.