Amit Sheth

Amit Sheth is a computer scientist at the University of South Carolina in Columbia, South Carolina. He is the founding Director of the Artificial Intelligence Institute and a Professor of Computer Science and Engineering. From 2007 to June 2019, he was the LexisNexis Ohio Eminent Scholar, director of the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), and a Professor of Computer Science at Wright State University. Sheth's work has been cited by over 48,800 publications. He has an h-index of 117, which places him among the 100 computer scientists with the highest h-index. Prior to founding the Kno.e.sis Center, he served as the director of the Large Scale Distributed Information Systems Lab at the University of Georgia in Athens, Georgia.

Education
Sheth received his bachelor of engineering degree in computer science from the Birla Institute of Technology and Science in 1981. He received his M.S. and Ph.D. in computer science from Ohio State University in 1983 and 1985, respectively.

Semantic interoperability/integration and semantic web
Sheth has investigated, demonstrated, and advocated for the comprehensive use of metadata. He explored syntactical, structural, and semantic metadata; recently, he has pioneered ontology-driven approaches to metadata extraction and semantic analytics. He was among the first researchers to use description logic-based ontologies for schema and information integration (a decade before W3C adopted a DL-based ontology representation standard), and he was the first to deliver a keynote about Semantic Web applications in search. His work on multi-ontology query processing includes the most cited paper on the topic (over 930 citations). In 1996, he introduced the concept of the Metadata Reference Link (MREF) for associating metadata with hypertext that links documents on the Web, and he described an RDF-based realization in 1998, before RDF was adopted as a W3C recommendation. Part of his recent work has focused on information extraction from text to generate semantic metadata in the form of RDF. In his work, semantic metadata extracted from biological text is made up of complex knowledge structures (complex entities and relationships) that reflect complex interactions in biomedical knowledge. Sheth proposed a realization of Vannevar Bush's MEMEX vision as the Relationship Web, based on the semantic metadata extracted from text. Sheth and his co-inventors were awarded the first known patent for commercial Semantic Web applications in browsing, searching, profiling, personalization, and advertising, which led to his founding of the first Semantic Search company, Taalee.

In 1992, he gave an influential keynote titled "So far (schematically) yet so near (semantically)", which attested to the need for domain-specific semantics, the use of ontological representation for richer semantic modeling/knowledge representation, and the use of context when looking for similarity between objects. His work on using ontologies for information processing encompassed approaches for searching for an ontology, automated reasoning for schema integration, semantic search and other applications, and semantic query processing. The latter involved query transformations using different ontologies for user queries and resources, as well as federated queries, together with measures and techniques for computing information loss when traversing taxonomic relationships.
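The idea of information loss when a query term is replaced along a taxonomic relationship can be illustrated with a small sketch. It assumes a toy taxonomy whose instance sets are known and expresses the loss as precision and recall after a term is replaced by its parent. The taxonomy, the instance data, and the function name `substitution_loss` are hypothetical and are not taken from Sheth's publications.

```python
# Toy taxonomy: child term -> parent term (hypothetical data for illustration).
PARENT = {"journal_article": "publication", "book": "publication"}

# Known instances of each term; the parent's set includes its children's.
EXTENSION = {
    "journal_article": {"a1", "a2", "a3"},
    "book": {"b1"},
    "publication": {"a1", "a2", "a3", "b1"},
}


def substitution_loss(term):
    """Return (precision, recall) when `term` is replaced by its parent."""
    wanted = EXTENSION[term]              # what the user asked for
    returned = EXTENSION[PARENT[term]]    # what the broader term retrieves
    hits = wanted & returned
    precision = len(hits) / len(returned)
    recall = len(hits) / len(wanted)
    return precision, recall


precision, recall = substitution_loss("journal_article")
print(precision, recall)  # 0.75 1.0
```

In this sketch, moving up to the broader term keeps every wanted answer (recall stays at 1.0) but admits unrelated instances, so precision drops.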

Workflow management and semantic web services
In the early 1990s, he initiated research in the formal modeling, scheduling, and correctness of workflows. His METEOR project demonstrated the value of research with real-world applications; its tools were used in graduate courses in several countries, and its technology was licensed to create a commercial product and was followed up by METEOR-S. He led the research (later joined by IBM) that resulted in the W3C submission of WSDL-S (Semantic Annotation of WSDL), the basis for SAWSDL, a W3C recommendation for adding semantics to WSDL and XML Schema.
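As an illustration of what semantic annotation of a schema looks like, the following sketch reads `sawsdl:modelReference` attributes from a small XML Schema fragment using Python's standard library. The schema fragment, the ontology URI under `example.org`, and the function name `model_references` are made up for this example; only the SAWSDL namespace and the `modelReference` attribute come from the W3C recommendation.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
SAWSDL_NS = "http://www.w3.org/ns/sawsdl"

# Hypothetical schema: one element is annotated with an ontology concept.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <xs:element name="OrderRequest"
              sawsdl:modelReference="http://example.org/purchase#OrderRequest"/>
  <xs:element name="Confirmation"/>
</xs:schema>
"""


def model_references(xsd_text):
    """Map each annotated element name to its list of concept URIs."""
    root = ET.fromstring(xsd_text)
    refs = {}
    for elem in root.iter(f"{{{XSD_NS}}}element"):
        value = elem.get(f"{{{SAWSDL_NS}}}modelReference")
        if value:
            # The attribute value is a whitespace-separated list of URIs.
            refs[elem.get("name")] = value.split()
    return refs


print(model_references(SCHEMA))
```

The annotation does not change the schema's structure; it only points an element at a concept defined elsewhere, which is why unannotated elements such as `Confirmation` are simply skipped here.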

For both SAWSDL and SA-REST, he provided leadership in the community-based process followed by the W3C. He coauthored a 1995 paper in the journal Distributed and Parallel Databases, which is one of the most cited papers in the workflow management literature, with more than 2,330 citations, and the most cited among over 430 papers published in that journal. His key technical areas of contribution in workflow management include adaptive workflow management, exception handling, authorization and access control, security, optimization, and quality of service.

Information integration, database interoperability/integration, and database federations
In the 1980s, large organizations wanted to couple multiple autonomous databases to accomplish certain tasks, but how this could be accomplished from a technical perspective was not understood. Starting in 1987, Sheth gave a number of tutorials at ICDE, VLDB, SIGMOD, and other major conferences in the area of distributed (federated) data management and developed scientific foundations and architectural principles to address these issues of database interoperability. He developed a clean reference architecture, covered in his most cited paper on federated databases. It provides an architecture consisting of a range of tightly (i.e., global as a view) to loosely coupled (i.e., local as a view) alternatives for dealing with three dimensions: distribution, heterogeneity, and autonomy. Later, he led the development of a schema integration tool in the USA.

Sheth analyzed the limitations resulting from the autonomy of the individual databases and worked towards deep integration by developing specification models for interdatabase dependencies, allowing for a limited degree of coupling to ensure global consistency for critical applications. Together with Dimitrios Georgakopoulos and Marek Rusinkiewicz, he developed the ticketing method for concurrency control of global transactions that need to see and preserve a consistent state across multiple databases. This work, which was recognized with a best paper award at the 1991 International Conference on Data Engineering, was awarded a patent and resulted in progress on multidatabase transactions by other researchers.
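The intuition behind a ticket-based approach can be shown with a simplified sketch; it is not the published algorithm. It assumes that every global transaction takes a ticket at each site it visits, so that ticket values reveal the relative order of global transactions at that site, and that a global schedule is accepted only if these per-site orders do not contradict each other. The class `Site` and the function `is_globally_serializable` are hypothetical names introduced for this illustration.

```python
from itertools import combinations


class Site:
    """A local database modeled only by its ticket counter (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.ticket = 0

    def take_ticket(self):
        # Reading and incrementing the ticket makes every pair of global
        # subtransactions at this site conflict directly with each other.
        self.ticket += 1
        return self.ticket


def is_globally_serializable(tickets):
    """tickets maps transaction -> {site name: ticket value}.

    Returns True when the ticket orders at all sites are compatible,
    i.e. the graph of "took its ticket earlier" edges has no cycle.
    """
    edges = {t: set() for t in tickets}
    for a, b in combinations(tickets, 2):
        for site in tickets[a].keys() & tickets[b].keys():
            if tickets[a][site] < tickets[b][site]:
                edges[a].add(b)
            else:
                edges[b].add(a)

    WHITE, GREY, BLACK = 0, 1, 2
    color = {t: WHITE for t in edges}

    def has_cycle(node):
        color[node] = GREY
        for nxt in edges[node]:
            if color[nxt] == GREY:
                return True
            if color[nxt] == WHITE and has_cycle(nxt):
                return True
        color[node] = BLACK
        return False

    return not any(color[t] == WHITE and has_cycle(t) for t in edges)


# T1 precedes T2 at site A but follows it at site B: the orders conflict.
conflicting = {"T1": {"A": 1, "B": 2}, "T2": {"A": 2, "B": 1}}
print(is_globally_serializable(conflicting))  # False
```

The point of the ticket is that the local databases stay autonomous: the global layer never inspects local schedules and only compares ticket values.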

His work continued from the integration and interoperability of networked enterprise databases to Web-based database access. He has also helped to characterize metadata and develop the techniques that extract and use metadata for integrated access to a variety of content, ranging from databases to multimedia/multimodal data.

Richer relationship identification on linked open data
Sheth has been a strong proponent of identifying a richer and broader set of relationships, such as meronomy and causality, on the Semantic Web. His idea of a "relationship web" is inspired by Vannevar Bush's vision of the Memex. Since the inception of linked data, he has emphasized the use of schema knowledge and the information present on the Web and in linked data for this purpose. These ideas led to a system called BLOOMS for the identification of schema-level relationships between datasets belonging to linked data. Another related system called PLATO allowed for the identification of partonomical relationships between entities on linked data.

Semantics and knowledge-empowered information extraction/NLP/ML, search, browsing, and analysis
In 1993, he initiated InfoHarness, a system that extracted metadata from diverse content (news, software code, and requirements documents) and supported faceted search through a Mosaic browser-based interface. This system was transitioned into a product by Bellcore in 1995 and was followed by a metadata-based search engine for a personal electronic program guide and Web-based videos for a cable set-top box. He licensed this technology, which he developed at the University of Georgia, to his company Taalee in the same year that Tim Berners-Lee coined the term Semantic Web. In the first keynote on the Semantic Web given anywhere, Sheth presented Taalee's commercial implementation of a semantic search engine, which is covered by the patent "System and method for creating a semantic web and its applications in browsing, searching, profiling, personalization and advertising".

This 1999–2001 incarnation of semantic search (as described in the patent document) started with extensive tooling to create an ontology/WorldModel (today's knowledge graph) to design a schema and then automatically extract information (through knowledge extraction agents) and incorporate knowledge from multiple high-quality sources to populate the ontology and keep it fresh. This involved machinery for disambiguation to identify what was new and what had changed.

Then the data extraction agents, which supported diverse content that was either pulled (crawled) or pushed (e.g., syndicated news in NewsML), called upon a nine-classifier committee (using Bayesian, HMM, and knowledge-based classifiers) to determine the domains of the content, identify the relevant subset of the ontology to use, and perform semantic annotation. "Semantic Enhancement Engine: A Modular Document Enhancement Platform for Semantic Applications over Heterogeneous Content" is one of the earliest publications demonstrating the unusual effectiveness of knowledge-based classifiers compared with more traditional ML techniques. The third component of the system utilized ontology and metadata (annotation) to support semantic search, browsing, profiling, personalization, and advertising.
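A classifier committee of the kind mentioned above can be sketched as follows. The text does not say how the committee combined its members' outputs, so this example assumes a plain majority vote, and it uses three toy keyword matchers in place of the Bayesian, HMM, and knowledge-based classifiers. All names and keyword lists are hypothetical.

```python
from collections import Counter


def keyword_classifier(domain, keywords):
    """Build a toy classifier that votes for `domain` on a keyword match."""
    def classify(text):
        words = set(text.lower().split())
        return domain if words & keywords else None
    return classify


# Stand-ins for the committee members (assumed, not from the source).
COMMITTEE = [
    keyword_classifier("sports", {"goal", "match", "league"}),
    keyword_classifier("sports", {"team", "coach"}),
    keyword_classifier("finance", {"stock", "shares", "market"}),
]


def committee_vote(text, committee=COMMITTEE):
    """Return the domain with the most votes, or None if nobody votes."""
    votes = Counter(
        vote for vote in (clf(text) for clf in committee) if vote is not None
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


print(committee_vote("The coach praised the team after the match"))  # sports
```

Once a domain is chosen this way, a system like the one described could restrict annotation to the matching part of the ontology rather than the whole of it.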

This system also supported a dynamically generated "Rich Media Reference" (a.k.a. Google's Infobox), which not only displayed metadata about the searched entity pulled from the ontology and metabase but also provided what was termed "blended semantic browsing and querying". He also led efforts in other forms and modalities of data, including social and sensor data. He coined the term "Semantic Sensor Web" and initiated and chaired the W3C effort on Semantic Sensor Networking that resulted in a de facto standard. He introduced the concept of semantic perception to reflect the process of converting massive amounts of IoT data into higher-level abstractions to support human cognition and perception in decision making, which involves an IntellegO ontology-enabled abductive and deductive reasoning framework for iterative hypothesis refinement and validation.

Real-time scalable social media analytics
In early 2009, he initiated and framed the issue of social media analysis in a broad set of semantic dimensions he called "Spatio-Temporal-Thematic" (STT). He emphasized the analysis of social data from the perspective of people, content, sentiment, and emotions. This idea led to a system called Twitris, which employs dynamically evolving semantic models produced by the Semantic Web project Doozer for this purpose. The Twitris system can identify people's emotions (such as joy, sadness, anger, and fear) from their social media posts by applying machine learning techniques to millions of self-labeled emotion tweets.

Entrepreneurship
Sheth founded Infocosm, Inc. in 1997, which licensed and commercialized the METEOR technology from the research he led at the University of Georgia, resulting in the distributed workflow management products WebWork and ORBWork. He founded Taalee, Inc. in 1999 by licensing the VideoAnywhere technology that resulted from the research he led at the University of Georgia. The first product from Taalee was a semantic search engine. Taalee became Voquette after a merger in 2002, and then Semagix in 2004. In 2016, Cognovi Labs was founded based on the Twitris technology resulting from the research he led at the Kno.e.sis Center at Wright State University. He also served as its chief innovator and serves on the board. The technology was successfully used to predict Brexit and the US 2016 presidential election.

Awards

 * IEEE Computer Society W. Wallace McDowell Award 2023, “For pioneering and enduring contributions to information integration, data and service semantics, and knowledge-enhanced computing.”
 * Selected 2020 ACM Fellow, “For contributions to data semantics and knowledge-enhanced computing.”
 * IEEE TCSVC Research Innovation Award, “in recognition of his pioneering and enduring research, applications and adoption of distributed workflow processes and semantics in services computing,” 2020.
 * Distinguished Alumni Award for Academic Excellence, College of Engineering, Ohio State University, 2019.
 * Elected AAAS Fellow (Class of 2018) for his pioneering and enduring contributions on information integration, distributed workflow, and semantics and knowledge-based big data analytics.
 * Elected AAAI Fellow (Class of 2018) for significant and enduring contribution to semantics and knowledge-based techniques to transform diverse data into insights and actions.
 * 2017 Ohio Faculty Council Technology Commercialization Award (runner-up)
 * Elected IEEE Fellow (Class of 2006) for contributions to information integration and workflow management.
 * Received the Trustees Award for Faculty Excellence, the highest award given by Wright State University.
 * IBM Faculty Award 2004.