Google Kythe

Google Kythe is a source code indexer and cross-referencer for code comprehension. It describes itself as a "pluggable, (mostly) language-agnostic ecosystem for building tools that work with code".

The entire Google team working on Kythe was laid off in April 2024 as part of a company push to move certain roles overseas.

Overview
The core of Google Kythe is a set of language-agnostic protocols and data formats for representing, accessing, and querying information about source code as data. Kythe relies on an instrumented build system and compilers that emit indexing information, semantic information, and metadata in a Kythe-specified format. The information obtained from an instrumented build is stored in a language-agnostic graph structure, which can then be queried to answer questions about the code base.
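The idea of storing code facts in a language-agnostic graph and then querying it can be illustrated with a minimal sketch. The Node and CodeGraph classes, the node kinds ("function", "anchor"), the edge kinds ("defines", "ref"), and all identifiers and file paths below are hypothetical simplifications chosen for illustration. They are not Kythe's actual schema, storage format, or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A graph node describing one code entity (hypothetical, simplified)."""
    node_id: str   # unique identifier within the graph
    kind: str      # e.g. "function", or "anchor" for a span of source text
    language: str  # language of the file the node was extracted from
    path: str      # source file the node belongs to


class CodeGraph:
    """Toy in-memory graph; a real index would live in persistent storage."""

    def __init__(self):
        self.nodes = {}
        # (target id, edge kind) -> ids of the nodes pointing at the target
        self.incoming = defaultdict(set)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, source_id, edge_kind, target_id):
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.incoming[(target_id, edge_kind)].add(source_id)

    def sources(self, target_id, edge_kind):
        """Return the nodes that point at target_id via edge_kind."""
        ids = self.incoming.get((target_id, edge_kind), set())
        return sorted((self.nodes[i] for i in ids), key=lambda n: n.node_id)


def build_example():
    """Build a tiny graph as an indexer might emit it (made-up data)."""
    graph = CodeGraph()
    graph.add_node(Node("fn:parse", "function", "c++", "src/parser.cc"))
    graph.add_node(Node("anchor:parser.cc:10", "anchor", "c++", "src/parser.cc"))
    graph.add_node(Node("anchor:main.cc:42", "anchor", "c++", "src/main.cc"))
    graph.add_node(Node("anchor:bindings.py:7", "anchor", "python", "tools/bindings.py"))
    graph.add_edge("anchor:parser.cc:10", "defines", "fn:parse")
    graph.add_edge("anchor:main.cc:42", "ref", "fn:parse")
    graph.add_edge("anchor:bindings.py:7", "ref", "fn:parse")
    return graph


if __name__ == "__main__":
    graph = build_example()
    # "Where is parse referenced?" is answered the same way for every language.
    for anchor in graph.sources("fn:parse", "ref"):
        print(anchor.path, anchor.language)
```

Because the query only walks edges, it does not need to know which language produced a node. That is the property that lets one tool serve a mixed-language code base.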

Google Kythe is an open-source project developed by Google. It is licensed under the Apache License 2.0.

Grok
Google Kythe originates from an internal project called Grok.

Grok was proposed by Steve Yegge in 2008. Yegge observed that software projects routinely use more than three programming languages, yet development tools tend to be language-specific and handle multiple languages poorly. Adding support for a language to an IDE is hard, and the ad hoc analysis tools in IDEs tend to be inferior to real parsers and compilers.

Some parts of Grok were publicly released even before Google Kythe was announced. In 2010, Google released a Python static analyzer that had been developed as part of Grok.

In 2012, Grok supported C++, Java, Python, JavaScript and "2 internal languages". There was a browser client for querying the database and visually navigating the source code, as well as an Emacs client.

The Chromium Code Search Browser uses the Grok index to provide quick links to the definition of every symbol in the source code.

Grok

 * Notes from the Mystery Machine Bus, blog
 * Steve Yegge and Grok, blog
 * Stanford Seminar - Google's Steve Yegge on GROK, lecture
 * Project Grok - Steve Yegge - Emacs Conference 2013, talk
 * Steve Yegge on Scalable Programming Language Analysis, talk

Kythe

 * Kythe (Google Kythe Homepage)
 * Indexing Large, Mixed-Language Codebases, talk

Similar projects

 * Facebook pfff
 * srclib
 * Oracle Frappé
 * Microsoft Language Server Protocol, designed as part of Visual Studio Code, with implementations for several languages and integrations in several other development tools