Wikipedia talk:Size of Wikipedia

Embeddings Size
It would be nice to get a summary of vector embeddings with all-MiniLM-L6-v2, as well as a discussion of the potential tradeoffs regarding partial decompression and compression algorithms. Wesxdz (talk) 02:33, 4 October 2023 (UTC)


 * The embeddings dataset linked below is about 120 GB, roughly the same size as the Wikipedia text currently.
 * https://huggingface.co/datasets/Cohere/wikipedia-22-12-en-embeddings?ref=txt.cohere.com Wesxdz (talk) 23:34, 16 October 2023 (UTC)
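For anyone who wants to sanity-check figures like this, here is a rough back-of-the-envelope sketch. It assumes the raw size is simply number of vectors × dimensions × bytes per value, and it ignores index and file-format overhead. The passage count is a hypothetical placeholder, not a measured number; 384 is the output dimension of all-MiniLM-L6-v2. The helper name `embedding_size_bytes` is made up for illustration.

```python
def embedding_size_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw size of a dense embedding matrix, ignoring index and container overhead."""
    return num_vectors * dim * bytes_per_value


# Hypothetical passage count, chosen only for illustration (an assumption).
NUM_PASSAGES = 35_000_000
# Output dimension of all-MiniLM-L6-v2.
MINILM_DIM = 384

# Compare storage widths: full precision, half precision, 8-bit quantization.
for label, width in [("float32", 4), ("float16", 2), ("int8", 1)]:
    size_gb = embedding_size_bytes(NUM_PASSAGES, MINILM_DIM, width) / 1e9
    print(f"{label}: {size_gb:.1f} GB")
```

The narrower widths show the basic storage tradeoff: quantizing to fewer bytes per value shrinks the matrix proportionally, usually at some cost in retrieval accuracy.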