Knowledgator

About Us

Knowledgator is an open-source machine learning research company founded in 2021.

Our team brings together former computational biologists, AI researchers, and software engineers. We've experienced firsthand how invaluable high-quality data is for scientific progress, and how difficult and time-consuming it can be to gather, curate, and interpret it.

Our original mission was to build the most comprehensive, high-quality knowledge base in biology and medicine. While working toward that goal, we realized that the technology required for reliable information extraction simply didn't exist yet. So we shifted our focus to developing machine learning systems from first principles: systems that are efficient, accurate, interpretable, and capable of zero-shot knowledge transfer.

Between 2022 and 2023, we developed a series of multitask zero-shot models for text classification, named entity recognition (NER), and relation extraction, including our Universal Token Classification (UTC) model and comprehend-it. We also built a specialized T5-based encoder-decoder model for large-scale information extraction.

Once our technologies matured, we began releasing them openly. Open source is core to our mission of helping humanity better manage, navigate, and expand scientific knowledge. Our models have since been downloaded more than 5 million times on Hugging Face.

Since 2024, we have been one of the core contributors to the GLiNER ecosystem. We refactored its core architecture for production stability and developed new bi-encoder and encoder-decoder variants, including the first multitask GLiNER implementation.

We are grateful to our community of contributors and supporters. Together, we are building tools that make scientific knowledge more accessible, more structured, and easier to build upon.

[ Contact form ]

Let's build the future of open information together

Have questions, feedback, or collaboration ideas? Fill out the form and we'll get back to you soon.