TensorFlow #Project#: TensorFlow is an open source software library for numerical computation using data flow graphs.
PyTorch #Project#: Tensors and dynamic neural networks in Python with strong GPU acceleration.
scikit-learn #Project#: scikit-learn is a Python module for machine learning built on top of SciPy and distributed under the 3-Clause BSD license.
SciPy #Project#: SciPy (pronounced "Sigh Pie") is open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more.
2019-NNI #Project#: An open source AutoML toolkit for neural architecture search, model compression and hyper-parameter tuning.
2019-Thinc #Project#: A refreshing functional take on deep learning, compatible with your favorite libraries.
2019-Streamlit #Project#: Streamlit’s open-source app framework is the easiest way for data scientists and machine learning engineers to create beautiful, performant apps in only a few hours! All in pure Python. All for free.
2021-Kedro #Project#: Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code. It borrows concepts from software engineering and applies them to machine-learning code; applied concepts include modularity, separation of concerns and versioning.
2020-Otto #Project#: Otto is an intelligent chat application, designed to help aspiring machine learning engineers go from idea to implementation with minimal domain knowledge.
2020-Spyder #Project#: Spyder is a powerful scientific environment written in Python, for Python, and designed by and for scientists, engineers and data analysts. It offers a unique combination of the advanced editing, analysis, debugging, and profiling functionality of a comprehensive development tool with the data exploration, interactive execution, deep inspection, and beautiful visualization capabilities of a scientific package.
2014-Jupyter #Project#: Project Jupyter exists to develop open-source software, open-standards, and services for interactive computing across dozens of programming languages.
SPTAG #Project#: A distributed approximate nearest neighbor (ANN) search library that provides high-quality vector index building, search, and distributed online serving toolkits for large-scale vector search scenarios.
2021-Kats #Project#: Kats is a toolkit for analyzing time series data: a lightweight, easy-to-use, generalizable, and extendable framework for time series analysis, from understanding key statistics and characteristics, to detecting change points and anomalies, to forecasting future trends.
2018-BERT #Project#: BERT is a method of pre-training language representations, meaning that we train a general-purpose "language understanding" model on a large text corpus (like Wikipedia), and then use that model for downstream NLP tasks that we care about (like question answering). Large-scale Chinese pre-trained ALBERT models.
2019-GPT2 #Project#: Code and models from the paper "Language Models are Unsupervised Multitask Learners".
2021-gpt neo #Project#: An implementation of model-parallel GPT-2 and GPT-3-like models, with the ability to scale up to full GPT-3 sizes (and possibly more!), using the mesh-tensorflow library.
2016-FastText #Project#: FastText is an open-source, free, lightweight library that allows users to learn text representations and text classifiers. It works on standard, generic hardware. Models can later be reduced in size to even fit on mobile devices.
Syntax & Semantic Analysis
Snips NLU #Project#: Snips NLU (Natural Language Understanding) is a Python library that parses sentences written in natural language and extracts structured information.
Word2Bits #Project#: Word2Bits extends the Word2Vec algorithm to output high quality quantized word vectors that take 8x-16x less storage/memory than regular word vectors.
2020-TTS #Project#: TTS is a library for advanced Text-to-Speech generation. It is built on the latest research and was designed to achieve the best trade-off among ease of training, speed, and quality. TTS comes with pretrained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects.