DataScienceAI OpenSource List

Universal Toolkits

  • TensorFlow #Project#: TensorFlow is an open source software library for numerical computation using data flow graphs.

  • PyTorch #Project#: Tensors and dynamic neural networks in Python with strong GPU acceleration.

  • scikit-learn #Project#: scikit-learn is a Python module for machine learning built on top of SciPy and distributed under the 3-Clause BSD license.

  • SciPy #Project#: SciPy (pronounced "Sigh Pie") is open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more.

Business Intelligence


  • TensorSpace.js #Project#: A neural network 3D visualization framework for building interactive, intuitive model visualizations in the browser. It supports pre-trained deep learning models from TensorFlow, Keras, and TensorFlow.js.

  • Curve #Project#: An integrated experimental platform for anomaly detection in time series data.

Data Analysis

Feature Engineering

Machine Learning

  • NumPy #Project#: NumPy is the fundamental package for scientific computing with Python.

  • pandas #Project#: pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

  • Matplotlib #Project#: Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms.

  • feature-selector #Project#: Feature selector is a tool for dimensionality reduction of machine learning datasets.

Deep Learning

  • tfjs #Project#: A WebGL-accelerated, browser-based JavaScript library for training and deploying ML models.

  • brain.js #Project#: brain.js is a library of Neural Networks written in JavaScript.

  • neurojs #Project#: neurojs is a JavaScript framework for deep learning in the browser. It mainly focuses on reinforcement learning, but can be used for any neural network based task. It contains neat demos to visualise these capabilities, for instance a 2D self-driving car.

Natural Language Processing

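  • SnowNLP #Project#: SnowNLP is a Python library that makes it easy to process Chinese text. It was inspired by TextBlob: because most natural language processing libraries today target English, the author wrote one for handling Chinese. Unlike TextBlob, it does not use NLTK; all algorithms are implemented from scratch, and it ships with some pre-trained dictionaries.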

  • nlp_compromise #Project#: A cool way to use natural language in JavaScript.

  • flair #Project#: A very simple framework for state-of-the-art Natural Language Processing (NLP).


  • 2016-FastText #Project#: FastText is an open-source, free, lightweight library that allows users to learn text representations and text classifiers. It works on standard, generic hardware. Models can later be reduced in size to even fit on mobile devices.


  • 2019-Botpress #Project#: The ultimate open-source conversational platform with built-in natural language understanding (NLU), an easy-to-use graphical interface, and a dialog manager.

Syntax & Semantic Analysis

  • Snips NLU #Project#: Snips NLU (Natural Language Understanding) is a Python library that parses sentences written in natural language and extracts structured information.

  • Word2Bits #Project#: Word2Bits extends the Word2Vec algorithm to output high quality quantized word vectors that take 8x-16x less storage/memory than regular word vectors.

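  • ansj_seg #Project#: Ansj word segmentation, a true Java implementation of ict. Both its segmentation quality and its speed exceed those of the open-source version of ict. Supports Chinese word segmentation, person name recognition, part-of-speech tagging, and user-defined dictionaries.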

  • gensim #Project#: Topic modelling for humans.

Knowledge Graph

Dialogue System


  • Common Voice #Project#: The Common Voice project is Mozilla's initiative to help teach machines how real people speak.

  • DeepSpeech #Project#: Project DeepSpeech is an open source Speech-To-Text engine. It uses a model trained by machine learning techniques, based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow project to make the implementation easier.

  • wav2letter #Project#: wav2letter is a simple and efficient end-to-end Automatic Speech Recognition (ASR) system from Facebook AI Research.

Computer Vision

  • 2017-Detectron #Project#: Detectron is Facebook AI Research's software system that implements state-of-the-art object detection algorithms, including Mask R-CNN.

  • 2018-Faceswap #Project#: Faceswap is a tool that utilizes deep learning to recognize and swap faces in pictures and videos.

  • 2018-FastPhotoStyle #Project#: This code repository contains an implementation of our fast photorealistic style transfer algorithm.

  • 2018-videoflow #Project#: Python framework that facilitates the quick development of complex video analysis applications and other series-processing based applications in a multiprocessing environment.

Face Recognition



Distributed Training

  • BytePS #Project#: BytePS is a high performance and general distributed training framework.

Integrated Tools

  • Deepo #Project#: Deepo is a Docker image with a full reproducible deep learning research environment. It contains most popular deep learning frameworks: theano, tensorflow, sonnet, pytorch, keras, lasagne, mxnet, cntk, chainer, caffe, torch.

  • 2017-Turi Create #Project#: Turi Create simplifies the development of custom machine learning models. You don't have to be a machine learning expert to add recommendations, object detection, image classification, image similarity or activity classification to your app.

  • Ludwig #Project#: Ludwig is a toolbox for training and testing deep learning models without the need to write code.