[docs] initialize sphinx
tswsxk committed Aug 7, 2021
1 parent 84dae59 commit e879f14
Showing 19 changed files with 395 additions and 0 deletions.
20 changes: 20 additions & 0 deletions docs/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
29 changes: 29 additions & 0 deletions docs/README.md
@@ -0,0 +1,29 @@
EduNLP documentation and tutorial folder
========================================

Requirements
------------
The documentation requirements are listed under `docs_deps` in `setup.py`. Install them with:
```sh
pip install -e .[doc]
```


Build documents
---------------
First, clean up existing files:
```
make clean
```

Then build:
```
make html
```

Render locally
--------------
```
cd build/html
python3 -m http.server 8000
```
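
The clean, build, and render steps above can also be scripted. The following is a minimal sketch and not part of this commit: the helper names `build_commands`, `build`, `make_server`, and `main` are hypothetical, and it assumes `make` is on `PATH`, the script is run from `docs/`, and Python 3.7 or newer is available.

```python
"""Hypothetical helper: clean, build, and preview the docs from docs/."""
import functools
import http.server
import subprocess


def build_commands():
    """Return the make invocations from this README, in order."""
    return [["make", "clean"], ["make", "html"]]


def build(docs_dir="."):
    """Run each make step in docs_dir, stopping at the first failure."""
    for cmd in build_commands():
        subprocess.run(cmd, cwd=docs_dir, check=True)


def make_server(directory="build/html", port=8000):
    """Create a local HTTP server rooted at the built HTML directory."""
    # The `directory` keyword and ThreadingHTTPServer need Python 3.7+.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


def main():
    """Build the docs, then serve them until interrupted."""
    build()
    server = make_server()
    print("Serving on http://127.0.0.1:8000 (Ctrl+C to stop)")
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        server.server_close()
```

Calling `main()` from `docs/` performs the same three steps as the commands above. Unlike the `http.server` command, this sketch binds the server to `127.0.0.1` only.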
35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.http://sphinx-doc.org/
	exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
Binary file added docs/source/_static/EduNLP.png
6 changes: 6 additions & 0 deletions docs/source/api/formula.rst
@@ -0,0 +1,6 @@
EduNLP.Formula
=======================

.. automodule:: EduNLP.Formula.ast
   :members:
   :imported-members:
6 changes: 6 additions & 0 deletions docs/source/api/i2v.rst
@@ -0,0 +1,6 @@
EduNLP.I2V
============

.. automodule:: EduNLP.I2V.i2v
   :members:
   :imported-members:
2 changes: 2 additions & 0 deletions docs/source/api/index.rst
@@ -0,0 +1,2 @@
EduNLP
======
39 changes: 39 additions & 0 deletions docs/source/api/sif.rst
@@ -0,0 +1,39 @@
EduNLP.SIF
==============

SIF
----------
.. automodule:: EduNLP.SIF.sif
   :members:
   :imported-members:


Segment
----------
.. automodule:: EduNLP.SIF.segment
   :members:
   :imported-members:


Parser
--------
.. automodule:: EduNLP.SIF.parser
   :members:
   :imported-members:


Tokenization
---------------

tokenize
^^^^^^^^^^
.. automodule:: EduNLP.SIF.tokenization.tokenization
   :members:
   :imported-members:


formula
^^^^^^^^^
.. automodule:: EduNLP.SIF.tokenization.formula
   :members:
   :imported-members:
68 changes: 68 additions & 0 deletions docs/source/conf.py
@@ -0,0 +1,68 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
sys.path.insert(0, os.path.abspath('../'))

import sphinx_rtd_theme

# -- Project information -----------------------------------------------------

project = 'EduNLP'
copyright = '2021, bigdata-ustc'
author = 'bigdata-ustc'


# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.autosummary',
    'sphinx.ext.intersphinx',
    'sphinx.ext.viewcode',
    'sphinx.ext.napoleon',
    'sphinx.ext.mathjax',
    'sphinx_toggleprompt',
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = 'en'

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "sphinx_rtd_theme"
html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
100 changes: 100 additions & 0 deletions docs/source/index.rst
@@ -0,0 +1,100 @@
.. EduNLP documentation master file, created by
   sphinx-quickstart on Sat Aug 7 19:55:39 2021.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

===============================================
Welcome to EduNLP's Tutorials and Documentation
===============================================

.. Logo
.. image:: _static/EduNLP.png
   :width: 200px
   :align: center

.. Badges
.. image:: https://img.shields.io/pypi/v/EduNLP.svg
   :target: https://pypi.python.org/pypi/EduNLP

.. image:: https://github.com/bigdata-ustc/EduNLP/actions/workflows/python-test.yml/badge.svg?branch=master
   :target: https://github.com/bigdata-ustc/EduNLP/actions/workflows/python-test.yml

.. todo: add all badges in EduNLP/README.md

`EduNLP <https://github.com/bigdata-ustc/EduNLP>`_ is a Python library for advanced natural language processing and one of the projects in the `EduX <https://github.com/bigdata-ustc/EduX>`_ plan of BDAA.
It is built on recent research and was designed from day one for use in real educational products.

EduNLP now comes with pretrained pipelines and currently supports segmentation, tokenization, and vectorization. It also supports various kinds of preprocessing for NLP in educational scenarios, such as formula parsing and multimodal segmentation.

EduNLP is commercial open-source software, released under the `Apache-2.0 license <https://github.com/bigdata-ustc/EduNLP/blob/master/LICENSE>`_.

Install
---------
EduNLP requires Python 3.6, 3.7, 3.8, or 3.9 and uses PyTorch as its backend tensor library.

We recommend installing EduNLP with ``pip``:

::

   pip install EduNLP

But you can also install from source:

::

   git clone https://github.com/bigdata-ustc/EduNLP.git
   cd EduNLP
   pip install .



Getting Started
------------------
For absolute beginners, start with the `Tutorial to EduNLP <tutorial/en/index>`_ `(Chinese version) <tutorial/zh/index>`__.
It covers the basic concepts of EduNLP and gives
a step-by-step guide to training, loading, and using the language models.


Contribution
--------------
EduNLP is free software; you can redistribute it and/or modify it under the terms of the Apache License 2.0.
We welcome contributions. Join us on GitHub and check out our `contribution guidelines <https://github.com/bigdata-ustc/EduNLP/blob/master/CONTRIBUTE.md>`_ `(Chinese version) <https://github.com/bigdata-ustc/EduNLP/blob/master/CONTRIBUTE_CH.md>`__.

.. toctree::
   :caption: Introduction
   :hidden:

   self

.. toctree::
   :maxdepth: 1
   :caption: Tutorial
   :hidden:
   :glob:

   tutorial/en/index
   tutorial/en/sif

.. toctree::
   :maxdepth: 1
   :caption: User Guide (Chinese)
   :hidden:

   tutorial/zh/index
   tutorial/zh/sif
   tutorial/zh/seg
   tutorial/zh/parse
   tutorial/zh/tokenize
   tutorial/zh/vectorization
   tutorial/zh/pretrain


.. toctree::
   :maxdepth: 2
   :caption: API Reference
   :hidden:
   :glob:

   api/index
   api/i2v
   api/sif
   api/formula
2 changes: 2 additions & 0 deletions docs/source/tutorial/en/index.rst
@@ -0,0 +1,2 @@
Get Started
===========
2 changes: 2 additions & 0 deletions docs/source/tutorial/en/sif.rst
@@ -0,0 +1,2 @@
Standard Item Format
====================
12 changes: 12 additions & 0 deletions docs/source/tutorial/zh/index.rst
@@ -0,0 +1,12 @@
Getting Started
===============

.. toctree::
   :maxdepth: 1
   :titlesonly:

   sif
   seg
   parse
   tokenize
   vectorization
10 changes: 10 additions & 0 deletions docs/source/tutorial/zh/parse.rst
@@ -0,0 +1,10 @@
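Syntax Parsing
==============

In educational resources, both text and formulas have an inherent syntactic structure, implicit or explicit. Extracting this structure greatly benefits downstream processing:

* Text syntax parsing
* Formula syntax parsing

Formula syntax parsing
----------------------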
19 changes: 19 additions & 0 deletions docs/source/tutorial/zh/pretrain.rst
@@ -0,0 +1,19 @@
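Pretraining
===========

In natural language processing, pre-trained language models have become a very important foundational technique.
This chapter introduces the pretraining tools in EduNLP:

* How to train a pre-trained model from scratch on a corpus
* How to load a pre-trained model
* Publicly available pre-trained models


Training a model
----------------

Loading a model
---------------

Overview of public models
-------------------------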
18 changes: 18 additions & 0 deletions docs/source/tutorial/zh/seg.rst
@@ -0,0 +1,18 @@
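Component Segmentation
======================

Educational resources are multimodal data containing structures such as text, images, and formulas.
Semantically, they may also consist of different parts, such as the question stem and the options. We therefore first need to identify the different components of an educational resource and segment them:

* Semantic component segmentation
* Structural component segmentation

Semantic component segmentation
-------------------------------

Structural component segmentation
---------------------------------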




2 changes: 2 additions & 0 deletions docs/source/tutorial/zh/sif.rst
@@ -0,0 +1,2 @@
Standard Item Format
====================